February 19, 2020

Executive Briefing: The black box - Interpretability, reproducibility, and data management

The growing complexity of data science leads to black box solutions that few people in an organization understand. Mark Madsen explains why reproducibility (the ability to get the same results given the same information) is a key element in building trust and growing data science use. One of the foundational elements of reproducibility (and of successful ML projects) is data management.


Talk Title	Executive Briefing: The black box - Interpretability, reproducibility, and data management
Speakers	Mark Madsen (Teradata)
Conference	O’Reilly Artificial Intelligence Conference
Conf Tag	Put AI to Work
Location	London, United Kingdom
Date	October 15-17, 2019
URL	Talk Page
Slides	Talk Slides

The growing complexity of data science leads to black box solutions that few people in an organization understand. You often hear about the difficulty of interpretability—explaining how an analytic model works—and that you need it to deploy models. But people use many black boxes without understanding them…if they’re reliable. It’s when the black box becomes unreliable that people lose trust. Mistrust is more likely to be created by the lack of reliability, and the lack of reliability is often the result of misunderstanding essential elements of analytics infrastructure and practice. The concept of reproducibility—the ability to get the same results given the same information—extends your view to include the environment and the data used to build and execute models. Mark Madsen examines reproducibility and the areas that underlie production analytics and explores the most frequently ignored and yet most essential capability, data management. The industry needs to consider its practices so that systems are more transparent and reliable, improving trust and increasing the likelihood that your analytic solutions will succeed.

reliability management complexity infrastructure data science analytics

Turn devices into data scientists at the edge

December 28, 2019

Today's approach to processing streaming data is based on legacy big-data-centric architectures, the cloud, and the assumption that organizations have access to data scientists to make sense of it all, leaving organizations increasingly overwhelmed. Simon Crosby shares a new architecture for edge intelligence that turns this thinking on its head.

Architecting a data platform for enterprise use

February 16, 2020

Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build a multiuse data infrastructure that isn't subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.

Learning Automation Without Barriers Using Antidote and NRE Labs

February 10, 2020

The journey to cloud and automation is a rocky one, with many twists and turns, and often full-on roadblocks. Antidote is a new project based on the idea of "curriculum-as-code", which allows teachers …

Deep learning at scale: Tools and solutions

February 6, 2020

Success with DL requires more than just TensorFlow or PyTorch. Angela Wu, Sidney Wijngaarde, Shiyuan Zhu, and Vishnu Mohan detail practical problems faced by practitioners and the software tools and techniques you'll need to address the problems, including data prep, GPU scheduling, hyperparameter tuning, distributed training, metrics management, deployment, mobile and edge optimization, and more.

Architecting a data platform for enterprise use

January 14, 2020

Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build a multiuse data infrastructure that is not subject to past constraints. Mark Madsen and Todd Walter explore design assumptions and principles and walk you through a reference architecture to use as you work to unify your analytics infrastructure.

Data science transformation: Transforming a traditional wealth manager to a cutting-edge data-driven company

January 13, 2020

Charlotte Werger outlines the components necessary to transform a traditional wealth manager into a data-driven business, paying special attention to devising and executing a transformation strategy by identifying key business subunits where automation and improved predictive modeling can result in significant gains and synergies.