Compliant Data Management and Machine Learning on Kubernetes
Talk Title | Compliant Data Management and Machine Learning on Kubernetes |
Speakers | Daniel Whitenack (Lead Data Scientist and Advocate, Pachyderm) |
Conference | KubeCon + CloudNativeCon Europe |
Location | Copenhagen, Denmark |
Date | Apr 30-May 4, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Data scientists, machine learning engineers, and researchers are under increasing pressure to explain how they process and manage user data. In particular, the EU’s GDPR, which takes effect this year, is forcing organizations to rethink their data management and processing strategies. In this talk, we will demonstrate a data management and processing methodology and framework that is helping organizations deploy compliant workflows on top of Kubernetes. The framework, based on the open source Pachyderm project, gives data scientists automatic tracking of changes to data and of all the data and processing steps that lead to a particular result. This, along with access control strategies and anonymization (which will also be discussed in the talk), gives organizations a framework that is easy to manage, scalable for AI/ML workflows, and compliant.