October 25, 2019

Docker for data scientists

Data scientists inhabit such an ever-changing landscape of languages, packages, and frameworks that it can be easy to succumb to tool fatigue. If this sounds familiar, you may have missed the increasing popularity of Linux containers in the DevOps world, in particular Docker. Michelangelo D’Agostino demonstrates why Docker deserves a place in every data scientist’s toolkit.

Talk Title: Docker for data scientists
Speakers: Michelangelo D’Agostino (ShopRunner)
Conference: Strata + Hadoop World
Conf Tag: Big Data Expo
Location: San Jose, California
Date: March 29-31, 2016

Data scientists inhabit such an ever-changing landscape of languages, packages, and frameworks that it can be easy to succumb to tool fatigue. If this sounds familiar, you may have missed the increasing popularity of Linux containers in the DevOps world, in particular Docker. Michelangelo D’Agostino demonstrates Docker’s many benefits: it makes data science code and environments more portable and shareable, smooths the transition from development to production, and gives data scientists a common basis for collaborating with software engineers. He also explains why Docker deserves a place in every data scientist’s toolkit. Drawing on several successfully executed case studies from Civis Analytics, Michelangelo offers an introduction to Docker and Docker Hub and explores the tools, techniques, and workflows most applicable to data science. While much of the literature around containers is geared toward DevOps and software engineers, Michelangelo discusses using Docker for Python and R development, for training and scoring predictive models, and for deploying data science dashboards and web apps. Michelangelo also shows how Docker can be used as a collaboration tool and bridge between data scientists and software engineers.
