December 22, 2019

290 words 2 mins read

Jupyter notebooks and production data science workflows

Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.

Talk Title: Jupyter notebooks and production data science workflows
Speakers: Andrew Therriault (City of Boston)
Conference: JupyterCon in New York 2017
Location: New York, New York
Date: August 23-25, 2017
URL: Talk Page
Slides: Talk Slides

Until recently, Jupyter notebooks were primarily a tool for individual data scientists working on their own machines. They were used mostly for exploratory or one-off analyses, or at most for early-stage development of code that would eventually need to move elsewhere. When it came time to move to production, a project would be exported to ordinary Python scripts and maintained like any other code.

That mindset is finally changing in many large organizations, as Jupyter has become a first-class member of enterprise-scale data science stacks. But there's no one right way to use Jupyter in production. With the ability to run notebooks in the background, data scientists can keep all of their code in Jupyter while still retaining the reliability and automation of standard Python scripts. But just because you can stay in Jupyter, should you?

Andrew Therriault walks you through several production workflows that combine Jupyter with standard Python scripts, modules, and packages. Using real-world examples from his own experience, Andrew covers the pros and cons of each approach, giving you the knowledge you need to apply them to your own projects.
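As a rough illustration of what "running notebooks in the background" means in practice (this is a minimal sketch, not a workflow from the talk): Jupyter's own tooling can execute a notebook headlessly, which is what lets a notebook behave like a scheduled script. The sketch below uses nbformat together with nbconvert's ExecutePreprocessor; the notebook filenames are placeholders.

    # Execute a notebook non-interactively, as a cron job or scheduler might.
    # "analysis.ipynb" and "analysis-run.ipynb" are hypothetical filenames.
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    # Load the notebook from disk in the current (v4) format.
    with open("analysis.ipynb") as f:
        nb = nbformat.read(f, as_version=4)

    # Run every cell in order, failing loudly if any cell errors out.
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    ep.preprocess(nb, {"metadata": {"path": "."}})

    # Save the executed notebook, outputs included, for later inspection.
    with open("analysis-run.ipynb", "w") as f:
        nbformat.write(nb, f)

The same effect is available from the command line via jupyter nbconvert --to notebook --execute, which is one reason keeping production code in notebook form has become a viable option at all.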
