Jupyter notebooks and production data science workflows
Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.
| Talk Title | Jupyter notebooks and production data science workflows |
|------------|---------------------------------------------------------|
| Speakers   | Andrew Therriault (City of Boston)                      |
| Conference | JupyterCon in New York 2017                             |
| Location   | New York, New York                                      |
| Date       | August 23-25, 2017                                      |
Until recently, Jupyter notebooks were primarily a tool for individual data scientists working on their own machines. They were used mostly for exploratory or one-off analyses, or at most for early-stage development of code that would eventually need to move elsewhere. When it came time to move to production, a project would be exported to ordinary Python scripts and worked on like any other code.

That mindset is finally changing in many large organizations, as Jupyter has become a first-class member of enterprise-scale data science stacks. But there's no one right way to use Jupyter in production. With the ability to run notebooks in the background, data scientists now have the option of keeping all of their code in Jupyter while still maintaining the reliability and automation capabilities of standard Python scripts.

But just because you can stay in Jupyter, should you? Andrew Therriault walks you through several production workflows that combine Jupyter with standard Python scripts, modules, and packages. Drawing on real-world examples from his own experience, Andrew covers the pros and cons of each approach, giving you the knowledge you need to apply them to your own projects.
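One common way to "run a notebook in the background" (the talk does not prescribe a specific tool) is to execute it headlessly with `jupyter nbconvert --execute`, so a scheduler such as cron can run it like any other script. A minimal sketch, with illustrative file names:

```python
# Sketch: executing a notebook unattended via nbconvert.
# Assumes Jupyter is installed; notebook paths are hypothetical examples.
import subprocess


def build_execute_cmd(notebook, output, timeout=600):
    """Build the `jupyter nbconvert` command that runs every cell in
    `notebook` and writes the executed copy to `output`."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output,
        # Per-cell execution timeout (seconds), via the traitlets config
        f"--ExecutePreprocessor.timeout={timeout}",
    ]


def run_notebook(notebook, output):
    # check=True makes a failed cell raise CalledProcessError,
    # so a scheduled job fails loudly instead of silently.
    subprocess.run(build_execute_cmd(notebook, output), check=True)


if __name__ == "__main__":
    run_notebook("nightly_report.ipynb", "nightly_report_run.ipynb")
```

Keeping the command construction in its own function makes the scheduling glue easy to test without actually launching Jupyter.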