December 22, 2019

Jupyter notebooks and production data science workflows

Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.


Talk Title	Jupyter notebooks and production data science workflows
Speakers	Andrew Therriault (City of Boston)
Conference	JupyterCon in New York 2017
Location	New York, New York
Date	August 23-25, 2017
URL	Talk Page
Slides	Talk Slides

Until recently, Jupyter notebooks were primarily a tool for individual data scientists working on their own machines. Software engineers used them mostly for exploratory or one-off analyses or, at most, for early-stage development of things that would eventually need to move elsewhere. When it came time to move to production, a project would be exported to ordinary Python scripts and maintained like any other code.

That mindset is finally changing in many large organizations, as Jupyter has become a first-class member of enterprise-scale data science stacks. But there's no one right way to use Jupyter in production. Because notebooks can now be run in the background, data scientists have the option of keeping all of their code in Jupyter while still getting the reliability and automation of standard Python scripts. But just because you can stay in Jupyter, should you?

Andrew Therriault walks you through several production workflows that combine Jupyter with standard Python scripts, modules, and packages. Using real-world examples from his own experience, Andrew covers the pros and cons of each approach, so that you can judge which one fits your own projects.
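The two options contrasted above, exporting a notebook to a plain script versus running the notebook itself in the background, can be sketched with the `jupyter nbconvert` command line. This is only an illustrative sketch and not the workflow presented in the talk: the helper names (`export_command`, `execute_command`, `run`), the file name `report.ipynb`, and the timeout value are all assumptions made for the example.

```python
from __future__ import annotations

import shutil
import subprocess


def export_command(notebook: str) -> list[str]:
    """Build the nbconvert command that exports a notebook to a plain script."""
    return ["jupyter", "nbconvert", "--to", "script", notebook]


def execute_command(notebook: str, output: str, timeout_s: int = 600) -> list[str]:
    """Build the nbconvert command that runs a notebook without a browser
    and saves the executed copy under a new name."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        # Per-cell execution timeout in seconds (value is an assumption).
        f"--ExecutePreprocessor.timeout={timeout_s}",
        "--output", output,
        notebook,
    ]


def run(cmd: list[str]) -> int:
    """Run the command and return its exit code, or -1 if the
    `jupyter` executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return -1
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical notebook name, for illustration only.
    print(" ".join(export_command("report.ipynb")))
    print(" ".join(execute_command("report.ipynb", "report.out.ipynb")))
```

A scheduler such as cron could call `run(execute_command(...))` on a timer and treat a nonzero exit code as a failed job, which is one way a notebook can be automated like an ordinary script.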

automation code reliability data science jupyter python book

Humans in the loop: Jupyter notebooks as a frontend for AI pipelines at scale

December 22, 2019

Paco Nathan reviews use cases where Jupyter provides a frontend to AI as the means for keeping humans in the loop. This process enhances the feedback loop between people and machines, and the end result is that a smaller group of people can handle a wider range of responsibilities for building and maintaining a complex system of automation.

Mapping data in Jupyter notebooks with PixieDust (sponsored by IBM)

December 21, 2019

Raj Singh offers an overview of PixieDust, a Jupyter Notebook extension that provides an easy way to make interactive maps from DataFrames for visual exploratory data analysis. Raj explains how he built mapping into PixieDust, putting data from Apache Spark-based analytics on maps using Mapbox GL.

Music and Jupyter: A combo for creating collaborative narratives for teaching

December 21, 2019

Music engages and delights. Carol Willing explains how to explore and teach the basics of interactive computing and data science by combining music with Jupyter notebooks, using music21, a tool for computer-aided musicology, and Magenta, a TensorFlow project for making music with machine learning, to create collaborative narratives and publishing materials for teaching and learning.

Developer on the rise: Blurring the line between developer and data scientist with PixieDust

November 26, 2019

Ready to dip your toe into data science? Va Barbosa explains why you should start with notebooks and PixieDust, a new open source library that helps data scientists and developers working in the Jupyter Notebook and Apache Spark be more efficient.

How Jupyter makes experimental and computational collaborations easy

December 22, 2019

Scientific research thrives on collaborations between computational and experimental groups, who work together to solve problems using their separate expertise. Zach Sailer highlights how tools like the Jupyter Notebook, JupyterHub, and ipywidgets can be used to make these collaborations smoother and more effective.

How JupyterHub tamed big science: Experiences deploying Jupyter at a supercomputing center

December 22, 2019

Shreyas Cholia, Rollin Thomas, and Shane Canon share their experience leveraging JupyterHub to enable notebook services for data-intensive supercomputing on the Cray XC40 Cori system at the National Energy Research Scientific Computing Center (NERSC).