Deploying a scalable JupyterHub environment for running Jupyter notebooks
Jupyter notebooks provide a rich interactive environment for working with data. Running a single notebook is easy, but what if you need to provide a platform for many users at the same time? Graham Dumpleton demonstrates how to use JupyterHub to run a highly scalable environment for hosting Jupyter notebooks in education and business.
Talk Title | Deploying a scalable JupyterHub environment for running Jupyter notebooks |
Speakers | Graham Dumpleton (Red Hat) |
Conference | Strata + Hadoop World |
Conf Tag | Make Data Work |
Location | Singapore |
Date | December 6-8, 2016 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
JupyterHub can be used to create a centralized, web-based environment where users log in and get their own instance of a Jupyter notebook, without having to install any software on their local computer. This ensures that all users work in the same environment, and each instance can be prepopulated with notebooks and data files.

Running a Jupyter notebook, or even a JupyterHub instance, on a single computer is easy enough, but a single machine is not sufficient once you have hundreds of users. JupyterHub provides a pluggable system for spawning Jupyter notebooks, and plugins exist for distributing notebook instances across multiple machines, but setting up and maintaining dedicated infrastructure for them can be complicated.

Graham Dumpleton demonstrates how to use JupyterHub together with OpenShift, an enterprise distribution of Kubernetes and a general-purpose environment for deploying web applications at scale across a cluster of machines, to run a highly scalable environment for hosting Jupyter notebooks in education and business. Along the way, Graham offers an overview of Kubernetes and OpenShift and discusses the advantages of using them over building out a system yourself from scratch. Topics include: