SWAN: CERN's Jupyter-based interactive data analysis service
SWAN, CERN's service for web-based analysis, uses Jupyter to give the high energy physics community web-based access to state-of-the-art infrastructure and services. Diogo Castro offers an overview of SWAN and explains how researchers and students are using it in their work.
Talk Title | SWAN: CERN's Jupyter-based interactive data analysis service |
Speakers | Diogo Castro (CERN) |
Conference | JupyterCon in New York 2018 |
Conf Tag | The Official Jupyter Conference |
Location | New York, New York |
Date | August 22-24, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Both CERN and high energy physics (HEP) in general face unprecedented challenges in data storage, processing, and analysis. The experiments of the Large Hadron Collider (LHC) are expected to reach one exabyte of physics data this year. After this data has been processed and filtered, interactivity becomes particularly important in the last phases of analysis, where the final results are produced, mainly in the form of plots.

Jupyter’s notebooks, which combine code, text, and other media into a rich narrative, allow CERN to offer a web-based service that addresses the needs of this community. This service, called SWAN (service for web-based analysis), provides the HEP community with an interactive interface to data analysis tools such as the ROOT framework. SWAN also integrates with CERN’s infrastructure, specifically with users’ synchronized storage (CERNBox), computing resources, and the experiments’ data and software.

Diogo Castro offers an overview of SWAN and explains how researchers and students, both inside and outside CERN, are using the service. Diogo also discusses the evolution of the service, especially the new SWAN interface, developed on top of Jupyter, which makes it easy to share work among users and to connect to Spark clusters.