February 15, 2020

Deep learning on Apache Spark at CERN's Large Hadron Collider with Analytics Zoo

Sajan Govindan outlines CERN's research on deep learning in high energy physics experiments as an alternative to customized rule-based methods, using topology classification as an example of how to improve real-time event selection at the Large Hadron Collider. CERN runs its deep learning pipelines on Apache Spark with the open source BigDL and Analytics Zoo software on Intel Xeon-based clusters.

Talk Title Deep learning on Apache Spark at CERN's Large Hadron Collider with Analytics Zoo
Speakers Sajan Govindan (Intel)
Conference Strata Data Conference
Conf Tag Make Data Work
Location New York, New York
Date September 24-26, 2019

Sajan Govindan dives into how CERN applied end-to-end deep learning and analytics pipelines at scale on Apache Spark for high energy physics, using the open source BigDL and Analytics Zoo software running on Intel Xeon-based distributed clusters. He outlines technical details and development insights through an example: a topology classifier built to improve real-time event selection at the Large Hadron Collider (LHC).

The classifier demonstrated very good efficiency while also reducing the false-positive rate compared to existing methods. It could be used as a filter in the online event selection infrastructure of the LHC experiments, allowing a more flexible and inclusive selection strategy while reducing the downstream resources wasted on processing false positives. This work is part of CERN’s research on applying deep learning and analytics with open source and industry-standard technologies as an alternative to the existing customized rule-based methods.

Sajan also explores how CERN could quickly build and implement distributed deep learning solutions and data pipelines at scale on Apache Spark using Analytics Zoo and BigDL, open source frameworks that unify analytics and AI on Spark with easy-to-use APIs and development interfaces that integrate with big data platforms.
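To make the two figures of merit mentioned above concrete, here is a minimal sketch of how efficiency (the fraction of signal events kept) and false-positive rate (the fraction of background events wrongly kept) could be computed for a classifier used as a filter. It is not taken from the talk: the function name `selection_metrics`, the 0.5 threshold, and the toy scores and labels are all hypothetical and chosen only for illustration.

```python
def selection_metrics(scores, labels, threshold):
    """Return (efficiency, false_positive_rate) for events kept at `threshold`.

    scores: classifier outputs in [0, 1] (hypothetical values)
    labels: 1 for signal events, 0 for background events
    An event is kept by the filter when its score is >= threshold.
    """
    kept = [s >= threshold for s in scores]
    n_signal = sum(labels)
    n_background = len(labels) - n_signal

    # Signal events that pass the filter, and background events that slip through.
    true_pos = sum(1 for k, y in zip(kept, labels) if k and y == 1)
    false_pos = sum(1 for k, y in zip(kept, labels) if k and y == 0)

    efficiency = true_pos / n_signal if n_signal else 0.0
    false_positive_rate = false_pos / n_background if n_background else 0.0
    return efficiency, false_positive_rate


# Made-up toy values for illustration only; these are not CERN data.
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

efficiency, fpr = selection_metrics(toy_scores, toy_labels, threshold=0.5)
print(f"efficiency={efficiency:.2f} false_positive_rate={fpr:.2f}")
```

Raising the threshold in this sketch lowers the false-positive rate at the cost of efficiency, which is the trade-off a filter in an event selection chain has to balance.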
