November 5, 2019


How Spark can fail or be confusing and what you can do about it

Talk Title: How Spark can fail or be confusing and what you can do about it
Speakers: Yin Huai (Databricks)
Conference: Strata + Hadoop World
Conf Tag: Big Data Expo
Location: San Jose, California
Date: March 14-16, 2017
URL: Talk Page
Slides: Talk Slides

Apache Spark has become one of the most popular open source projects in big data. But like any six-year-old, Spark does not always do its job correctly and can be hard to understand. Yin Huai looks at the top causes of job failures customers encountered in production, which include resource exhaustion and hitting internal limits within Spark. Yin shares examples of common failures to highlight recent improvements and possible future work. He also shares a methodology for improving resilience: a combination of monitoring and debugging techniques for users.
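To make the monitoring side of that methodology concrete, here is a minimal sketch (not from the talk itself) of a custom Spark listener in Scala that logs every failed task on the driver; the class name TaskFailureLogger is hypothetical, but SparkListener and addSparkListener are standard Spark APIs.

```scala
import org.apache.spark.TaskFailedReason
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Illustrative sketch: log task failures as they happen, so that
// resource-exhaustion or internal-limit errors surface on the driver
// without digging through executor logs first.
class TaskFailureLogger extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    taskEnd.reason match {
      case reason: TaskFailedReason =>
        println(
          s"Task ${taskEnd.taskInfo.taskId} in stage ${taskEnd.stageId} " +
            s"failed: ${reason.toErrorString}")
      case _ => // task succeeded; nothing to report
    }
  }
}

// Register on an existing SparkContext (sc):
// sc.addSparkListener(new TaskFailureLogger)
```

A listener like this is one lightweight way to watch for the failure modes the talk describes before they escalate into full job failures.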
