November 20, 2019

Breaking Spark: Top five mistakes to avoid when using Apache Spark in production

Spark deployments have been growing over the past year. The volume of data analyzed and processed through the framework is massive and still increasing, and it continues to push the boundaries of the engine. Drawing on his experience across 150+ production deployments, Neelesh Srinivas Salian explores common issues observed in cluster environments set up with Apache Spark.

Talk Title: Breaking Spark: Top five mistakes to avoid when using Apache Spark in production
Speakers: Neelesh Salian (Stitch Fix)
Conference: Strata + Hadoop World
Conf Tag: Making Data Work
Location: London, United Kingdom
Date: June 1-3, 2016

The talk groups these issues into five main areas. Attendees can use Neelesh’s observations to improve the usability and supportability of their Apache Spark deployments and avoid such issues in the future.
