Breaking Spark: Top five mistakes to avoid when using Apache Spark in production
Spark deployments have been growing over the past year. The volume of data analyzed and processed through the framework is massive and still increasing, and it continues to push the boundaries of the engine. Drawing on his experiences across 150+ production deployments, Neelesh Srinivas Salian explores common issues observed in cluster environments set up with Apache Spark.
Talk Title | Breaking Spark: Top five mistakes to avoid when using Apache Spark in production |
Speakers | Neelesh Salian (Stitch Fix) |
Conference | Strata + Hadoop World |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | June 1-3, 2016 |
URL | Talk Page |
Slides | Talk Slides |
Spark deployments have been growing over the past year. The volume of data analyzed and processed through the framework is massive and still increasing, and it continues to push the boundaries of the engine. Drawing on his experiences across 150+ production deployments, Neelesh Srinivas Salian explores common issues observed in cluster environments set up with Apache Spark, grouped into five main areas. Attendees can use Neelesh’s observations to improve the usability and supportability of their Apache Spark deployments and avoid such issues in the future.