December 31, 2019

High-performance enterprise data processing with Spark

Vickye Jain and Raghav Sharma explain how they built a very high-performance data processing platform powered by Spark that balances the considerations of extreme performance, speed of development, and cost of maintenance.

Talk Title High-performance enterprise data processing with Spark
Speakers Vickye Jain (ZS Associates), Raghav Sharma (ZS Associates)
Conference Strata + Hadoop World
Conf Tag Make Data Work
Location Singapore
Date December 6-8, 2016

Enterprises are getting increasingly comfortable with moving traditional workloads to Spark. However, despite its popularity, Spark remains an esoteric technology within enterprises, and many organizations for which technology is not a core competence are wary of building internally managed applications on Spark, in part owing to the lack of a steady talent pool and a fear of budget overruns. As a result, enterprises constantly struggle to balance their ability to support advanced technology platforms against matrix organizations, complex funding channels, and business demands.

Vickye Jain and Raghav Sharma explain how they built a high-performance data processing platform powered by Spark that balances the considerations of extreme performance, speed of development, and cost of maintenance, and how they negotiated the conflicting objectives involved. Vickye and Raghav also offer an overview of the architecture itself, which consists of several elastic clusters, external orchestrators providing full visibility into jobs, a combination of job servers and traditional Spark applications, and deep integration of technical experts with domain experts for rapid development.
