January 14, 2020


Apache Spark and machine learning on microservices


Hadoop-based data platforms that power ETL jobs and machine learning pipelines are great examples of monolithic architectures that could be redesigned with microservices. Stepan Pushkarev walks you through building and deploying data processing, reporting services, training, and prediction pipelines as decoupled microservices connected with the rest of the enterprise architecture.

Talk Title: Apache Spark and machine learning on microservices
Speakers: Stepan Pushkarev (hydrosphere.io)
Conference: O’Reilly Software Architecture Conference
Conf Tag: Engineering the Future of Software
Location: London, United Kingdom
Date: October 16-18, 2017

Data scientists often find it challenging to build a clean REST API; likewise, web developers often find it almost impossible to understand machine learning internals. Big data engineers, meanwhile, tend to adopt clunky Hadoop distributions with dozens of tightly coupled tools and then carry that design forward, writing data processing scripts that communicate through unmanageable shared state and flags. Hydrosphere.io helps data scientists and big data engineers plug into the modern reactive and microservices architectures that traditional web and enterprise teams have already adopted.
