December 7, 2019


Making Big Data Processing Portable. The Story of Apache Beam and gRPC



Talk Title Making Big Data Processing Portable. The Story of Apache Beam and gRPC
Speakers Ismaël Mejía (Software Engineer, Talend)
Conference KubeCon + CloudNativeCon Europe
Location Copenhagen, Denmark
Date Apr 30-May 4, 2018
URL Talk Page
Slides Talk Slides

Big data applications have been an almost exclusive domain of Java and Scala developers. This not only frustrates engineers who prefer other languages and their ecosystems, but also prevents companies whose business logic is already written on other platforms from reusing it when they build data-intensive applications. In this talk we introduce Apache Beam, a unified programming model designed to provide efficient and portable data processing pipelines. We discuss in detail how Beam achieves portability by relying on two concepts: (1) runners, which translate Beam's model so it can execute on existing systems such as Apache Spark and Apache Flink, and (2) the portability APIs, an architecture of gRPC services that coordinate the execution of pipelines in containers to achieve language portability.
