Machine learning challenges at LinkedIn: Spark, TensorFlow, and beyond
February 18, 2020
From people you may know (PYMK) to economic graph research, machine learning is the oxygen that powers how LinkedIn serves its 630M+ members. Zhe Zhang provides you with an architectural overview of LinkedIn's typical machine learning pipelines, complemented by key types of ML use cases.
Now you see me; now you compute: Building event-driven architectures with Apache Kafka
February 11, 2020
Would you cross the street with traffic information that's a minute old? Certainly not. Modern businesses have the same needs. Michael Noll explores why and how you can use Kafka and its growing ecosystem to build elastic event-driven architectures. Specifically, you look at Kafka as the storage layer, at Kafka Connect for data integration, and at Kafka Streams and KSQL as the compute layer.
Microservices with Ballerina: A Programming Language for Network Distributed Applications
February 9, 2020
Ballerina is a programming language designed for network-distributed applications. One of its key objectives is to make providing and consuming services easier by baking concepts such as listeners, se …
Break me if you can: A practical guide to building fault-tolerant systems
January 31, 2020
You built your system, you deployed it, you rolled it out to production, but that's just the beginning. The life of your system has only just started. Alex Borysov and Mykyta Protsenko outline their practical guide to building fault-tolerant systems, covering code and design patterns from the REST and gRPC ecosystems, the role of the right product decisions, and the importance of a proper communication culture.
Polyglot applications with GraalVM
January 26, 2020
Migrating Apache Oozie workflows to Apache Airflow
January 8, 2020
Apache Oozie and Apache Airflow (incubating) are both widely used workflow orchestration systems, the former focusing on Apache Hadoop jobs. Feng Lu, James Malone, Apurva Desai, and Cameron Moberg explore an open source Oozie-to-Airflow migration tool developed at Google as a part of creating an effective cross-cloud and cross-system solution.
Stream, stream, stream: Different streaming methods with Spark and Kafka
January 6, 2020
NMC (Nielsen Marketing Cloud) provides customers (both marketers and publishers) with real-time analytics tools to profile their target audiences. To achieve that, the company needs to ingest billions of events per day into its big data stores in a scalable, cost-efficient way. Itai Yaffe explains how NMC continuously transforms its data infrastructure to support these goals.