February 26, 2020

Lessons Learned from the Migration to Apache Airflow

Talk Title Lessons Learned from the Migration to Apache Airflow
Speakers Radek Maciaszek (Chief Architect, Skimlinks)
Conference Open Source Summit + ELC North America
Conf Tag
Location San Diego, CA, USA
Date Aug 19-23, 2019
URL Talk Page
Slides Talk Slides
Video

Apache Airflow is an open-source tool for orchestrating complex workflows and data processing pipelines. In this talk, Radek Maciaszek will present what he learned from migrating machine learning and big data processing pipelines to Apache Airflow. Radek will discuss examples of how his team uses Airflow to power their company's big data infrastructure, where they analyze hundreds of terabytes of data. The examples will cover building an ETL pipeline and using Airflow to manage a machine learning Spark pipeline workflow. The talk will cover the basic Airflow concepts and show real-life examples of how to define your own workflows in Python code. It will finish with more advanced topics related to Apache Airflow, such as adding custom task operators, sensors, and plugins, as well as best practices and the pros and cons of the tool.
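To make the idea of "defining a workflow in Python code" concrete, here is a minimal sketch of the underlying concept: tasks plus dependencies forming a directed acyclic graph, executed in dependency order. It deliberately does not use Airflow's API; it relies only on Python's standard-library `graphlib` (Python 3.9+) as a stand-in. The task names (`extract`, `transform`, `load`, `train_model`), the shared `ctx` dictionary, and the `run_workflow` helper are all hypothetical and are not taken from the talk.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each one reads from and writes to a shared context dict.
def extract(ctx):
    ctx["raw"] = [1, 2, 3]

def transform(ctx):
    ctx["clean"] = [x * 10 for x in ctx["raw"]]

def load(ctx):
    ctx["loaded"] = sum(ctx["clean"])

def train_model(ctx):
    ctx["model"] = {"mean": ctx["loaded"] / len(ctx["clean"])}

TASKS = {
    "extract": extract,
    "transform": transform,
    "load": load,
    "train_model": train_model,
}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {
    "transform": {"extract"},
    "load": {"transform"},
    "train_model": {"load"},
}

def run_workflow(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    ctx = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name](ctx)
        order.append(name)
    return order, ctx

if __name__ == "__main__":
    order, ctx = run_workflow(TASKS, DEPENDENCIES)
    print(order)
    print(ctx["model"])
```

An orchestrator such as Airflow adds scheduling, retries, distributed execution, and monitoring on top of this basic pattern; the sketch covers only the dependency ordering.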
