Airflow on Kubernetes: Dynamic Workflows Simplified
Talk Title | Airflow on Kubernetes: Dynamic Workflows Simplified |
Speakers | Daniel Imberman (Senior Software Engineer, Bloomberg), Barni Seetharaman (Senior SWE, Google) |
Conference | KubeCon + CloudNativeCon North America |
Location | Seattle, WA, USA |
Date | Dec 9-14, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Apache Airflow is an open-source workflow orchestration engine that lets users write Directed Acyclic Graph (DAG)-based workflows using a simple Python library. Airflow offers a wide range of native operators for services ranging from Spark and HBase to Google Cloud Platform (GCP) and Amazon Web Services (AWS). Until recently, the Airflow user experience has been hindered by the need to launch and maintain statically sized, Celery-based Airflow clusters. These clusters were both expensive (over- and under-utilization) and complex (multiple points of failure). To address these issues, we developed and published a native Kubernetes Operator and Kubernetes Executor for Apache Airflow. These components allow one-step Airflow deployments, dynamic allocation of Airflow worker pods, full control over run-time environments, and per-task resource management.
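As a rough illustration of per-task resource management and control over run-time environments, the sketch below builds the kind of per-task `executor_config` dictionary that the Kubernetes Executor reads. It assumes the Airflow 1.10-era key names (`image`, `request_cpu`, `request_memory`, `limit_cpu`, `limit_memory`); later Airflow releases express the same idea differently. The helper `k8s_executor_config`, the image name, and the resource values are hypothetical and are not taken from the talk.

```python
from typing import Dict, Optional


def k8s_executor_config(
    image: Optional[str] = None,
    request_cpu: Optional[str] = None,
    request_memory: Optional[str] = None,
    limit_cpu: Optional[str] = None,
    limit_memory: Optional[str] = None,
) -> Dict[str, Dict[str, str]]:
    """Build a per-task executor_config for the Kubernetes Executor.

    Hypothetical helper. Key names follow the Airflow 1.10-era dict
    format, which is an assumption about the reader's Airflow version.
    """
    settings = {
        "image": image,
        "request_cpu": request_cpu,
        "request_memory": request_memory,
        "limit_cpu": limit_cpu,
        "limit_memory": limit_memory,
    }
    # Keep only the settings the caller actually provided, so unset
    # values fall back to the cluster-wide worker pod defaults.
    provided = {key: value for key, value in settings.items() if value is not None}
    return {"KubernetesExecutor": provided}


# A small task that only needs modest resources (illustrative values).
light = k8s_executor_config(request_cpu="100m", request_memory="128Mi")

# A heavier task with its own image and a memory ceiling (illustrative values).
heavy = k8s_executor_config(
    image="example.com/team/spark-task:latest",
    request_cpu="2",
    request_memory="4Gi",
    limit_memory="8Gi",
)

print(light)
print(heavy)
```

In a DAG file, such a dictionary would be passed to an individual task through the operator's `executor_config` argument, so each task's worker pod can be sized and imaged independently of the others.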