ROCm and Hopsworks for end-to-end deep learning pipelines
February 18, 2020
The Radeon open ecosystem (ROCm) is an open source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch through MIOpen, a library of highly optimized GPU routines for deep learning. Jim Dowling and Ajit Mathews outline how the open source Hopsworks framework can be used to build horizontally scalable end-to-end machine learning pipelines on ROCm-enabled GPUs.
Disrupting data discovery
January 12, 2020
Mark Grover discusses how Lyft has reduced the time it takes to discover data by a factor of 10 by building its own data portal, Amundsen. Mark gives a demo of Amundsen, leads a deep dive into its architecture, and discusses how it leverages centralized metadata, PageRank, and a comprehensive data graph to achieve its goal. Mark closes with the future roadmap, unsolved problems, and the collaboration model.
Migrating Apache Oozie workflows to Apache Airflow
January 8, 2020
Apache Oozie and Apache Airflow (incubating) are both widely used workflow orchestration systems, the former focusing on Apache Hadoop jobs. Feng Lu, James Malone, Apurva Desai, and Cameron Moberg explore an open source Oozie-to-Airflow migration tool developed at Google as part of an effort to create an effective cross-cloud and cross-system solution.