Building a robust content recommendation platform for 60 million news readers

Matt Chapman leads a walkthrough of the architecture and open source components that serve Tribune Publishing's content recommendation system, powered by online machine learning at scale. Find out how multiple publications, multiple recommendation algorithms, and one scalable architecture regularly achieve double the performance of the legacy solution.


Talk Title	Building a robust content recommendation platform for 60 million news readers
Speakers	Matt Chapman (mPulse Mobile)
Conference	O’Reilly Software Architecture Conference
Conf Tag	Engineering the Future of Software
Location	New York, New York
Date	February 4-6, 2019
URL	Talk Page
Slides	Talk Slides

In 2016, Tribune Publishing began building an in-house data science team to better leverage its vast datasets with new machine learning and analytics technologies. One of the team's primary successes was its content recommendation system (“RecSys”), developed entirely in-house on top of existing open source systems and new open source libraries created and released by Tribune. Requirements for the RecSys included the ability to perform A/B/n testing against legacy human-edited and algorithmic recommendations, support multiple publications with both shared and exclusive content, support “real-time” online machine learning at scale, scale without limit in the face of traffic spikes, and gracefully degrade when responses can’t be delivered within a given time limit. Matt Chapman leads a walkthrough of the lifecycle of a request, from the web browser of a news-reading end user to the backend algorithms that generate up-to-the-moment, personalized recommendations for what the user might want to read next. Along the way, Matt reviews the challenges the team faced, the open source solutions used at each step, and the new framework and libraries the team developed to make development of algorithms and of the system itself fast, flexible, and scalable. Topics include:

The Data Analytics Platform or How to Make Data Science in a Box Possible

Flyte: Cloud Native Machine Learning & Data Processing Platform

GPU as a Service Over K8s: Drive Productivity and Increase Utilization

Scaling and Securing Spark on Kubernetes at Bloomberg

Why is Cloud-Native Application Development Still So Hard?

OpenSDS: The Autonomous Data Platform for Cloud Native