December 18, 2019

Building a robust content recommendation platform for 60 million news readers

Matt Chapman leads a walkthrough of the architecture and open source components that serve Tribune Publishing's content recommendation system, powered by online machine learning at scale. Find out how multiple publications, multiple recommendation algorithms, and one scalable architecture regularly achieve double the performance of the legacy solution.

Talk Title: Building a robust content recommendation platform for 60 million news readers
Speakers: Matt Chapman (mPulse Mobile)
Conference: O’Reilly Software Architecture Conference
Conf Tag: Engineering the Future of Software
Location: New York, New York
Date: February 4-6, 2019

In 2016, Tribune Publishing began building an in-house data science team to better leverage its vast datasets with new machine learning and analytics technologies. One of the team's primary successes was its content recommendation system (“RecSys”), developed entirely in house on top of existing open source systems and new open source libraries created and released by Tribune.

Requirements for the RecSys included the ability to perform A/B/n testing against legacy human-edited and algorithmic recommendations, support multiple publications with both shared and exclusive content, support “real-time” online machine learning at scale, scale without limit in the face of traffic spikes, and degrade gracefully when responses can’t be delivered within a given time limit.

Matt Chapman leads a walkthrough of the lifecycle of a request, from the web browser of a news-reading end user to the backend algorithms that generate up-to-the-moment, personalized recommendations for what the user might want to read next. Along the way, Matt reviews the challenges the team faced, the open source solutions used at each step, and the new framework and libraries the team developed to make development of the algorithms and of the system itself fast, flexible, and scalable.
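Two of the requirements described above, A/B/n testing and graceful degradation under a time limit, can be illustrated with a small sketch. This is not Tribune's implementation, which the summary does not describe. The variant names, the `assign_variant` and `recommend_with_deadline` helpers, and the 100 ms default deadline are all hypothetical, chosen only to show the general shape of the idea using the Python standard library.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical variant names: one legacy arm plus two algorithmic arms (A/B/n).
VARIANTS = ["legacy", "algo_a", "algo_b"]

# Shared worker pool so a slow recommender does not block the caller.
_pool = ThreadPoolExecutor(max_workers=4)


def assign_variant(user_id, variants=VARIANTS):
    """Deterministically map a user to one test arm by hashing the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def recommend_with_deadline(recommender, user_id, fallback, timeout_s=0.1):
    """Return the recommender's result, or `fallback` if it misses the deadline.

    `recommender` is any callable taking a user ID (an assumption for this
    sketch); `fallback` could be, for example, a precomputed "most popular" list.
    """
    future = _pool.submit(recommender, user_id)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Best effort: cancel() only succeeds if the task has not started yet.
        future.cancel()
        return fallback
```

Hashing the user ID keeps each reader in the same arm across requests without storing any assignment state, and the deadline wrapper means the page can always render something, even when the personalized path is slow.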
