Making recommendations using graphs and Spark
Harry Powell and Raffael Strassnig demonstrate how to model unobserved customer preferences over businesses by thinking about transactional data as a bipartite graph and then computing a new similarity metric, the expected degrees of separation, to characterize the full graph.
Talk Title | Making recommendations using graphs and Spark |
Speakers | Harry Powell (Barclays), Raffael Strassnig (Barclays) |
Conference | Strata Data Conference |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | May 23-25, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Harry Powell and Raffael Strassnig demonstrate how to model unobserved customer preferences over businesses by treating transactional data as a bipartite graph and then computing a new similarity metric, the expected degrees of separation (EDS), to characterize the full graph. EDS is hard to compute on large datasets because of the large number of possible paths between nodes. Harry and Raffael explore different strategies for evaluating EDS in a distributed way in Scala and Spark and propose an estimation approach that is consistent, unbiased, and scalable. They then present results for businesses in Bristol, UK, compare the properties of EDS with familiar graph-based metrics such as PageRank and shortest path, and discuss applications of the technology to other use cases. Harry and Raffael conclude by sharing a simple recommender.
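
To make the idea concrete, here is a minimal sketch under stated assumptions. The abstract does not define EDS, so the sketch assumes one plausible reading: the expected number of business-to-business hops that a random walk on the customer-business bipartite graph takes to first reach a target business, estimated by averaging simulated walks. The function names (`build_bipartite`, `estimate_separation`), the `max_hops` cap, and the toy transactions are all hypothetical, and the code is single-machine Python, not the Scala and Spark implementation the talk describes.

```python
import random
from collections import defaultdict


def build_bipartite(transactions):
    """Turn (customer, business) pairs into two adjacency lists.

    Repeated transactions are kept as repeated entries, so a customer who
    visits a business often is proportionally more likely to be sampled.
    """
    by_customer = defaultdict(list)
    by_business = defaultdict(list)
    for customer, business in transactions:
        by_customer[customer].append(business)
        by_business[business].append(customer)
    return by_customer, by_business


def estimate_separation(transactions, source, target,
                        n_walks=10_000, max_hops=50, seed=0):
    """Monte Carlo estimate of the ASSUMED metric: the mean number of
    business -> customer -> business hops before a walk from `source`
    first lands on `target`.

    Walks that have not reached `target` after `max_hops` hops are counted
    as `max_hops`, so this estimates a truncated hitting time.
    """
    by_customer, by_business = build_bipartite(transactions)
    if source not in by_business or target not in by_business:
        raise ValueError("source and target must both appear in the transactions")

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    total_hops = 0
    for _ in range(n_walks):
        current, hops = source, 0
        while hops < max_hops:
            customer = rng.choice(by_business[current])  # business -> customer
            current = rng.choice(by_customer[customer])  # customer -> business
            hops += 1
            if current == target:
                break
        total_hops += hops
    return total_hops / n_walks


# Toy data: c1 shops at A and B, c2 shops at B and C.
transactions = [("c1", "A"), ("c1", "B"), ("c2", "B"), ("c2", "C")]
print("A -> B:", estimate_separation(transactions, "A", "B"))
print("A -> C:", estimate_separation(transactions, "A", "C"))
```

On this toy graph the exact hitting times under the assumed definition are 2 hops from A to B and 8 hops from A to C, so the printed estimates should land close to those values. A and C share no customer, yet the walk still assigns them a finite separation through B, which is the kind of unobserved preference the abstract refers to. Each walk is independent of the others, so the sampling step could in principle be spread across workers.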