PinTrace: A distributed tracing pipeline
Distributed tracing is an emerging approach to monitoring distributed systems. Suman Karumuri shares the challenges of building and deploying distributed tracing at scale using PinTrace, one of the largest distributed tracing pipelines. Drawing on real-world examples, Suman explains how traces can be used to understand, debug, and optimize production workflows.
Talk Title | PinTrace: A distributed tracing pipeline |
Speakers | Suman Karumuri (Pinterest) |
Conference | O’Reilly Velocity Conference |
Conf Tag | Build Resilient Distributed Systems |
Location | San Jose, California |
Date | June 20-22, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Speed improves customer engagement. With the emergence of microservices, it is very common for a single customer interaction, such as loading the home page or querying a search endpoint, to invoke hundreds of calls to tens of backend services. In this multitenant environment, traditional monitoring and profiling tools can’t tell us why a specific request was slow. Distributed tracing is the only tool available today to trace a request across several systems. The gathered traces let you debug how a specific request is processed across services, see where it spent most of its time, and understand why it was slow.
Suman Karumuri outlines the architecture of PinTrace, a Zipkin-based distributed tracing infrastructure. Suman shares the challenges of instrumenting and deploying tracing at scale in a polyglot microservices architecture, a few examples of how Pinterest uses production traces to debug p99 latency issues and to identify unnecessary network calls and performance bottlenecks, and a few distributed tracing use cases beyond performance optimization.
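To make the idea of finding where a request spent most of its time concrete, here is a minimal sketch in Python. It is not PinTrace or Zipkin code: the `Span` fields are only loosely modeled on Zipkin's span model (ID, parent ID, timestamp, duration), the service names and timings are invented for illustration, and the calculation assumes that the children of a span run one after another rather than in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """One timed operation in a trace (hypothetical, simplified fields)."""
    span_id: str
    parent_id: Optional[str]  # None for the root span
    service: str
    name: str
    start_us: int             # start time in microseconds
    duration_us: int          # total time, including child spans


def self_times(spans: List[Span]) -> Dict[str, int]:
    """Return the time each span spent in itself, excluding its children.

    Assumes sibling spans do not overlap; with parallel children the
    subtraction would undercount, so the result is clamped at zero.
    """
    child_total: Dict[str, int] = defaultdict(int)
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] += span.duration_us
    return {
        span.span_id: max(span.duration_us - child_total[span.span_id], 0)
        for span in spans
    }


def slowest_span(spans: List[Span]) -> Span:
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda span: times[span.span_id])


# Invented example trace: one home page request fanning out to three services.
TRACE = [
    Span("a", None, "web", "GET /home", 0, 100_000),
    Span("b", "a", "feed", "get_feed", 5_000, 70_000),
    Span("c", "b", "storage", "fetch_pins", 10_000, 55_000),
    Span("d", "a", "ads", "get_ads", 80_000, 15_000),
]

if __name__ == "__main__":
    hot = slowest_span(TRACE)
    print(f"{hot.service}:{hot.name} self time {self_times(TRACE)[hot.span_id]} us")
```

In this made-up trace the root span takes the longest overall, but most of that time is inherited from its children; subtracting child durations points at the `storage` call instead, which is the kind of per-request attribution that aggregate metrics do not provide.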