January 7, 2020

How LinkedIn determines the capacity limits of its services using live traffic

Susie Xia and Anant Rao explain how LinkedIn leverages live production traffic to determine service and resource bottlenecks at scale with a tool called Redliner and how you can use your current architecture to do the same.


Talk Title	How LinkedIn determines the capacity limits of its services using live traffic
Speakers	Susie Xia (LinkedIn), Anant Rao (LinkedIn)
Conference	O’Reilly Velocity Conference
Conf Tag	Build resilient systems at scale
Location	New York, New York
Date	October 2-4, 2017

Modern web services like LinkedIn are made up of hundreds of microservices running in geographically distributed data centers. Each microservice must be allocated capacity carefully so that data center resources are used efficiently. However, it is hard to determine service capacity limits accurately and to provide resource allocation guidance for rapidly growing web services like LinkedIn, because traffic shape changes constantly, infrastructure characteristics are heterogeneous, and bottlenecks evolve. Susie Xia and Anant Rao explain how LinkedIn achieves automated capacity measurement and headroom analysis at scale via a system called Redliner. Redliner runs load tests by shifting live user traffic to target service instances in real production environments, which helps reduce data center costs, supports proactive capacity planning, and detects performance regressions during development cycles. Susie and Anant also share lessons learned in building and maintaining Redliner and tips on how you can use your current service-oriented architecture to do the same.
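As a rough illustration of the traffic-shifting idea in the abstract, the sketch below ramps the share of traffic sent to one target instance until a latency objective is breached, then reports the last healthy load level as that instance's capacity limit. This is not Redliner's actual algorithm or API, which the abstract does not describe. Every name here (`Sample`, `find_capacity_limit`, `simulated_instance`), the choice of p95 latency as the health signal, and the toy latency model are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sample:
    """One observation of the target instance at a given traffic weight."""
    qps: float             # queries per second the instance received
    p95_latency_ms: float  # observed 95th-percentile latency at that load


def find_capacity_limit(
    measure: Callable[[float], Sample],
    latency_slo_ms: float,
    start_weight: float = 1.0,
    step: float = 0.25,
    max_weight: float = 5.0,
) -> Optional[Sample]:
    """Raise the traffic weight step by step until the latency objective
    is breached, and return the last sample that stayed within it.

    `measure(weight)` is a hypothetical hook: in a real setup it would
    shift that share of live traffic to the instance and read back metrics.
    Returns None if even the starting weight breaches the objective.
    """
    last_healthy: Optional[Sample] = None
    weight = start_weight
    while weight <= max_weight:
        sample = measure(weight)
        if sample.p95_latency_ms > latency_slo_ms:
            break  # instance is past its limit; stop ramping
        last_healthy = sample
        weight += step
    return last_healthy


def simulated_instance(weight: float) -> Sample:
    """Toy stand-in for a real instance (assumed model, not measured data):
    100 QPS per unit of weight, latency rising sharply near 400 QPS."""
    qps = 100.0 * weight
    utilization = min(qps / 400.0, 0.99)
    return Sample(qps=qps, p95_latency_ms=20.0 / (1.0 - utilization))


if __name__ == "__main__":
    limit = find_capacity_limit(simulated_instance, latency_slo_ms=100.0)
    print(limit)
```

A production version would also need to restore the original traffic weights as soon as the objective is breached, since the load under test comes from real users.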

automated microservice infrastructure linkedin data center performance

Key big data architectural considerations for deploying in the cloud and on-premises (sponsored by NetApp)

January 1, 2020

When analytics applications become business critical, balancing cost with SLAs for performance, backup, dev, test, and recovery is difficult. Karthikeyan Nagalingam discusses big data architectural challenges and how to address them and explains how to create a cost-optimized solution for the rapid deployment of business-critical applications that meet corporate SLAs today and into the future.

Bulletproof your CI pipeline: Using APM to augment your automated performance testing (sponsored by AppDynamics)

December 16, 2019

As release velocity increases, teams are finding innovative ways to detect and resolve performance issues earlier in the development cycle. Brad Stoner explores how to implement an automated performance testing strategy and explains how leveraging APM (application performance management) tools can reduce time to market while increasing overall quality.

PinTrace: A distributed tracing pipeline

December 14, 2019

Distributed tracing is an emerging field of monitoring distributed systems. Suman Karumuri shares the challenges of building and deploying distributed tracing at scale using PinTrace, one of the largest distributed tracing pipelines. Drawing on real-world examples, Suman explains how traces can be used to understand, debug, and optimize your production workflows.

Scaling a user delivery network for real-time audience targeting

December 13, 2019

Adam Shepard peels back the covers on a user delivery network, a worldwide distributed data store powering over 80 billion transactions a day at millisecond speed. Join in to learn about eventually consistent data architectures, tiered and hybrid storage layers, and what it takes to manage that much data at scale.

Standing on the shoulders of giants: Unleashing the power of scriptable load balancers

December 13, 2019

Once reserved for companies large enough to write a load balancer from scratch, load balancer middleware can be a powerful tool for scaling applications. Emil Stolarsky and Justin Li explain how Shopify uses scriptable load balancers to solve difficult infrastructure problems, such as sharding across data centers, handling flash sales, and responding quickly to DDoS attacks.

Kubernetes in the Datacenter: Squarespace's Journey Towards Self-Service Infrastructure [I]

December 2, 2019

As Squarespace's engineering organization evolved, microservices became an obvious solution to quickly deliver new features and improve infrastructure reliability. We encountered significant challenge …