November 1, 2019

215 words 2 mins read

Consistent hashing, shuffle sharding, and copysets: Practical tools for controlling failure

Consistent hashing, shuffle sharding, and copysets: Practical tools for controlling failure

Sharding is our go-to tool for handling failures and load balancing, and yet we software engineers rarely think about the quirks of what seems like a largely solved problem or consider the possibility that we can improve on its basic application. Wes Chow reviews some workboth old and recenton controlling failures and adverse distributional effects.

Talk Title Consistent hashing, shuffle sharding, and copysets: Practical tools for controlling failure
Speakers Wes Chow (Cortico at MIT Media Lab)
Conference O’Reilly Software Architecture Conference
Conf Tag Engineering the Future of Software
Location New York, New York
Date April 11-13, 2016
URL Talk Page
Slides Talk Slides
Video

The hash function is the veritable hammer software engineers use to pound a large array of engineering problems into submission. Want to shard your database? Draw a key from your data, hash it, and voilà, instant deterministic load balancing. That’s simple enough, until you look more carefully at distributional effects, failure, and redundancy management. Wes Chow reviews well known (consistent hashing), not so well known (rendezvous hashing), and recent (shuffle sharding, copysets) work that goes a long way toward engineering more favorable failure scenarios. Wes not only covers old and new techniques but also digs a little into the math and attempts to intuit how we could have developed these algorithms.

comments powered by Disqus