Consistent Hashing: The Ring, the Algorithm, and Why It Wins

How the virtual ring is built from hash functions, why going clockwise is just a binary search, what happens when servers are added or removed, and proof that consistent hashing satisfies all five routing requirements.

April 26, 20266 min read3 / 5

Three intuitive algorithms all failed. The one requirement that broke each of them was either massive data movement on cluster changes or the need to synchronize shared state across Load Balancers. Consistent hashing solves both - by treating the hash output space as a ring rather than a line.

The Hash Function Foundation

Two properties define a hash function:

  1. Deterministic: same input always produces the same output. No hidden state, no randomness.
  2. Bounded output: the input can be anything - a string, a number, an object - but the output always falls within a fixed range.

A simple example: sum the ASCII values of all characters in a string, then take the result modulo 100. Whatever you pass in, you always get a number between 0 and 99. That is a valid hash function.

In practice you would use MurmurHash3, MD5, or SHA-256 - not for security, but because they distribute outputs more uniformly. A 64-bit hash function produces values between 0 and 2⁶⁴ − 1 - approximately 18 quintillion possible values.

Because modular arithmetic wraps around - after 99, you go back to 0 - the output space behaves like a ring. This is not just a metaphor. It is the geometric property that consistent hashing exploits directly.

Hash Function: Deterministic Output as a Ring ExpandHash Function: Deterministic Output as a Ring

Setting Up the Ring

Consistent hashing uses k + 1 hash functions, all sharing the same output space:

  • 1 hash function for users. Each user's ID is hashed once, placing them at one fixed spot on the ring. James always lands at the same position.
  • k hash functions for servers. Each server is hashed k times, placing it at k different spots on the ring.

Those server spots are virtual - not k physical copies of the machine, but k regions of responsibility spread across the ring. The typical value of k is 32 or 64.

Building the Algorithm

The ring is a useful mental model. Inside a computer, it is a sorted array.

For each server, run it through k hash functions. Each hash produces a number in the output range. Store each (hash_value, server_id) pair in an array, then sort by hash value.

With 4 servers (A, B, C, D) and k=3:

Plain text
After hashing all servers through 3 hash functions each: A → (23, A) (40, A) (81, A) B → ( 3, B) (17, B) (99, B) C → (12, C) (55, C) (87, C) D → (20, D) (30, D) (63, D) After sorting: [ (3,B), (12,C), (17,B), (20,D), (23,A), (30,D), (40,A), (55,C), (63,D), (81,A), (87,C), (99,B) ]

Consistent Hashing Ring: Virtual nodes distributed across the ring ExpandConsistent Hashing Ring: Virtual nodes distributed across the ring

Routing a Request

When a request arrives from user Neal:

  1. Hash Neal's user ID with the user hash function. Suppose it produces 33.
  2. Binary search the sorted array for the first entry with hash value >= 33.
  3. The first entry >= 33 is (40, A). Neal's request goes to Server A.

"Walking clockwise in the ring" is identical to "finding the upper bound in the sorted array." Binary search gives this in O(log n·k) time.

If the user's hash exceeds the largest entry, wrap around to index 0 - the ring continues from the beginning.

Because hash functions are deterministic, Neal always hashes to 33. Every Load Balancer runs the same code with the same hash functions and the same k value. Each independently discovers the same live servers via health checks, builds the identical sorted array, and arrives at the identical routing decision. No coordination needed. No shared state.

What Happens When Servers Change

Adding a server: Server X gets k virtual spots distributed across the ring. Each spot falls in an arc previously owned by some other server. Users in those arcs now route to X instead. Everyone else is unaffected. Because X's spots scatter across the ring, it absorbs proportional load from multiple existing servers - not all from one.

Removing a server (or crash): When Server C goes down, all k of its virtual spots are removed. Users pointing at those spots now route to whichever server is next clockwise. Since C's spots were spread across the ring, those users fan out to multiple different servers rather than piling onto one.

Consistent Hashing: Server Changes and Data Movement ExpandConsistent Hashing: Server Changes and Data Movement

Why k Matters: Cascading Failure Prevention

With k=1 (one spot per server): when Server C crashes, all of C's users pile onto one neighbour. That server gets overwhelmed and crashes. Its users pile onto the next. Everything falls in sequence - cascading failure.

With k=64, C's load redistributes across the entire cluster. No single server absorbs a disproportionate surge. The system degrades gracefully instead of collapsing.

Higher k also improves baseline load distribution. Hash outputs look random even though they are deterministic - with k=1, you might get very unequal arc sizes purely by chance. With k=64, the law of large numbers kicks in: 64 independent positions average out toward uniform distribution.

The Full Comparison

RequirementRound-RobinMapping TableConsistent Hashing
FastYesYesYes - O(log n·k)
Equal distributionYesYesYes with k >= 32
Add/remove freelyNoYesYes - automatic
Minimal data movementNoYesYes - only affected users
No sync between LBsYesNoYes - ring is deterministic

Consistent hashing is the only algorithm that checks every box. This is why it is the standard approach for distributed routing - and why it comes up in nearly every large-scale system design interview.

[!TIP] Quick log₂ estimation: 2^10 ≈ 1,000, 2^20 ≈ 1,000,000, 2^30 ≈ 1,000,000,000. Memorise these three and you can estimate any log₂ value in seconds - it comes up constantly in system design interviews.

The Essentials

  1. The ring is a sorted array - "walking clockwise" is a binary search for the next virtual node, which runs in O(log n·k) time and requires no coordination between Load Balancers.
  2. Virtual nodes (k) control both distribution and failure resilience - with k=1, a server crash dumps all load onto one neighbour and risks cascading failure; with k=64, the load fans out evenly across the cluster.
  3. Consistent hashing satisfies all five routing requirements - fast, evenly distributed, flexible with cluster changes, minimal data movement, and fully deterministic without shared state.

Further Reading and Watching

The algorithm is solid. The next question is what the broader system architecture looks like when you apply it - and how real databases like Postgres, MongoDB, and Cassandra handle sharding differently.

Practice what you just read.

Quiz: Consistent Hashing
1 exercise