Consistent Hashing
Virtual nodes, hot-spot avoidance, rebalancing on node changes
Consistent hashing maps both data keys and server nodes onto a common hash ring (typically 0 to 2³²-1), so that only K/N keys need to be remapped when a node joins or leaves (K = total keys, N = nodes), compared to K keys in modulo hashing. Virtual nodes (vnodes) assign each physical server multiple positions on the ring, achieving statistical load balance and allowing heterogeneous capacity: a server with 2x RAM gets 2x the vnodes. Consistent hashing is foundational to Cassandra, DynamoDB, Riak, and distributed caches like Memcached with ketama.
Key Points
- Naive consistent hashing with one token per node causes uneven distribution (hot spots); 150–200 vnodes per physical node reduces maximum load imbalance to ±10% in practice.
- When a node is added, only its neighbors on the ring must stream data to it; when removed, its neighbors absorb its key ranges — rebalancing is incremental, not global.
- Cassandra's Murmur3Partitioner hashes partition keys onto a 64-bit ring (-2^63 to 2^63-1); vnodes are configured via num_tokens (default 16 in Cassandra 4.0, was 256 in earlier versions).
- Hot-spot avoidance: skewed key distributions (all writes to key "user:1") are not solved by consistent hashing alone — use request coalescing, application-level sharding of the hot key, or write-through cache.
- Replication factor (RF=3) means each token range is stored on the next 3 nodes clockwise on the ring; coordinator uses a replica placement strategy (NetworkTopologyStrategy for multi-DC).
- Load balancing in consistent hashing can be monitored with Cassandra's nodetool status; nodetool decommission triggers streaming of the leaving node's data to successors.
- Rendezvous hashing (highest random weight hashing) is an alternative: each client computes hash(key, node) for every node and routes to the highest score — simpler but O(N) per lookup.
- Jump consistent hash (Lamping & Veach, 2014) produces perfectly balanced buckets in O(ln N) time with minimal remapping but requires knowing N in advance.
Consistent hashing ring with 4 physical nodes (N0–N3) and virtual nodes (green dots). Each colored arc represents a node's key ownership range. Adding a node only affects its immediate neighbors.
Real-World Example
Amazon DynamoDB uses consistent hashing internally with automatic partition splitting when a partition exceeds 10 GB or 3000 RCU/1000 WCU; the partition key space is continuously repartitioned. Cassandra's virtual node architecture allows a new node to join and gradually absorb load from multiple existing nodes simultaneously rather than from a single neighbor.