Load balancing algorithms determine how incoming requests are distributed across a pool of backend servers, directly affecting latency, resource utilization, and session affinity. The right algorithm depends on backend homogeneity, session requirements, and the computational cost per request. Consistent hashing is a widely used choice for caches and stateful services because it minimizes key remapping when servers are added to or removed from the pool; it is used by Memcached client libraries (for example, ketama), Cassandra, Varnish, and the NGINX upstream hash module.
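
To make the remapping property concrete, here is a minimal Python sketch of a hash ring. It is not the implementation used by any of the systems named above: the `HashRing` class, the `vnodes` parameter, the MD5-based `_hash` helper, and the `cache-*` server names are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit ring position (MD5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each server is placed at `vnodes` positions."""

    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # position already taken (extremely unlikely)
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First position clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("cache-e")
moved = sum(before[k] != ring.lookup(k) for k in keys)
print(f"{moved / len(keys):.1%} of keys moved after adding a fifth server")
```

With five servers the moved fraction should land near 1/5, and every moved key goes to the new server. Using more virtual nodes per server tightens the spread at the cost of a larger ring.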

Key Points

  • Round Robin: requests are distributed sequentially across servers; simple, and works well when all servers have equal capacity and request cost is uniform.
  • Weighted Round Robin: servers with higher weight receive proportionally more requests; useful for heterogeneous server pools (different CPU/RAM).
  • Least Connections: route to the server with the fewest active connections; better than round robin for variable-duration requests (e.g., WebSockets, long-polls).
  • IP Hash: hash the client IP to select a server; provides session affinity without shared session storage, but many clients behind one NAT or proxy share an IP and all land on the same server, and a client whose IP changes loses affinity.
  • Consistent Hashing: map servers and request keys onto a ring; each request routes to the nearest server clockwise from its key, so adding or removing one of N servers remaps only about 1/N of keys on average.
  • Random with Two Choices (Power of Two): pick 2 random servers and route to the less loaded one; with N requests placed on N servers the maximum load is O(log log N) with high probability, versus O(log N / log log N) for a single random choice.
  • NGINX upstream: round robin is the default and needs no directive; least_conn, ip_hash, and hash (optionally with the consistent parameter) select other methods. HAProxy supports additional algorithms, including rdp-cookie for Windows RDP affinity.
  • Health checks: load balancers remove unhealthy backends from rotation; active health checks (e.g., HTTP GET /health every 5s) vs. passive checks (monitoring error responses on live traffic).
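
The weighted round robin bullet above can be sketched as follows. This is one possible "smooth" variant that interleaves heavy and light servers instead of sending a burst to the heaviest one; the class name, the server names, and the weights are hypothetical.

```python
from typing import Mapping


class SmoothWeightedRoundRobin:
    """Pick servers in proportion to their weights, interleaving the picks."""

    def __init__(self, weights: Mapping[str, int]):
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty map of positive integers")
        self._weights = dict(weights)
        self._current = {server: 0 for server in weights}
        self._total = sum(weights.values())

    def next(self) -> str:
        # Every server earns its weight each round...
        for server, weight in self._weights.items():
            self._current[server] += weight
        # ...the server with the highest running score is chosen...
        chosen = max(self._current, key=self._current.get)
        # ...and pays back the total weight so the others can catch up.
        self._current[chosen] -= self._total
        return chosen


lb = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
print([lb.next() for _ in range(7)])
# ['a', 'a', 'b', 'a', 'c', 'a', 'a']
```

With all weights equal to 1 the same class degenerates to plain round robin.
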

| Algorithm | Best Use Case | Session Stickiness | CPU Cost | Rebalancing on Change |
|---|---|---|---|---|
| Round Robin | Stateless, uniform requests | None | Very Low | Full redistribution |
| Least Connections | Variable request duration (WebSockets, streaming) | None | Low | Automatic (connection count-based) |
| IP Hash | Session affinity without Redis | Yes (by IP) | Low | Partial (IP remaps) |
| Consistent Hashing | Cache/stateful services, distributed caching | Yes (by key) | Medium | Minimal (1/N remaps) |
| Random Two Choices | High-throughput, near-optimal balance | None | Very Low | Automatic |
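
The "near-optimal balance" claim for Random Two Choices can be checked with a small simulation. The sketch below is an illustrative balls-into-bins model, not a production balancer: the `max_load` helper, the seed, and the pool size are assumptions, and "load" here is simply the number of requests assigned so far.

```python
import random


def max_load(n_servers, n_requests, choices, rng):
    """Assign requests by sampling `choices` servers and picking the least loaded."""
    load = [0] * n_servers
    for _ in range(n_requests):
        # Sample with replacement; the two candidates may occasionally coincide.
        candidates = [rng.randrange(n_servers) for _ in range(choices)]
        target = min(candidates, key=load.__getitem__)
        load[target] += 1
    return max(load)


rng = random.Random(42)
n = 10_000
one_choice = max_load(n, n, choices=1, rng=rng)
two_choices = max_load(n, n, choices=2, rng=rng)
print(f"max load with 1 random choice:  {one_choice}")
print(f"max load with 2 random choices: {two_choices}")
```

The busiest server under two choices should carry noticeably fewer requests than under a single random choice, for the cost of one extra load lookup per request.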

Real-World Example

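AWS Application Load Balancers default to round robin and can optionally use Least Outstanding Requests (a variant of Least Connections) per target group. NGINX and NGINX Plus offer consistent hashing through the hash directive's consistent parameter, which is commonly used in front of upstream caches. Twitch is reported to use consistent hashing to route viewers of a given stream to the same edge transcoding server.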
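
As a rough illustration of the least-outstanding-requests idea, here is a minimal Python sketch. It does not reflect how any of the products above are implemented; the `LeastOutstanding` class and its `acquire`/`release` methods are hypothetical names, and the sketch is single-threaded (a real balancer would need locking or atomic counters).

```python
class LeastOutstanding:
    """Route each request to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self._inflight = {server: 0 for server in servers}
        if not self._inflight:
            raise ValueError("at least one server is required")

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self._inflight, key=self._inflight.get)
        self._inflight[server] += 1
        return server

    def release(self, server):
        # Call when the response completes or the connection closes.
        if self._inflight[server] <= 0:
            raise RuntimeError(f"no outstanding requests on {server}")
        self._inflight[server] -= 1


lb = LeastOutstanding(["app-1", "app-2"])
first = lb.acquire()   # app-1 (tie, first listed)
second = lb.acquire()  # app-2 (app-1 already has one in flight)
lb.release(first)      # app-1 finishes early
third = lb.acquire()   # app-1 again, since app-2 is still busy
print(first, second, third)
```

Because the counter only drops when a request finishes, slow requests naturally steer new traffic away from the servers handling them.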