Cache invalidation is notoriously described as one of the two hard problems in computer science (along with naming things). The challenge is ensuring cached data accurately reflects the source of truth without either serving stale data or excessive invalidation that eliminates cache benefits. Strategies span from simple TTL-based expiry to event-driven invalidation, versioned cache keys, and probabilistic techniques for stampede prevention. The correct strategy depends on acceptable staleness, write frequency, and whether invalidation events are observable.

Key Points

  • TTL (Time-To-Live): simplest approach — every cache entry expires after a fixed duration; staleness window equals TTL, e.g., 60s is acceptable for product catalog, 0s for bank balance.
  • Event-driven invalidation: write operations publish invalidation events (Kafka, Redis Pub/Sub); cache listeners delete or update affected keys immediately — near-zero staleness.
  • Versioned keys: embed version or timestamp in cache key (`user:123:v7`) — old keys are simply never read, no explicit invalidation needed; requires garbage collection of orphaned keys.
  • Write-through invalidation: on DB write, atomically delete cache key (cache-aside + write invalidation) — simpler than event-driven, synchronous cost on write path.
  • Cache stampede (thundering herd): when a popular key expires, thousands of simultaneous requests miss and all query DB — prevented by mutex lock, early probabilistic expiry, or background refresh.
  • Probabilistic early expiry (XFetch): cache entry self-expires slightly before TTL with probability proportional to recomputation time — smooths expiry spikes for expensive computations.
  • Distributed invalidation: in multi-region setups, invalidation events must propagate to all regions — use global Pub/Sub (Redis Global) or CDN tag-based purge APIs.
  • Tag-based invalidation: Fastly and Varnish support surrogate keys — a single API call invalidates all cache entries tagged with e.g., `product:456`, enabling surgical invalidation.

Real-World Example

Facebook's Memcache invalidation uses a "Mcrouter" proxy that intercepts DB writes and issues cache deletes atomically via a two-phase commit protocol. Their "look-aside" invalidation handles 1 billion invalidation events per day across 1,000+ Memcache servers.