CAP Applied to Databases
Choosing CP vs AP databases for specific use cases
The CAP theorem states that a distributed system can guarantee at most two of three properties at once: Consistency (every read sees the most recent write), Availability (every request to a non-failing node receives a non-error response), and Partition Tolerance (the system continues operating when messages between nodes are dropped or delayed). Since network partitions are unavoidable in real distributed systems, the practical choice arises while a partition lasts: CP (reject or block some requests to preserve consistency) or AP (serve stale or conflicting data rather than refusing requests). The PACELC model extends this by also considering the latency vs consistency trade-off when there is no partition: if Partitioned, choose Availability or Consistency; Else, choose Latency or Consistency.
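
The CP vs AP choice described above can be sketched with a toy model. This is an illustration only, not how any real database is implemented; the `Replica` class, the `mode` labels, and the `demo` helper are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised when a CP replica refuses a request during a partition."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP" (labels for this toy model only)
    value: str = "v0"          # last value this replica has seen
    partitioned: bool = False  # True when cut off from the majority side

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # CP: refuse rather than risk returning stale data.
            raise Unavailable("cannot reach a quorum")
        # AP (or no partition): answer from local state, which may be stale.
        return self.value


def demo(mode: str) -> str:
    """Read from a replica that is partitioned away from newer writes."""
    replica = Replica(mode=mode)
    replica.partitioned = True
    # Assume the majority side has meanwhile accepted a write ("v1")
    # that this replica never received.
    try:
        return replica.read()
    except Unavailable:
        return "error"
```

Here `demo("CP")` returns `"error"` (availability is sacrificed), while `demo("AP")` returns the stale `"v0"` (consistency is sacrificed).
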
Key Points
- CP databases (ZooKeeper, etcd, HBase, Spanner) reject or block reads/writes they cannot serve consistently during a partition rather than serve stale data; suitable for financial ledgers, inventory counts.
- AP databases (Cassandra, DynamoDB, CouchDB, Riak) remain available during partitions but may return stale or conflicting data, requiring conflict resolution (LWW, vector clocks, CRDTs).
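- Cassandra's consistency is tunable per operation: QUORUM on both reads and writes satisfies R + W > N, so every read overlaps the latest acknowledged write (strong consistency); ONE gives eventual consistency. Cassandra is therefore not purely AP in practice.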
- PostgreSQL on a single node is usually labeled CA (consistency + availability) because there is no partition to tolerate; adding synchronous streaming replication makes it behave as CP, since commits on the primary block while the synchronous standby is unreachable.
- DynamoDB is AP by default with eventually consistent reads (served from any replica); strongly consistent reads (CP) are available at 2x the read capacity unit cost.
- MongoDB with majority write concern + majority read concern is effectively CP; with primary reads and w:1 (the default write concern before MongoDB 5.0), acknowledged writes can be rolled back after a failover and reads can return stale data.
- PACELC for Cassandra: PA/EL — during Partition chooses Availability; during normal operation chooses Low Latency over Consistency.
- Linearizability (the strongest consistency model) requires every operation to appear to take effect atomically at a single instant between its start and its completion; achievable with consensus (Raft/Paxos) at a latency cost.
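
The R + W > N rule from the Cassandra bullet is simple arithmetic, and a small check can make it concrete. The sketch below assumes a replication factor `n` and per-operation replica counts `r` and `w`; the function names are made up for this example and are not part of any driver API.

```python
def quorum_size(n: int) -> int:
    """Majority quorum for n replicas: floor(n / 2) + 1."""
    return n // 2 + 1


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


n = 3
q = quorum_size(n)                      # 2 of 3 replicas
quorum_both = quorums_overlap(n, q, q)  # QUORUM reads + QUORUM writes
one_both = quorums_overlap(n, 1, 1)     # ONE reads + ONE writes
all_write = quorums_overlap(n, 1, n)    # ONE reads + ALL writes
```

With three replicas, QUORUM/QUORUM overlaps (2 + 2 > 3) and so does ONE/ALL (1 + 3 > 3), while ONE/ONE does not (1 + 1 is not greater than 3), which is why ONE only gives eventual consistency. Overlap is the arithmetic condition only; it does not by itself make concurrent operations linearizable.
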
| Database | CAP Class | Consistency Model | Notes |
|---|---|---|---|
| PostgreSQL (single node) | CA | MVCC; Read Committed by default, up to Serializable | Not partition-tolerant by design; add Patroni for HA |
| PostgreSQL (sync replica) | CP | Strong (synchronous replication) | Primary blocks writes if sync standby is down |
| MySQL InnoDB Cluster | CP | Strong (Group Replication, majority) | Loses availability if majority of nodes fail |
| Amazon Aurora | CP | Read-after-write consistency | Quorum-based storage; 6 copies across 3 AZs |
| Apache Cassandra | AP (tunable) | Eventual / tunable to strong | QUORUM reads+writes give strong consistency |
| Amazon DynamoDB | AP (tunable) | Eventual (default) / Strong (opt-in) | Global Tables use last-writer-wins |
| MongoDB (majority) | CP | Causally consistent sessions | Majority write+read concern required |
| Apache HBase | CP | Strong (each region served by one RegionServer; ZooKeeper coordination) | Prefers consistency over availability |
| CockroachDB | CP | Serializable (distributed) | Raft consensus per range; no stale reads |
| Redis Cluster | AP | Eventual (async replication) | Can lose acknowledged writes on failover |
| Apache ZooKeeper | CP | Sequential consistency | Refuses writes if quorum unavailable |
| etcd | CP | Linearizable | Raft consensus; used by Kubernetes for state |
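
The conflict-resolution strategies mentioned for AP stores (last-writer-wins in the DynamoDB row, vector clocks in the key points) can be contrasted in a few lines. This is a minimal sketch under simplifying assumptions: versions are plain `(timestamp, value)` tuples, vector clocks are dicts from node name to counter, and the function names are hypothetical.

```python
from typing import Dict, Tuple

VectorClock = Dict[str, int]


def lww_merge(a: Tuple[float, str], b: Tuple[float, str]) -> Tuple[float, str]:
    """Last-writer-wins: keep the (timestamp, value) pair with the larger timestamp."""
    return a if a[0] >= b[0] else b


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for two vector clocks."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: a real conflict to resolve
```

For example, `lww_merge((1.0, "x"), (2.0, "y"))` keeps `(2.0, "y")` and silently discards `"x"`, whereas `compare({"a": 2, "b": 0}, {"a": 1, "b": 1})` returns `"concurrent"`, surfacing the conflict so the application (or a CRDT merge) can handle it.
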
Real-World Example
Discord chose Cassandra (AP) for message storage (it later migrated to ScyllaDB, a Cassandra-compatible store) because showing a message a few milliseconds late is acceptable, but refusing to deliver messages is not. Banks typically use CP databases (for example Oracle RAC, CockroachDB) for account balances because showing a stale balance could enable overdrafts.