Replication | Databases & Data Architecture | System Design

Database replication asynchronously or synchronously copies data from a primary (leader) to one or more replicas (followers) to achieve read scalability, geographic distribution, and high availability. The core trade-off is replication lag: asynchronous replication is faster but allows replicas to serve stale data; synchronous replication guarantees consistency but increases write latency and reduces availability if a replica is slow or unreachable. Multi-master replication allows writes on multiple nodes, requiring conflict detection and resolution strategies.

Key Points

Asynchronous replication (MySQL binlog, PostgreSQL streaming): primary commits before replica acknowledges; replication lag can range from milliseconds to minutes under load.
Synchronous replication (PostgreSQL synchronous_commit=on): primary waits for at least one replica to flush WAL before returning; adds ~RTT latency to each write but guarantees no data loss on primary failure.
Read replicas absorb 60–80% of read traffic in typical OLTP systems; route SELECT queries to replicas, INSERT/UPDATE/DELETE to primary via application-level or proxy-level routing (ProxySQL, pgBouncer).
Replication lag monitoring: track seconds_behind_master (MySQL) or pg_stat_replication.replay_lag (PostgreSQL); alert at >30 seconds for most OLTP applications.
Multi-master (active-active): Galera Cluster (PXC), CockroachDB, Spanner allow writes to any node; conflicts are resolved by last-writer-wins or by aborting one transaction.
Conflict-free Replicated Data Types (CRDTs) enable multi-master replication without conflicts for specific data types: counters, sets, registers — used in Riak and DynamoDB DAX.
Failover with auto-promotion: Patroni (PostgreSQL), MHA (MySQL), Orchestrator manage leader election using distributed consensus (etcd, ZooKeeper) to avoid split-brain.
Logical replication (PostgreSQL logical decoding, MySQL row-based binlog) enables selective table replication, cross-version replication, and CDC pipelines to Kafka via Debezium.

Real-World Example

GitHub uses MySQL with Orchestrator for automatic failover; they maintain 3 replicas per primary and promote the most up-to-date replica during failover, tolerating up to 30 seconds of lag. Cloudflare replicates their core PostgreSQL database synchronously within a region (sub-5ms RTT) and asynchronously across regions for disaster recovery.

←PreviousSharding & Partitioning NextOLTP vs OLAP→