The Transactional Outbox pattern solves the dual-write problem: a service cannot atomically write to its database and publish a message to a broker without a distributed transaction. Instead, the service inserts the event into an outbox table in the same local database transaction as the business entity update, so the state change and the event either both commit or both roll back. A separate relay process reads unpublished outbox records, publishes them to the message broker, and then marks them as published. Debezium implements the relay with CDC (Change Data Capture): it tails the database transaction log (PostgreSQL WAL, MySQL binlog) to detect outbox inserts without polling the table.
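
The write side described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes SQLite (so that it runs without a server), a hypothetical orders table, and a hypothetical place_order function; the outbox columns follow the list under Key Points. The simulate_crash flag exists only to show that a failure between the two inserts rolls both back.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Hypothetical schema: one business table plus the outbox table.
SCHEMA = """
CREATE TABLE orders (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL
);
CREATE TABLE outbox (
    id             TEXT PRIMARY KEY,  -- event UUID
    aggregate_type TEXT NOT NULL,
    aggregate_id   TEXT NOT NULL,
    event_type     TEXT NOT NULL,
    payload        TEXT NOT NULL,     -- JSON document
    created_at     TEXT NOT NULL,
    published_at   TEXT               -- NULL until a relay has delivered the event
);
"""


def place_order(conn, order_id, simulate_crash=False):
    """Insert the order and its outbox event in one local transaction."""
    event_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)",
            (order_id, "PLACED"),
        )
        if simulate_crash:
            raise RuntimeError("simulated failure between the two writes")
        conn.execute(
            "INSERT INTO outbox (id, aggregate_type, aggregate_id, event_type,"
            " payload, created_at) VALUES (?, ?, ?, ?, ?, ?)",
            (event_id, "Order", order_id, "OrderPlaced",
             json.dumps({"orderId": order_id, "status": "PLACED"}),
             datetime.now(timezone.utc).isoformat()),
        )
    return event_id


def count(conn, table):
    """Row count helper used by the demo below."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    place_order(conn, "order-1")
    try:
        place_order(conn, "order-2", simulate_crash=True)
    except RuntimeError:
        pass  # the order-2 row was rolled back together with its missing event
    print(count(conn, "orders"), count(conn, "outbox"))
```

Because both inserts share one transaction, there is no state in which an order exists without its event or an event exists without its order; publishing to the broker is left entirely to the relay.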

Key Points

  • Dual-write problem: if you commit to the DB and then publish to Kafka, a crash between the two steps leaves the DB updated but the event never published; if you publish first and then commit, a crash between the two leaves a published event for a DB change that never happened. Either ordering can leave the two systems inconsistent.
  • Outbox table columns: id (UUID), aggregate_type, aggregate_id, event_type, payload (JSON/Avro), created_at, published_at (nullable) — the relay sets published_at after successful broker delivery.
  • Polling relay (simplest implementation): a background job runs SELECT * FROM outbox WHERE published_at IS NULL ORDER BY created_at LIMIT 100 every second, publishes each row in the batch, and sets published_at on the rows that were delivered. It is easy to build but adds query load to the database and delays delivery by up to one polling interval.
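  • Debezium CDC relay: Debezium tails the PostgreSQL WAL (through a logical replication slot) or the MySQL binlog, detects INSERTs on the outbox table, and publishes them to Kafka. Delivery is at-least-once by default; exactly-once delivery depends on Kafka Connect's exactly-once source support (Kafka 3.3+) and a Debezium connector version that supports it.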
  • Idempotent consumer still required: the relay provides at-least-once delivery (it may publish duplicates on restart); consumers must deduplicate using the outbox event ID or a processed-events table.
  • Outbox table cleanup: archive or delete published rows older than N days to prevent unbounded growth; partition the outbox table by date for efficient cleanup.
  • Aggregate ID routing: by default, Debezium's outbox event router picks the Kafka topic from the aggregate type and uses the aggregate ID as the message key, so all events for the same order hash to the same partition, preserving per-aggregate ordering.
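  • Alternative to the outbox table: write the event straight into the write-ahead log and capture it through Postgres logical replication (for example with pg_logical_emit_message). This avoids the extra table write but couples the design tightly to one DB engine; the outbox table is portable across engines.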
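
The polling relay and the idempotent consumer from the list above can be sketched together. This is an illustrative sketch under several assumptions: SQLite stands in for the service database, publish is a hypothetical callable standing in for a real broker client, and relay_once, handle_once, and the processed_events table are names invented for the example. It assumes a single relay instance; running several would need row locking, which is not shown.

```python
import sqlite3
from datetime import datetime, timezone

OUTBOX_DDL = """
CREATE TABLE outbox (
    id             TEXT PRIMARY KEY,
    aggregate_type TEXT NOT NULL,
    aggregate_id   TEXT NOT NULL,
    event_type     TEXT NOT NULL,
    payload        TEXT NOT NULL,
    created_at     TEXT NOT NULL,
    published_at   TEXT
)
"""

# Consumer-side table of event IDs that have already been handled.
PROCESSED_DDL = "CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)"


def relay_once(conn, publish, batch_size=100):
    """Publish one batch of unpublished rows, oldest first. Returns the batch size."""
    rows = conn.execute(
        "SELECT id, aggregate_id, event_type, payload FROM outbox"
        " WHERE published_at IS NULL ORDER BY created_at LIMIT ?",
        (batch_size,),
    ).fetchall()
    for event_id, aggregate_id, event_type, payload in rows:
        # `publish` is a stand-in for a broker client; the aggregate ID is the key.
        publish({"key": aggregate_id, "event_id": event_id,
                 "event_type": event_type, "payload": payload})
        # A crash at this point leaves published_at NULL, so the row is sent
        # again on the next poll: delivery is at-least-once, not exactly-once.
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = ? WHERE id = ?",
                (datetime.now(timezone.utc).isoformat(), event_id),
            )
    return len(rows)


def handle_once(conn, event_id, apply):
    """Idempotent consumer: run `apply` only for event IDs not seen before."""
    with conn:  # commits on success, rolls back if `apply` raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery, skip it
        apply()
    return True
```

A caller would invoke relay_once on a timer (the text suggests once per second) and call handle_once from the consumer's message handler, passing the outbox event ID carried in the message. The deduplication is only atomic when the side effects of apply are written to the same database as processed_events.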

Real-World Example

Debezium, an open-source project that originated at Red Hat, is a general-purpose CDC platform rather than an outbox-specific tool, but it ships an outbox event router for this pattern and is widely used for CDC-based event publishing; it is reported to run at scale at companies such as Uber, WePay, and Booking.com. Eventuate Tram is an open-source framework that provides outbox pattern support for Java microservices, managing the outbox table schema, relay, and consumer deduplication.