Event Streaming (Kafka) | Integration & APIs | System Design

Apache Kafka is a distributed, durable, high-throughput event streaming platform that treats data as an ordered, immutable log rather than a queue of messages to be consumed and deleted. Kafka decouples producers and consumers through topics divided into partitions, where each partition is an ordered, replicated log stored on disk. Consumer groups enable horizontal scaling: each partition is assigned to exactly one consumer in a group, but the same partition can be consumed by multiple groups independently — enabling fan-out at scale without duplication.

Key Points

Topic partitions: a topic with P partitions can be consumed by at most P consumers in a single consumer group; adding more consumers than partitions leaves some consumers idle.
Partition assignment: Kafka's coordinator assigns partitions to consumers on group join; rebalancing triggers when a consumer joins, leaves, or crashes — causes a brief pause in consumption (stop-the-world).
Offset management: consumers track their position per partition using offsets; committed to __consumer_offsets topic; earliest (replay all) vs latest (skip history) vs specific offset.
Producer acknowledgment modes: acks=0 (no guarantee), acks=1 (leader written), acks=all (all ISR replicas written) — acks=all + min.insync.replicas=2 ensures no data loss on leader failure.
Retention policies: time-based (delete.retention.ms, default 7 days), size-based (retention.bytes), or log compaction (keep latest value per key) — mix policies per topic based on use case.
Kafka consumer lag: the difference between the latest produced offset and the last committed consumer offset per partition; monitor with kafka-consumer-groups.sh or Burrow; alert when lag grows unbounded.
Kafka Streams vs Apache Flink: Kafka Streams is a Java library (no separate cluster, runs in-process); Flink is a separate cluster with richer windowing, ML operator support, and better checkpointing for large state.
Kafka Connect: source connectors (Debezium CDC from MySQL/PostgreSQL, S3 source) and sink connectors (Elasticsearch, S3, JDBC) manage data movement without custom code.

Kafka architecture: Producers write to topic partitions on the broker. Consumer Group A has 3 consumers (one per partition). Consumer Group B independently consumes the same topic with 2 consumers. Each group maintains its own offset per partition.

Real-World Example

LinkedIn processes over 7 trillion messages per day through Kafka, using it as the backbone for their activity data pipeline, metrics, and log aggregation. Uber uses Kafka for real-time driver location updates (millions of GPS events/second), routing events through multiple consumer groups: dispatch, ETA computation, and surge pricing each independently consume the same location topic.

←PreviousMessage Queues NextPub/Sub Pattern→