A Kafka partition is the unit of parallelism, ordering, and replication inside a topic. Every Kafka topic is split into one or more partitions; each partition is an ordered, append-only commit log of records stored on disk and replicated across brokers. Producers append to partitions in parallel and consumers in a group divide partitions among themselves - so a topic's partition count is the ceiling on its read and write throughput.
How a Kafka Partition Works
Each partition is a sequence of records with strictly increasing, gap-free offsets starting at 0. One broker is the leader for a partition; the others holding a replica are followers. Producers send to the leader, the leader appends to its local log, followers fetch the new records and append to theirs, and once min.insync.replicas followers have caught up the write is considered committed.
Topic: orders (partitions=3, replication.factor=3)
┌──────────────────────────────────────────────┐
│ Partition 0: leader=broker1 followers=2,3 │
│ Partition 1: leader=broker2 followers=1,3 │
│ Partition 2: leader=broker3 followers=1,2 │
└──────────────────────────────────────────────┘
Ordering is guaranteed within a partition and only within a partition. Across partitions, Kafka makes no claim about which message was produced first. To preserve ordering for related events (all orders for customer 42, all updates to row 99), the producer hashes a key and routes all records with that key to the same partition.
The default partitioner is deterministic: partition = murmur2(key) % num_partitions for keyed records, and a sticky round-robin for null-key records (with BuiltInPartitioner from Kafka 2.4+, which batches messages to a single partition until the batch flushes, then switches). Producers can supply a custom Partitioner to implement different routing logic.
Partition Configuration
| Setting | Scope | Default | What it controls |
|---|---|---|---|
num.partitions |
broker | 1 |
Default partition count when a topic is auto-created |
partitions |
topic | (set on create) | Partition count for a specific topic |
replication.factor |
topic | (set on create) | Replicas per partition; 3 in production |
min.insync.replicas |
topic | 1 |
Minimum in-sync replicas required for an acks=all write to succeed |
unclean.leader.election.enable |
topic | false |
Allow out-of-sync replica to be leader (data loss risk) |
replica.lag.time.max.ms |
broker | 30000 |
A follower is removed from ISR after this long without catching up |
partition.assignment.strategy |
consumer | RangeAssignor,CooperativeStickyAssignor |
How partitions are distributed across group members |
Pairing acks=all on the producer with min.insync.replicas=2 (or 3 with replication factor 3) is what gives a partition durable writes. Setting one without the other leaves a silent durability gap.
How Many Partitions Should a Topic Have?
This is the single most-argued Kafka design question. Rules of thumb that have aged well:
- Parallelism ceiling: a consumer group can have at most one consumer per partition processing simultaneously. Extras sit idle. If your peak concurrency target is 24 consumers, you need at least 24 partitions.
- Throughput target: most production deployments measure 10-50 MB/s of sustained throughput per partition. Divide your projected peak throughput by that to estimate the floor.
- Replication cost: each partition has overhead per replica - file handles, fetcher threads, metadata in the controller. With ZooKeeper-era Kafka the practical cluster ceiling was ~200,000 partitions. KRaft mode has demonstrated 2-million-partition clusters and 600,000-partition single brokers in lab benchmarks, but file-descriptor limits, replica-fetcher count, and metadata RAM still constrain real deployments.
- Future growth: you can increase partitions later but never decrease them. After an increase, keys hash to different partitions, so per-key ordering breaks across the change point.
A reasonable default for a new topic with no extreme throughput target is 6-12 partitions on a 3-broker cluster. Right-size after one quarter of running with real traffic rather than trying to predict.
Partition Leader Election and ISR
The in-sync replica set (ISR) is the set of followers that are caught up to the leader within replica.lag.time.max.ms. If the leader dies, the controller elects a new leader from the ISR. With unclean.leader.election.enable=false (the default and recommended) the cluster prefers unavailability over data loss - if the ISR is empty, the partition goes offline. With unclean.leader.election.enable=true a non-ISR replica can be elected, losing any records the old leader had committed but the new leader hadn't replicated.
Frequent leader changes (LeaderElectionRateAndTimeMs spiking) usually point to network instability, GC pauses on brokers, or under-provisioned disks. They're cheap when stable, expensive when constant.
Common Mistakes with Kafka Partitions
- Over-partitioning small topics. 1,000 partitions for a topic doing 100 msg/s adds replication overhead, fetcher threads, and metadata cost for no throughput gain.
- Forgetting that adding partitions breaks key ordering. Records with the same key produced before and after a partition increase land on different partitions. Downstream consumers that depended on per-key order are silently broken.
- Setting
replication.factor=1in production. A single broker failure permanently loses the partition's data. Use 3. - Skipping
min.insync.replicas.acks=allalone doesn't guarantee durability - it only waits for the current ISR, which can be 1 if followers fell behind. Setmin.insync.replicas=2for replication factor 3. - Enabling unclean leader election to "improve availability". It trades silent data loss for uptime - rarely the right call.
- Letting one partition dominate. Hot keys (one customer ID accounts for 80% of writes) make one partition the bottleneck regardless of how many you have. Look for partition skew in monitoring, then redesign the key.
Monitoring Kafka Partitions
Metrics that matter for partition health:
UnderReplicatedPartitions- should be 0 in steady state. Any nonzero value means a broker is behind or unavailable.OfflinePartitionsCount- partitions with no live leader. Any nonzero value is a serving outage.IsrShrinksPerSec/IsrExpandsPerSec- oscillation indicates lagging followers, often due to slow disks or network.- Partition skew (bytes-in per partition) - flags hot keys and bad partitioning.
LeaderElectionRateAndTimeMs- frequent re-elections signal cluster instability.- Log end offset per partition - growth rate is the producer's throughput; flat means producers stopped.
Pulse provides AI-powered monitoring for Kafka, Elasticsearch, OpenSearch, and ClickHouse. For partitions specifically, it correlates under-replication, ISR shrinks, leader churn, and partition skew into one signal with root-cause context - instead of three separate dashboards full of raw counters.
Frequently Asked Questions
Q: What is the difference between a Kafka topic and a partition?
A: A topic is the logical channel producers write to and consumers subscribe to. A partition is the physical, ordered, append-only log on disk that stores a slice of the topic's records. One topic always has one or more partitions; partitions are how Kafka achieves parallelism and replication.
Q: How many partitions should a Kafka topic have?
A: Enough to support your peak consumer concurrency and your peak throughput target. Most production topics start with 6-12 partitions; high-throughput topics may use 100+. Add only as needed - each partition costs file handles, replication bandwidth, and metadata overhead.
Q: Can I increase the number of Kafka partitions for an existing topic?
A: Yes, with kafka-topics.sh --alter --topic <name> --partitions <new-count>. The catch: the partitioner hashes keys modulo partition count, so records with the same key produced before and after the change land on different partitions. Per-key ordering is broken across the change point.
Q: Can I decrease the number of partitions in Kafka?
A: No. Decreasing partitions is not supported. The only path is to create a new topic with the desired partition count, copy data using MirrorMaker 2 or Kafka Connect, switch producers and consumers, then delete the old topic.
Q: How does Kafka guarantee message ordering within a partition?
A: Each record in a partition is assigned a unique, gap-free, monotonically increasing offset by the leader at append time. Consumers fetch records in offset order. Ordering is only guaranteed within a single partition - across partitions, Kafka makes no claim about message order.
Q: What happens when a Kafka partition leader fails?
A: The controller detects the failure (via lost heartbeat / ZooKeeper or KRaft metadata change) and elects a new leader from the partition's in-sync replicas. Producers and consumers refresh metadata and reconnect to the new leader. The window is typically seconds; data committed to the ISR before the failure is preserved.
Q: How does partitioning affect consumer groups?
A: Each partition is assigned to exactly one consumer within a group at a time. If a group has more consumers than partitions, the excess sit idle. If it has fewer consumers than partitions, some consumers handle multiple partitions. This is why the partition count is the upper bound on a group's parallel processing.
Related Reading
- Kafka Topic: the logical container around a set of partitions
- Kafka Broker: the server that hosts partition leaders and followers
- Kafka Producer: how records are routed to partitions
- Kafka Consumer: how partitions are divided among a consumer group
- Consumer Offset: how consumers track per-partition position
- Kafka Commit Log: the on-disk storage backing each partition
- Kafka Controller: the broker that manages partition leader election
- Apache Kafka Glossary: all Kafka terms in one place