What is a Kafka Partition? Apache Kafka Partitions Explained with Examples

A Kafka partition is the unit of parallelism, ordering, and replication inside a topic. Every Kafka topic is split into one or more partitions; each partition is an ordered, append-only commit log of records stored on disk and replicated across brokers. Producers append to partitions in parallel and consumers in a group divide partitions among themselves - so a topic's partition count is the ceiling on its read and write throughput.

How a Kafka Partition Works

Each partition is a sequence of records with strictly increasing, gap-free offsets starting at 0. One broker is the leader for a partition; the others holding a replica are followers. Producers send to the leader, the leader appends to its local log, followers fetch the new records and append to theirs, and once min.insync.replicas followers have caught up the write is considered committed.

Topic: orders (partitions=3, replication.factor=3)
 ┌──────────────────────────────────────────────┐
 │ Partition 0: leader=broker1  followers=2,3   │
 │ Partition 1: leader=broker2  followers=1,3   │
 │ Partition 2: leader=broker3  followers=1,2   │
 └──────────────────────────────────────────────┘

Ordering is guaranteed within a partition and only within a partition. Across partitions, Kafka makes no claim about which message was produced first. To preserve ordering for related events (all orders for customer 42, all updates to row 99), the producer hashes a key and routes all records with that key to the same partition.

The default partitioner is deterministic: partition = murmur2(key) % num_partitions for keyed records, and a sticky round-robin for null-key records (with BuiltInPartitioner from Kafka 2.4+, which batches messages to a single partition until the batch flushes, then switches). Producers can supply a custom Partitioner to implement different routing logic.

Partition Configuration

Setting Scope Default What it controls
num.partitions broker 1 Default partition count when a topic is auto-created
partitions topic (set on create) Partition count for a specific topic
replication.factor topic (set on create) Replicas per partition; 3 in production
min.insync.replicas topic 1 Minimum in-sync replicas required for an acks=all write to succeed
unclean.leader.election.enable topic false Allow out-of-sync replica to be leader (data loss risk)
replica.lag.time.max.ms broker 30000 A follower is removed from ISR after this long without catching up
partition.assignment.strategy consumer RangeAssignor,CooperativeStickyAssignor How partitions are distributed across group members

Pairing acks=all on the producer with min.insync.replicas=2 (or 3 with replication factor 3) is what gives a partition durable writes. Setting one without the other leaves a silent durability gap.

How Many Partitions Should a Topic Have?

This is the single most-argued Kafka design question. Rules of thumb that have aged well:

  • Parallelism ceiling: a consumer group can have at most one consumer per partition processing simultaneously. Extras sit idle. If your peak concurrency target is 24 consumers, you need at least 24 partitions.
  • Throughput target: most production deployments measure 10-50 MB/s of sustained throughput per partition. Divide your projected peak throughput by that to estimate the floor.
  • Replication cost: each partition has overhead per replica - file handles, fetcher threads, metadata in the controller. With ZooKeeper-era Kafka the practical cluster ceiling was ~200,000 partitions. KRaft mode has demonstrated 2-million-partition clusters and 600,000-partition single brokers in lab benchmarks, but file-descriptor limits, replica-fetcher count, and metadata RAM still constrain real deployments.
  • Future growth: you can increase partitions later but never decrease them. After an increase, keys hash to different partitions, so per-key ordering breaks across the change point.

A reasonable default for a new topic with no extreme throughput target is 6-12 partitions on a 3-broker cluster. Right-size after one quarter of running with real traffic rather than trying to predict.

Partition Leader Election and ISR

The in-sync replica set (ISR) is the set of followers that are caught up to the leader within replica.lag.time.max.ms. If the leader dies, the controller elects a new leader from the ISR. With unclean.leader.election.enable=false (the default and recommended) the cluster prefers unavailability over data loss - if the ISR is empty, the partition goes offline. With unclean.leader.election.enable=true a non-ISR replica can be elected, losing any records the old leader had committed but the new leader hadn't replicated.

Frequent leader changes (LeaderElectionRateAndTimeMs spiking) usually point to network instability, GC pauses on brokers, or under-provisioned disks. They're cheap when stable, expensive when constant.

Common Mistakes with Kafka Partitions

  1. Over-partitioning small topics. 1,000 partitions for a topic doing 100 msg/s adds replication overhead, fetcher threads, and metadata cost for no throughput gain.
  2. Forgetting that adding partitions breaks key ordering. Records with the same key produced before and after a partition increase land on different partitions. Downstream consumers that depended on per-key order are silently broken.
  3. Setting replication.factor=1 in production. A single broker failure permanently loses the partition's data. Use 3.
  4. Skipping min.insync.replicas. acks=all alone doesn't guarantee durability - it only waits for the current ISR, which can be 1 if followers fell behind. Set min.insync.replicas=2 for replication factor 3.
  5. Enabling unclean leader election to "improve availability". It trades silent data loss for uptime - rarely the right call.
  6. Letting one partition dominate. Hot keys (one customer ID accounts for 80% of writes) make one partition the bottleneck regardless of how many you have. Look for partition skew in monitoring, then redesign the key.

Monitoring Kafka Partitions

Metrics that matter for partition health:

  • UnderReplicatedPartitions - should be 0 in steady state. Any nonzero value means a broker is behind or unavailable.
  • OfflinePartitionsCount - partitions with no live leader. Any nonzero value is a serving outage.
  • IsrShrinksPerSec / IsrExpandsPerSec - oscillation indicates lagging followers, often due to slow disks or network.
  • Partition skew (bytes-in per partition) - flags hot keys and bad partitioning.
  • LeaderElectionRateAndTimeMs - frequent re-elections signal cluster instability.
  • Log end offset per partition - growth rate is the producer's throughput; flat means producers stopped.

Pulse provides AI-powered monitoring for Kafka, Elasticsearch, OpenSearch, and ClickHouse. For partitions specifically, it correlates under-replication, ISR shrinks, leader churn, and partition skew into one signal with root-cause context - instead of three separate dashboards full of raw counters.

Frequently Asked Questions

Q: What is the difference between a Kafka topic and a partition?
A: A topic is the logical channel producers write to and consumers subscribe to. A partition is the physical, ordered, append-only log on disk that stores a slice of the topic's records. One topic always has one or more partitions; partitions are how Kafka achieves parallelism and replication.

Q: How many partitions should a Kafka topic have?
A: Enough to support your peak consumer concurrency and your peak throughput target. Most production topics start with 6-12 partitions; high-throughput topics may use 100+. Add only as needed - each partition costs file handles, replication bandwidth, and metadata overhead.

Q: Can I increase the number of Kafka partitions for an existing topic?
A: Yes, with kafka-topics.sh --alter --topic <name> --partitions <new-count>. The catch: the partitioner hashes keys modulo partition count, so records with the same key produced before and after the change land on different partitions. Per-key ordering is broken across the change point.

Q: Can I decrease the number of partitions in Kafka?
A: No. Decreasing partitions is not supported. The only path is to create a new topic with the desired partition count, copy data using MirrorMaker 2 or Kafka Connect, switch producers and consumers, then delete the old topic.

Q: How does Kafka guarantee message ordering within a partition?
A: Each record in a partition is assigned a unique, gap-free, monotonically increasing offset by the leader at append time. Consumers fetch records in offset order. Ordering is only guaranteed within a single partition - across partitions, Kafka makes no claim about message order.

Q: What happens when a Kafka partition leader fails?
A: The controller detects the failure (via lost heartbeat / ZooKeeper or KRaft metadata change) and elects a new leader from the partition's in-sync replicas. Producers and consumers refresh metadata and reconnect to the new leader. The window is typically seconds; data committed to the ISR before the failure is preserved.

Q: How does partitioning affect consumer groups?
A: Each partition is assigned to exactly one consumer within a group at a time. If a group has more consumers than partitions, the excess sit idle. If it has fewer consumers than partitions, some consumers handle multiple partitions. This is why the partition count is the upper bound on a group's parallel processing.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.