A Kafka consumer offset is the integer position a consumer group has read up to within a single partition. Every message in a partition has a unique, monotonically increasing offset; consumers commit their position (the offset of the next message to read, i.e. the last successfully processed offset plus one) back to Kafka. On restart or rebalance, the consumer resumes from the committed offset, which is how Kafka supports at-least-once or exactly-once processing without per-message acknowledgements.
How Kafka Consumer Offsets Work
Kafka stores committed offsets in an internal compacted topic called __consumer_offsets. It has 50 partitions by default (controlled by the broker-side offsets.topic.num.partitions), replication factor 3 by default (offsets.topic.replication.factor), and cleanup.policy=compact so only the latest offset per (group, topic, partition) is retained. A specific consumer group's offsets all hash to the same __consumer_offsets partition, whose leader broker becomes the group coordinator for that group.
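The group-to-partition mapping can be sketched in a few lines. This is an illustration only: it assumes the broker takes the absolute value of the Java String.hashCode() of the group id modulo offsets.topic.num.partitions, and the function names java_string_hashcode and coordinator_partition are made up for this example.

```python
def java_string_hashcode(s: str) -> int:
    """Replicate Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    # Java hashes UTF-16 code units, so encode the string the same way.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - 0x100000000 if h >= 0x80000000 else h


def coordinator_partition(group_id: str, num_partitions: int = 50) -> int:
    """Assumed mapping: abs(hash(group.id)) mod partition count."""
    h = java_string_hashcode(group_id)
    # abs(Integer.MIN_VALUE) overflows in Java, so that case is mapped to 0.
    non_negative = 0 if h == -0x80000000 else abs(h)
    return non_negative % num_partitions


p = coordinator_partition("orders-processor")
print(f"orders-processor -> __consumer_offsets-{p}")
```

Because the mapping depends only on the group id and the partition count, every member of a group resolves to the same coordinator, and changing offsets.topic.num.partitions on a live cluster would move groups to different partitions.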
When a consumer calls commitSync() or commitAsync(), the client sends an OffsetCommit request to the group coordinator, which appends a record to __consumer_offsets keyed by (group, topic, partition) with the offset as the value. When the consumer restarts or a new member joins the group, the coordinator returns the latest committed offsets via OffsetFetch and members resume from there.
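The commit and fetch path can be pictured with a small in-memory model. This is a toy stand-in for one __consumer_offsets partition, not broker code; the class OffsetsTopicModel and its method names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, int]  # (group, topic, partition)


class OffsetsTopicModel:
    """Toy model of one __consumer_offsets partition (illustrative only)."""

    def __init__(self) -> None:
        # Append-only list of (key, committed offset) records.
        self.log: List[Tuple[Key, int]] = []

    def offset_commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        # Each commit appends a new record; older records are not modified.
        self.log.append(((group, topic, partition), offset))

    def compact(self) -> None:
        # Compaction keeps only the newest value for each key.
        latest: Dict[Key, int] = {}
        for key, offset in self.log:
            latest[key] = offset
        self.log = list(latest.items())

    def offset_fetch(self, group: str, topic: str, partition: int) -> Optional[int]:
        # The last record written for a key wins; None means "no committed offset".
        result: Optional[int] = None
        for key, offset in self.log:
            if key == (group, topic, partition):
                result = offset
        return result


store = OffsetsTopicModel()
store.offset_commit("orders-processor", "orders", 3, 100)
store.offset_commit("orders-processor", "orders", 3, 250)
store.offset_commit("orders-processor", "orders", 4, 90)
store.compact()
print(store.offset_fetch("orders-processor", "orders", 3))  # 250
print(len(store.log))                                       # 2
```

The fetch result is the same before and after compaction; compaction only bounds how much history the topic keeps.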
# Inspect committed offsets for a group
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group orders-processor
# Reset offsets to the earliest available record
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--group orders-processor --reset-offsets --to-earliest \
--topic orders --execute
Offsets are committed per partition. A group consuming a 12-partition topic with a committed offset of 1,000,000 in each partition effectively has 12 independent positions. Because the committed offset is the position of the next message to read, the next message a consumer assigned to partition 3 reads is offset 1,000,000 in partition 3 - no other partition is affected.
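As a worked example of per-partition positions, the sketch below derives the offset to commit for each partition from the records already processed. It follows the convention that a committed offset names the next record to read; the helper offsets_to_commit is hypothetical and assumes records within a partition were processed in order without gaps.

```python
from typing import Dict, Iterable, Tuple


def offsets_to_commit(processed: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Map each partition to the offset to commit: highest processed offset + 1."""
    commit: Dict[int, int] = {}
    for partition, offset in processed:
        commit[partition] = max(commit.get(partition, 0), offset + 1)
    return commit


# (partition, offset) pairs for records that finished processing.
processed = [(3, 999_998), (3, 999_999), (7, 41)]
print(offsets_to_commit(processed))  # {3: 1000000, 7: 42}
```

Partition 3 and partition 7 advance independently; finishing offset 999,999 in partition 3 says nothing about partition 7.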
Kafka Offset Commit Configuration
The settings that control commit behavior:
| Setting | Default | What it does |
|---|---|---|
| enable.auto.commit | true | Periodically auto-commits the latest offset returned by poll() |
| auto.commit.interval.ms | 5000 | Interval between auto-commits when enabled |
| auto.offset.reset | latest | Where to start when there is no committed offset: latest, earliest, or none |
| isolation.level | read_uncommitted | read_committed returns only records from committed transactions; aborted transactional records are skipped |
| offsets.retention.minutes | 10080 (7 days) | Broker-side: how long committed offsets are retained for inactive groups |
| session.timeout.ms | 45000 | Group coordinator removes a member after this long without heartbeats |
| max.poll.interval.ms | 300000 | Consumer is removed from the group if the gap between poll() calls exceeds this |
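enable.auto.commit=true and auto.commit.interval.ms=5000 mean the consumer commits the offsets returned by earlier poll() calls roughly every 5 seconds. In the Java client the commit is issued from inside a later poll() call (and on close), not from a separate timer. The risk: if the poll loop hands records to another thread or processes them asynchronously, a later poll() can commit offsets for records that are still in flight - giving you "committed but not actually processed" semantics, and those records are lost on a crash. When processing finishes before the next poll(), the failure mode is the opposite: a crash replays whatever was processed since the last commit. The safer pattern for at-least-once is enable.auto.commit=false with a manual commitSync() after processing succeeds.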
auto.offset.reset
When a group has no committed offset for a partition (new group, or offset retention has expired), auto.offset.reset decides:
- latest (default): start at the log end offset - new messages only, anything older is skipped.
- earliest: start at the log start offset - replay everything still on disk.
- none: throw NoOffsetForPartitionException and let the application decide.
Picking earliest on a high-volume topic for a fresh group means the consumer reads everything in retention before catching up. Picking latest on an existing group whose offsets expired silently drops messages. Both are common production incidents.
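The decision can be written down as a small function. This is a sketch of the rule described above, not client code: starting_offset is a hypothetical helper and NoOffsetForPartition is a local stand-in for the client's NoOffsetForPartitionException.

```python
from typing import Optional


class NoOffsetForPartition(Exception):
    """Local stand-in for the client's NoOffsetForPartitionException."""


def starting_offset(committed: Optional[int], log_start: int, log_end: int,
                    auto_offset_reset: str = "latest") -> int:
    """Return the offset a consumer starts reading from for one partition."""
    if committed is not None:
        # A committed offset always wins; the reset policy is not consulted.
        return committed
    if auto_offset_reset == "latest":
        return log_end      # only records produced from now on
    if auto_offset_reset == "earliest":
        return log_start    # everything still retained in the partition
    if auto_offset_reset == "none":
        raise NoOffsetForPartition("no committed offset and no reset policy")
    raise ValueError(f"unknown auto.offset.reset value: {auto_offset_reset}")


# A partition retaining offsets 100..4999, read by a group with no commits.
print(starting_offset(None, 100, 5000, "latest"))    # 5000
print(starting_offset(None, 100, 5000, "earliest"))  # 100
print(starting_offset(1234, 100, 5000, "latest"))    # 1234
```

With latest, the 4,900 retained records are never delivered to the group, which is the silent-skip case described above.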
Commit Strategies and Delivery Semantics
Three patterns dominate:
- Auto-commit (at-most or at-least-once, fuzzy). enable.auto.commit=true. Simple, but the timing of commits doesn't align with processing, which can produce both duplicates and losses depending on where the consumer crashes.
- Manual sync commit after processing (at-least-once). enable.auto.commit=false. Process the batch, then consumer.commitSync(). On crash, the last uncommitted batch is reprocessed - the consumer must be idempotent. This is the usual choice for most production pipelines.
- Transactional / exactly-once. Producer is configured with enable.idempotence=true and transactional.id, and the consumer reads with isolation.level=read_committed. The processing app writes results and commits offsets in the same Kafka transaction. Used by Kafka Streams; Kafka Connect offers comparable exactly-once support for source connectors.
commitSync() is blocking and retries automatically. commitAsync() is fire-and-forget and faster, but failed commits are not retried. A common production pattern is commitAsync() in the hot path and commitSync() on shutdown to flush the final position.
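The at-least-once pattern (process the batch, then commit) can be simulated without a broker. The sketch below uses FakePartitionConsumer, a hypothetical in-memory stand-in for a consumer on one partition; it is not the Kafka client API, and its commit_sync only mimics the effect of commitSync().

```python
from typing import Callable, List, Tuple


class FakePartitionConsumer:
    """In-memory stand-in for a consumer on one partition (not a Kafka client)."""

    def __init__(self, records: List[str], committed: int = 0) -> None:
        self.records = records
        self.committed = committed   # offset a restarted consumer resumes from
        self.position = committed    # next offset poll() will return

    def poll(self, max_records: int) -> List[Tuple[int, str]]:
        end = min(self.position + max_records, len(self.records))
        batch = [(o, self.records[o]) for o in range(self.position, end)]
        self.position = end
        return batch

    def commit_sync(self) -> None:
        # Commit the position: the offset of the next record to read.
        self.committed = self.position


def run(consumer: FakePartitionConsumer,
        process: Callable[[int, str], None], max_records: int = 2) -> None:
    while True:
        batch = consumer.poll(max_records)
        if not batch:
            return
        for offset, value in batch:
            process(offset, value)
        consumer.commit_sync()   # only reached if the whole batch succeeded


records = ["a", "b", "c", "d", "e"]
seen: List[int] = []
crash_once = {"armed": True}


def process(offset: int, value: str) -> None:
    if offset == 3 and crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash")
    seen.append(offset)


consumer = FakePartitionConsumer(records)
try:
    run(consumer, process)
except RuntimeError:
    # Restart: a new consumer resumes from the last committed offset.
    consumer = FakePartitionConsumer(records, committed=consumer.committed)
    run(consumer, process)

print(seen)  # [0, 1, 2, 2, 3, 4]
```

Offset 2 is processed twice and nothing is skipped, which is why the processing step needs to be idempotent under this pattern.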
Common Mistakes with Consumer Offsets
- Trusting enable.auto.commit with slow or asynchronous processing. If a poll() returns 500 records that are handed to a worker taking 100 ms each, the auto-commit issued by a later poll() once the 5-second interval has elapsed can cover all 500 offsets with only about 50 records actually done. The other 450 are "committed" without being processed - on crash they're lost.
- Letting offsets expire. offsets.retention.minutes defaults to 7 days. A consumer group that goes idle for 8 days loses its committed offsets entirely; the next start follows auto.offset.reset. Long-idle groups need a heartbeat-only consumer or a longer retention.
- Resetting offsets on a running consumer. kafka-consumer-groups.sh --reset-offsets fails if the group has active members. Always stop the consumers first, and pass --dry-run to see what would change before running with --execute.
- Committing offset N after processing message N. The committed offset is the offset of the next message to read, not the last one processed, so the value to commit after message N is N+1. Off-by-one mistakes in custom offset management lead to either skipping a record or replaying it on every restart.
- Treating offsets as global. Offsets are per-partition. Comparing offset 1000 in partition 0 to offset 1000 in partition 1 tells you nothing about which message was produced first.
Monitoring Kafka Consumer Offsets
What to watch:
- Consumer lag = log_end_offset - committed_offset. See the consumer lag page for details.
- Commit failure rate. Frequent OffsetCommit errors indicate coordinator instability or a network partition.
- __consumer_offsets topic size. Should stay small because of compaction. Runaway growth means the log cleaner is failing.
- Offset reset events. A consumer hitting auto.offset.reset=latest because its commits expired is a silent data-loss event; alert on it.
- Group rebalance rate. Each rebalance interrupts processing and can produce duplicate records on the new owner.
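The lag formula is applied per partition and then summed for a group-level figure. A minimal sketch with made-up numbers; the helper consumer_lag is hypothetical, and it only reports partitions that have a committed offset.

```python
from typing import Dict


def consumer_lag(log_end: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus committed offset, floored at 0."""
    lag: Dict[int, int] = {}
    for partition, offset in committed.items():
        lag[partition] = max(log_end[partition] - offset, 0)
    return lag


# Illustrative values for a three-partition topic.
log_end = {0: 1500, 1: 1000, 2: 2000}
committed = {0: 1500, 1: 940, 2: 1200}

lag = consumer_lag(log_end, committed)
print(lag)                # {0: 0, 1: 60, 2: 800}
print(sum(lag.values()))  # 860
```

The per-partition view matters: a total of 860 hides the fact that almost all of the backlog sits in partition 2.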
Pulse monitors Kafka end-to-end including offset commit health, __consumer_offsets topic state, group coordinator failures, offset expirations, and lag patterns - with AI-driven root cause analysis instead of raw metrics. Pulse also covers Elasticsearch, OpenSearch, and ClickHouse for teams running mixed streaming and search stacks.
Frequently Asked Questions
Q: Where does Kafka store consumer offsets?
A: In the internal compacted topic __consumer_offsets, which has 50 partitions and replication factor 3 by default. The group coordinator (the broker that hosts the leader for the relevant __consumer_offsets partition) handles all commit and fetch requests for a given group.
Q: How often should I commit offsets in Kafka?
A: Frequently enough that a crash doesn't force you to reprocess a huge backlog, but not so often that commit traffic dominates throughput. With enable.auto.commit=true, every 5 seconds is the default. With manual commits, committing per batch (after every poll() once processing succeeds) is the common production choice.
Q: What happens if a consumer crashes before committing its offset?
A: On restart, the consumer fetches the last committed offset for each partition from the group coordinator and resumes from there. Any records processed but not yet committed will be re-delivered - which is why consumers must be idempotent for at-least-once semantics.
Q: Can I manually reset a Kafka consumer offset?
A: Yes. Use kafka-consumer-groups.sh --reset-offsets with --to-earliest, --to-latest, --to-offset N, --to-datetime ISO_TIMESTAMP, or --shift-by N. The group must have no active members. Programmatically, consumer.seek(partition, offset) works while the consumer is running.
Q: What is the difference between auto.commit.enable and manual offset commit?
A: auto.commit.enable (the older name for enable.auto.commit) commits the latest offset returned by poll() periodically, independent of whether your processing has succeeded. Manual commit (commitSync() or commitAsync()) lets you commit only after processing finishes - the standard pattern for at-least-once delivery. If you need a firm delivery guarantee, manual commits are the usual way to tie the commit to processing success.
Q: What happens when a consumer's committed offset expires?
A: After offsets.retention.minutes of group inactivity (default 7 days), the broker removes the group's offsets. On the next start, the group is treated as new and follows auto.offset.reset. With latest, it skips everything that arrived during the outage; with earliest, it replays from the start of retention.
Q: How does Kafka achieve exactly-once semantics with offsets?
A: The producer is enabled with enable.idempotence=true and a transactional.id, and the application's output writes and offset commits run inside a single Kafka transaction (producer.sendOffsetsToTransaction()). Consumers downstream set isolation.level=read_committed to skip aborted records. This is what Kafka Streams uses internally with processing.guarantee=exactly_once_v2.
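The all-or-nothing property can be illustrated with a toy model. This is not how the broker implements transactions (it omits transaction markers, fencing, and read_committed filtering); ToyTransaction and its methods are hypothetical names used only to show that output records and offsets become visible together or not at all.

```python
from typing import Dict, List


class ToyTransaction:
    """Buffers output records and offsets, then applies both or neither."""

    def __init__(self, output_log: List[str], committed_offsets: Dict[int, int]) -> None:
        self.output_log = output_log
        self.committed_offsets = committed_offsets
        self._pending_records: List[str] = []
        self._pending_offsets: Dict[int, int] = {}

    def send(self, record: str) -> None:
        self._pending_records.append(record)

    def send_offsets(self, offsets: Dict[int, int]) -> None:
        self._pending_offsets.update(offsets)

    def commit(self) -> None:
        # Results and consumed positions become visible in one step.
        self.output_log.extend(self._pending_records)
        self.committed_offsets.update(self._pending_offsets)
        self._clear()

    def abort(self) -> None:
        # Neither the results nor the offsets take effect.
        self._clear()

    def _clear(self) -> None:
        self._pending_records = []
        self._pending_offsets = {}


output_log: List[str] = []
committed_offsets: Dict[int, int] = {}

txn = ToyTransaction(output_log, committed_offsets)
txn.send("order-42-enriched")
txn.send_offsets({0: 5})
txn.abort()                       # simulated failure mid-batch
print(output_log, committed_offsets)   # [] {}

txn = ToyTransaction(output_log, committed_offsets)
txn.send("order-42-enriched")
txn.send_offsets({0: 5})
txn.commit()                      # retry succeeds
print(output_log, committed_offsets)   # ['order-42-enriched'] {0: 5}
```

Because the aborted attempt left the committed offset untouched, the retry reads the same input again and the output appears exactly once.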
Related Reading
- Kafka Consumer: the client that commits offsets
- Kafka Consumer Lag: the metric derived from committed offsets vs log end
- Kafka Topic: where offsets are scoped
- Kafka Partition: the unit each offset applies to
- Kafka Commit Log: the storage that offsets index into
- Kafka Broker: hosts the group coordinator
- Apache Kafka Glossary: all Kafka terms in one place