The Logstash Kafka Leader not available (Java exception LeaderNotAvailableException) error is raised by the Kafka client embedded in the logstash-input-kafka and logstash-output-kafka plugins. It means the Kafka cluster reports that one or more partitions of a topic currently have no leader broker. The condition is usually transient - it appears during broker restart, controller failover, or partition reassignment - and the Kafka client retries automatically. Persistent occurrences indicate a deeper cluster health problem: under-replicated partitions, broker network isolation, or auto.create.topics.enable=false with a topic that does not yet exist.
What This Error Means
Every Kafka partition has exactly one leader broker at any time. Producers send and consumers fetch from that leader. When the leader is unavailable - the broker is restarting, the controller is electing a new leader, or all in-sync replicas are offline - the cluster temporarily reports LeaderNotAvailableException to clients. The Kafka client library inside the Logstash plugin retries these errors transparently, governed by retries, retry_backoff_ms, and request_timeout_ms.
The error surfaces to Logstash logs only when retries are exhausted, which means the condition has persisted longer than the configured retry window. By default, the producer retries for ~120 seconds; the consumer retries for request_timeout_ms (default 30 s) per fetch.
Common Causes
- Broker restart in progress (planned or crash). Confirm with kafka-broker-api-versions.sh --bootstrap-server <host> - brokers missing from the output are unreachable.
- Controller election in flight. Confirm via kafka-metadata-quorum.sh --bootstrap-server <host> describe --status (KRaft) or the ZooKeeper /controller znode. Election usually completes in seconds.
- Topic does not exist and auto.create.topics.enable=false. Confirm with kafka-topics.sh --list. The Kafka producer hides the "topic does not exist" condition behind the same LeaderNotAvailableException for security reasons.
- All in-sync replicas (ISR) for a partition are offline. Confirm with kafka-topics.sh --describe --topic <name> - partitions with an empty Isr cannot elect a leader unless unclean.leader.election.enable is true.
- Partition reassignment in progress. Confirm with kafka-reassign-partitions.sh --verify.
- Network partition between Logstash and one broker while bootstrap.servers still resolves another. The client may receive stale metadata pointing at the unreachable broker. Confirm with tcptraceroute from the Logstash host to each broker.
- bootstrap.servers misconfigured. Confirm the addresses match listeners / advertised.listeners on the brokers.
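The per-broker reachability checks in the list above can be scripted from the Logstash host. The following is a minimal sketch, not a definitive tool: the function names (split_bootstrap, probe_broker, check_brokers) are hypothetical, and it assumes bash and the coreutils timeout command are installed. It only confirms that a TCP port accepts connections; it does not validate the Kafka protocol, TLS, or SASL.

```shell
# Split a bootstrap_servers string ("host1:9092,host2:9092") into "host port" lines.
split_bootstrap() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    # ${entry%:*} drops the trailing ":port"; ${entry##*:} keeps only the port.
    printf '%s %s\n' "${entry%:*}" "${entry##*:}"
  done
}

# Attempt a plain TCP connect through bash's /dev/tcp pseudo-device.
# Assumption: bash and timeout(1) exist on the Logstash host.
probe_broker() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 reachable"
  else
    echo "$1:$2 UNREACHABLE"
  fi
}

# Probe every broker named in a bootstrap_servers string.
check_brokers() {
  split_bootstrap "$1" | while read -r host port; do
    probe_broker "$host" "$port"
  done
}

# Usage (hostnames are the examples used in this article):
#   check_brokers "broker1:9092,broker2:9092,broker3:9092"
```

Run the same check against the addresses each broker publishes in advertised.listeners, since those are the addresses the client connects to after the initial bootstrap.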
How to Fix the Logstash Kafka Leader Not Available Error
1. Check Kafka cluster health from the Logstash host:

       kafka-broker-api-versions.sh --bootstrap-server broker1:9092,broker2:9092
       kafka-topics.sh --bootstrap-server broker1:9092 --describe --topic events

   Look for missing brokers, empty Isr, and Leader: -1 (no leader).

2. If the error is transient (broker restart, election), wait 30-60 seconds. Kafka's metadata propagation and Logstash's retry logic handle this case without intervention.

3. If the topic does not exist, create it explicitly rather than relying on auto-creation:

       kafka-topics.sh --bootstrap-server broker1:9092 --create \
         --topic events --partitions 12 --replication-factor 3

4. If partitions have no leader and ISR is empty, the partition is unrecoverable without intervention. Either bring the offline brokers back (preferred) or enable unclean.leader.election.enable=true on the topic to allow an out-of-sync replica to become leader, accepting potential data loss.

5. Tune Logstash retry settings for transient blips:

       input {
         kafka {
           bootstrap_servers => "broker1:9092,broker2:9092,broker3:9092"
           topics => [ "events" ]
           retry_backoff_ms => 1000
           reconnect_backoff_ms => 1000
           request_timeout_ms => 60000
           session_timeout_ms => 30000
         }
       }
       output {
         kafka {
           bootstrap_servers => "broker1:9092,broker2:9092,broker3:9092"
           topic_id => "processed"
           retries => 10
           retry_backoff_ms => 1000
           request_timeout_ms => 60000
           acks => "all"
         }
       }

6. Verify advertised.listeners on every broker resolves correctly from the Logstash host. The most insidious cause of persistent leader-not-available errors is an advertised hostname that only resolves inside the cluster network.

7. Restart Logstash only if the Kafka client metadata cache is stuck after the cluster has clearly recovered. This is rare; the client refreshes metadata every metadata.max.age.ms (default 5 minutes).
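The kafka-topics.sh --describe check above can be reduced to just the problem partitions with a small filter. This is a sketch under stated assumptions: the function name find_leaderless_partitions is hypothetical, and it assumes the tab-separated per-partition lines that kafka-topics.sh normally prints ("Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..."). It treats both Leader: -1 and Leader: none as "no leader", because the exact marker may differ between Kafka versions.

```shell
# Read `kafka-topics.sh --describe` output on stdin and print only partitions
# that have no leader or an empty ISR.
# Assumption: fields are tab-separated, as in typical kafka-topics.sh output.
find_leaderless_partitions() {
  awk -F'\t' '
    /Partition:/ {
      topic = ""; part = ""; leader = ""; isr = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^Topic: /)     topic  = substr($i, 8)
        if ($i ~ /^Partition: /) part   = substr($i, 12)
        if ($i ~ /^Leader: /)    leader = substr($i, 9)
        if ($i ~ /^Isr:/)        { isr = $i; sub(/^Isr: */, "", isr) }
      }
      # Flag the partition when the leader marker or the ISR list signals trouble.
      if (leader == "-1" || leader == "none" || isr == "")
        printf "%s-%s leader=%s isr=%s\n", topic, part, leader, (isr == "" ? "<empty>" : isr)
    }'
}

# Usage (broker and topic names are the examples used in this article):
#   kafka-topics.sh --bootstrap-server broker1:9092 --describe --topic events \
#     | find_leaderless_partitions
```

An empty result means every partition of the topic currently reports a leader and a non-empty ISR, which points the investigation away from the cluster and toward the network path or advertised.listeners.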
Diagnose Logstash Kafka Leader Not Available Errors Automatically with Pulse
Pulse is the only monitoring and optimization platform built specifically for Logstash. When the Kafka input or output plugin starts raising LeaderNotAvailableException and retries are not clearing it, Pulse:
- Tracks Logstash Kafka client signals (retry rate, metadata refresh, consumer lag, producer in-flight requests) alongside broker-side state (under-replicated partitions, ISR shrinkage, controller elections, leader skew)
- Correlates pipeline-side errors with the cluster condition that caused them - "topic events partition 4 has empty ISR because broker 3 has been offline for 12 minutes" rather than the raw stack trace
- Surfaces the exact remediation: bring the offline broker back, trigger preferred-leader election, fix advertised.listeners resolution from the Logstash host, create the missing topic, or adjust retries / retry_backoff_ms / request_timeout_ms in the plugin config
- Runs the diagnostic chain automatically - confirm topic exists, confirm leader, confirm Logstash-to-broker reachability, confirm advertised listener - and generates one-click fixes when applicable
Preventive guardrails ship alongside: replication.factor=3 with min.insync.replicas=2, explicit topic IaC, URP > 0 alerting, and rolling broker restarts with preferred-leader rebalancing in between. No other observability tool understands Logstash internals at this depth.
Frequently Asked Questions
Q: How long does the Logstash Kafka client retry on Leader Not Available errors?
A: For producers, the retry window is roughly retries × retry_backoff_ms plus the per-request request_timeout_ms, capped by delivery.timeout.ms (default 120 s). For consumers, fetch requests retry with retry_backoff_ms between attempts, and metadata refreshes every metadata.max.age.ms (default 5 minutes). The error reaches the Logstash log only once retries are exhausted.
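As a rough worked example of that producer estimate, the sketch below applies the formula from the answer above (retries × retry_backoff_ms + request_timeout_ms, capped at delivery.timeout.ms). The function name retry_window_ms is hypothetical, and the result is an approximation for sizing the retry window, not the Kafka client's exact timing.

```shell
# Estimate the producer retry window in milliseconds.
# Args: retries, retry_backoff_ms, request_timeout_ms, [delivery_timeout_ms]
# Assumption: the cap defaults to 120000 ms, the delivery.timeout.ms default cited above.
retry_window_ms() (
  retries=$1
  backoff_ms=$2
  request_timeout_ms=$3
  delivery_timeout_ms=${4:-120000}
  window=$(( retries * backoff_ms + request_timeout_ms ))
  # The window can never exceed the overall delivery timeout.
  if [ "$window" -gt "$delivery_timeout_ms" ]; then
    window=$delivery_timeout_ms
  fi
  echo "$window"
)

# With the output settings from the example config in this article
# (retries => 10, retry_backoff_ms => 1000, request_timeout_ms => 60000):
retry_window_ms 10 1000 60000    # prints 70000
```

If the estimated window is shorter than a typical broker restart or controller election in your cluster, raise retries or retry_backoff_ms rather than restarting Logstash.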
Q: Can the Logstash Kafka error occur even when all brokers are running?
A: Yes. Common causes: the topic does not exist and auto-create is disabled; advertised.listeners points to an address not reachable from Logstash; a partition has empty ISR even though brokers are up; controller election is in progress.
Q: How do I prevent data loss during Kafka Leader Not Available errors in Logstash?
A: For the input plugin, Kafka itself stores the offset, so messages are not lost - Logstash resumes from the last committed offset when the leader returns. For the output plugin, set acks => "all" and retries => 10 (or higher); once a fixed retries count is exhausted the plugin drops the event, so leave retries unset (the plugin then retries until the send succeeds) if no loss is acceptable. Enable Logstash's persistent queue so that events waiting ahead of a stalled output survive a Logstash restart, and pair it with Kafka durability for end-to-end safety.
Q: Why does the Logstash Kafka error appear during broker restarts?
A: Each broker hosts leaders for ~1/N of the cluster's partitions. When it restarts, those partitions briefly have no leader until the controller assigns a replica as the new leader. The error window is typically seconds; Logstash's default retry covers it.
Q: Does increasing Kafka replication factor prevent Leader Not Available errors?
A: It reduces the window. With replication.factor=3 and min.insync.replicas=2, the cluster can survive one broker failure without losing leader availability. Replication factor 1 means any broker restart causes leader-not-available errors for that broker's partitions.
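The arithmetic behind that answer can be sketched as follows. With acks=all, writes to a partition keep succeeding while at least min.insync.replicas replicas remain in sync, so the number of replica failures a partition tolerates for writes is the replication factor minus min.insync.replicas. The function name tolerated_replica_failures is hypothetical.

```shell
# Number of replicas a partition can lose while acks=all writes still succeed.
# Args: replication_factor, min_insync_replicas
tolerated_replica_failures() (
  rf=$1
  min_isr=$2
  if [ "$min_isr" -gt "$rf" ]; then
    echo "min.insync.replicas ($min_isr) exceeds replication factor ($rf)" >&2
    exit 1
  fi
  echo $(( rf - min_isr ))
)

tolerated_replica_failures 3 2   # prints 1
tolerated_replica_failures 1 1   # prints 0
```

The second call illustrates the replication factor 1 case from the answer: with no spare replica, any restart of the hosting broker leaves the partition without a leader.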
Q: Should I restart Logstash to clear a persistent Kafka Leader Not Available error?
A: Almost never. The Kafka client refreshes metadata every 5 minutes by default, so stale metadata clears on its own. If the underlying Kafka issue is fixed and the error persists for longer, restart is a last resort - first verify the cluster is healthy with kafka-topics.sh --describe.
Q: What's the best tool to monitor Logstash Kafka input and output health?
A: Pulse is the only monitoring platform built specifically for Logstash. It correlates Logstash-side Kafka retry rates, consumer lag, and producer in-flight state with broker-side ISR, controller election, and under-replicated-partition metrics, so LeaderNotAvailableException lands as a single attributed root cause instead of a raw stack trace on the Logstash host.
Related Reading
- Logstash Pipeline is Blocked Error: downstream symptom when the Kafka output stalls.
- Logstash Persistent Queue is Full: related buffering behavior during output outages.
- Logstash Could Not Resolve Host: DNS issues that look like leader-not-available errors.
- Logstash Connection Reset by Peer: TCP-layer Kafka connectivity failures.
- Logstash JSON Filter Plugin: typical filter for Kafka-sourced events.