Logstash Error: Persistent Queue is Full - Common Causes & Fixes

The Logstash persistent queue is full error means the on-disk persistent queue has reached queue.max_bytes and cannot accept more events. Logstash then applies back-pressure to its inputs: inputs that can signal their clients (beats, http) push the rejection upstream, while others simply block. The root cause is always the same: the filters and outputs are draining events more slowly than the inputs deliver them, so the buffer between them fills up. The fix is to speed up the draining side, slow down the input, or grow the queue.

What This Error Means

When queue.type: persisted is set, Logstash writes every event to a sequence of fixed-size on-disk pages under path.queue/<pipeline-id>/. Inputs append to the tail of the queue. The pipeline workers read from its head and acknowledge events once they have been output successfully; a page is released when every event in it has been acknowledged.

The queue is bounded by queue.max_bytes (default 1024mb). When unacked queued bytes hit that limit, the queue refuses writes. Inputs that support back-pressure (the Beats input, for example) propagate the block upstream so producers slow down or buffer locally. Inputs that do not support back-pressure either block their own threads or drop events depending on the plugin.

The queue is not the bug. It is doing its job - protecting the pipeline from data loss during transient output slowdowns. A persistently-full queue is a signal that the steady-state rate mismatch is real, not transient.
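One way to tell a transient burst from a steady-state mismatch is to sample queue_size_in_bytes twice and extrapolate. The following is a minimal sketch, not part of Logstash: the function name and the sample values are hypothetical, and it assumes the growth rate between the two samples stays constant.

```python
def seconds_until_full(size_then, size_now, interval_s, max_bytes):
    """Estimate seconds until the queue reaches max_bytes.

    Returns None when the queue is flat or shrinking, i.e. the
    pipeline is keeping up over the sampled interval.
    """
    growth = (size_now - size_then) / interval_s  # bytes per second
    if growth <= 0:
        return None
    return (max_bytes - size_now) / growth


MIB = 1024 ** 2
# Hypothetical samples taken 60 s apart against the default 1024mb cap.
eta = seconds_until_full(400 * MIB, 460 * MIB, 60, 1024 * MIB)
print(eta)  # 564.0 seconds at a steady 1 MiB/s of growth
```

A result that stays small across repeated samples points to a real rate mismatch; a result of None shortly after a spike suggests the queue absorbed a burst as intended.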

Common Causes

  1. Output destination is slow or rejecting writes. Confirm by tailing Logstash logs for output errors (Elasticsearch 429s, Kafka producer timeouts, S3 throttling).
  2. Insufficient queue size for the burst profile. Confirm via GET /_node/stats/pipelines - if the queue refills within seconds of being drained, it is undersized.
  3. CPU-bound filter chain (usually grok). Confirm by checking pipeline events.duration_in_millis per filter and looking for a hotspot.
  4. Pipeline workers blocked on external lookups (elasticsearch filter, jdbc_streaming, dns). Confirm by inspecting filter latency per stage.
  5. Disk I/O bottleneck on the queue directory. Confirm with iostat -xz 1 - sustained high %util on the queue's device indicates the disk is the limit.
  6. queue.checkpoint.writes set very low, causing excessive fsync. Confirm in logstash.yml.
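Causes 3 and 4 can be narrowed down by ranking plugins by average time per event. The sketch below assumes the plugins section of a single pipeline from _node/stats/pipelines, where each filter and output entry carries an events object with in and duration_in_millis. The function name and sample numbers are hypothetical, and the counters are cumulative since startup, so compare two snapshots if you need a recent rate.

```python
def slowest_plugins(pipeline_stats):
    """Return (section, name, id, ms_per_event) tuples, slowest first."""
    rows = []
    plugins = pipeline_stats.get("plugins", {})
    for section in ("filters", "outputs"):
        for plugin in plugins.get(section, []):
            events = plugin.get("events", {})
            handled = events.get("in", 0)
            if handled == 0:
                continue  # plugin has not processed anything yet
            cost = events.get("duration_in_millis", 0) / handled
            rows.append((section, plugin.get("name"), plugin.get("id"), cost))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical excerpt of one pipeline's stats.
sample = {
    "plugins": {
        "filters": [
            {"id": "f1", "name": "grok",
             "events": {"in": 1000, "out": 1000, "duration_in_millis": 5000}},
            {"id": "f2", "name": "mutate",
             "events": {"in": 1000, "out": 1000, "duration_in_millis": 100}},
        ],
        "outputs": [
            {"id": "o1", "name": "elasticsearch",
             "events": {"in": 1000, "out": 1000, "duration_in_millis": 2000}},
        ],
    }
}

for section, name, plugin_id, cost in slowest_plugins(sample):
    print(f"{section:8} {name:15} {plugin_id:4} {cost:.2f} ms/event")
```

In this made-up sample the grok filter dominates at 5 ms per event, which would point at cause 3 rather than at the output.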

How to Fix the Logstash Persistent Queue is Full Error

  1. Identify which end of the pipeline is the bottleneck:

    curl -s http://localhost:9600/_node/stats/pipelines | jq '.pipelines'
    

    Look at queue.events, queue.queue_size_in_bytes, and per-plugin events.duration_in_millis. A growing queue plus a high duration_in_millis on the output plugin means the output is the bottleneck.

  2. Stabilize the output side first. Speeding up filters when the output is the bottleneck just fills the queue faster.

    • Elasticsearch output: increase pipeline.batch.size, raise the queue size of the destination cluster's write thread pool (named bulk in older Elasticsearch versions), scale data nodes.
    • Kafka output: tune linger_ms, batch_size, and broker capacity.
    • S3 output: increase parallel uploads (upload_workers_count), or write to a bucket in a region closer to the Logstash host.
  3. Grow the queue if bursts are the issue, not steady-state mismatch:

    # logstash.yml or pipelines.yml per-pipeline
    queue.type: persisted
    queue.max_bytes: 8gb
    queue.page_capacity: 64mb
    

    The queue.page_capacity setting (default 64mb) controls page file size; larger pages reduce fsync overhead but lengthen recovery time on crash. The queue.max_bytes setting is the hard cap.

  4. Add pipeline workers if CPU is the constraint:

    pipeline.workers: 16
    pipeline.batch.size: 500
    

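    Workers are JVM threads, and the default is one per CPU core. Going beyond the core count helps only when workers spend much of their time waiting on I/O; with CPU-bound filters it rarely helps and can degrade performance under contention.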

  5. Add a second Logstash instance and split inputs if a single host has hit its CPU or I/O ceiling. Persistent queues are per-instance; horizontal scaling adds queue capacity proportionally.

  6. Drain the existing queue by temporarily routing the input to a fallback or accepting back-pressure upstream. Restarting Logstash does not drain the queue - it resumes from the last checkpoint.
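For step 3, a rough value for queue.max_bytes can be derived from the burst you need to absorb. This is a back-of-the-envelope sketch under stated assumptions: the function name and all rates are hypothetical, the average event size is measured as stored on disk, and the drain rate is constant during the burst.

```python
def queue_bytes_for_burst(peak_eps, drain_eps, burst_seconds, avg_event_bytes):
    """Bytes of backlog that accumulate during one burst.

    peak_eps:        input rate during the burst (events/second)
    drain_eps:       rate the filters and outputs sustain (events/second)
    burst_seconds:   how long the burst lasts
    avg_event_bytes: average size of one queued event
    """
    surplus_eps = max(peak_eps - drain_eps, 0)
    return surplus_eps * burst_seconds * avg_event_bytes


# Hypothetical workload: a 10-minute burst at 20k events/s against a
# pipeline that drains 12k events/s, with 1 KiB events.
needed = queue_bytes_for_burst(20_000, 12_000, 600, 1024)
print(needed)                      # 4915200000
print(round(needed / 1024**3, 2))  # 4.58 (GiB)
```

Under these assumptions a setting such as 8gb leaves headroom above the estimated 4.58 GiB. If the surplus is positive all the time rather than only during bursts, no queue size is enough and the output or filter work has to come first.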

Resolve Logstash Persistent Queue Full Errors Automatically with Pulse

Pulse is the only monitoring and optimization platform built specifically for Logstash. When the on-disk persistent queue reaches queue.max_bytes and back-pressure starts blocking inputs, Pulse:

  • Tracks queue depth (queue.queue_size_in_bytes), fill rate, page rotation cadence, per-filter events.duration_in_millis, output ack latency, and disk %util on the queue path in real time
  • Correlates pipeline state with downstream destinations (Elasticsearch bulk thread pool, Kafka producer backpressure, S3 throttling) to identify whether the bottleneck is upstream input bursts, in-pipe filter cost, or downstream output saturation
  • Surfaces the exact remediation: raise queue.max_bytes, retune pipeline.workers and pipeline.batch.size, switch the queue volume to NVMe, scale the destination cluster, or split inputs across a second Logstash instance
  • Generates one-click configuration changes and systemd restart actions when applicable, and alerts above 70% fill before the queue refuses writes

Sizing guardrails ship alongside: absorb 10 minutes of peak input, alert on output 429s as a leading indicator, and cap source-side input rate (Filebeat harvester_limit, Kafka max.poll.records) to smooth bursts. No other observability tool understands Logstash internals at this depth.

Start a free trial.

Frequently Asked Questions

Q: How do I check the current size of my Logstash persistent queue?
A: Query the Logstash monitoring API: curl http://localhost:9600/_node/stats/pipelines | jq '.pipelines.main.queue'. The response contains events (event count), queue_size_in_bytes (current size), max_queue_size_in_bytes (configured limit), and type. Compute fill percentage from those values.
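The fill percentage mentioned above can be computed as in this minimal sketch. It takes the queue object from the API response as an already-parsed dictionary; the function name and the sample values are hypothetical.

```python
def queue_fill_percent(queue_stats):
    """Return the queue fill level as a percentage, or None when the
    size fields are absent (for example with a memory queue)."""
    size = queue_stats.get("queue_size_in_bytes")
    limit = queue_stats.get("max_queue_size_in_bytes")
    if size is None or not limit:
        return None
    return 100.0 * size / limit


# Hypothetical queue section: 768 MiB used out of a 1024 MiB cap.
sample_queue = {
    "type": "persisted",
    "events": 1200,
    "queue_size_in_bytes": 805306368,
    "max_queue_size_in_bytes": 1073741824,
}
print(queue_fill_percent(sample_queue))  # 75.0
```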

Q: Can I change Logstash queue.max_bytes without restarting?
A: No. Queue configuration changes require a Logstash restart. Restart drains nothing - on startup the queue resumes at its current size with the new ceiling applied to new writes.

Q: What happens to incoming events when the Logstash persistent queue is full?
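A: It depends on the input plugin. Beats input back-pressures the upstream Beat, which then buffers locally and retries. The HTTP input rejects requests with a 429 response while the queue is blocked, and the TCP input blocks its threads. Inputs that cannot hold senders back, such as UDP, may drop events. Producers without their own buffering can therefore lose data once the queue is full, which is why the queue has to be sized and drained properly.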

Q: Should I use a memory queue or persistent queue in Logstash?
A: Memory queue is faster but volatile - a Logstash crash loses everything in the queue. Persistent queue survives restarts and crashes at the cost of disk I/O (typically 10-30% throughput penalty on fast disks). Use persistent for any production workload that cannot afford gaps.

Q: How do I prevent data loss when the Logstash persistent queue is full?
A: Three layers: 1) Use input plugins that back-pressure (Beats, Kafka consumer). 2) Make upstream producers durable (Filebeat with registry, Kafka topics with retention). 3) Size queue.max_bytes to absorb the worst-case burst duration. Together these layers prevent loss as long as upstream durability outlives the slowdown.

Q: Does increasing pipeline.workers always help when the queue is full?
A: Only if the bottleneck is CPU-bound filters and the host has spare cores. If the output is the bottleneck, more workers just deliver events to the output faster only to hit the same wall. Diagnose the bottleneck first.

Q: What's the best tool to debug Logstash persistent queue fill and backpressure?
A: Pulse is the only monitoring platform built specifically for Logstash. It correlates queue.queue_size_in_bytes, per-filter latency, output ack rate, and disk I/O into a single root-cause attribution - "Elasticsearch output is the bottleneck because the destination cluster's bulk queue is full" - rather than leaving you to stitch together _node/stats/pipelines, iostat, and JVM metrics manually.
