Logstash Split Filter Plugin

The Logstash split filter takes one event and emits multiple events: one per element of an array field, or one per substring when splitting a string by a delimiter. The original event is cancelled once its children have been emitted. Every emitted event is a clone of the parent, so it inherits all other fields, and the split field (or target, if set) holds a single element. Use it after a json filter to turn {"items":["a","b","c"]} into three separate events, one per item.

Syntax

filter {
  split {
    field      => "items"
    target     => "item"
    terminator => "\n"
    add_tag    => [ "split_event" ]
  }
}

When field is an array, the filter emits one event per element. When field is a string, it is split on terminator (a newline by default) and one event is emitted per substring. The mode is selected by the type of the field at runtime, so terminator has no effect on arrays; the Syntax block above lists every option together only for reference.

Parameters

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| field | string | no | message | Source field to split. Can be an array or a string. |
| target | string | no | same as field | Destination field for each individual element. When splitting an array, the array field is overwritten with the single element unless target is set. |
| terminator | string | no | "\n" | Delimiter for string mode. Ignored when field is an array. |
| add_field | hash | no | {} | Fields to add to each split event. |
| add_tag | array | no | [] | Tags to add to each split event. |
| remove_field | array | no | [] | Fields to remove from each split event. |

Examples

Explode a JSON array into one event per element. The json filter populates the events array, and split emits one event per item:

filter {
  json {
    source => "message"
  }
  split {
    field => "[events]"
  }
}

If the input is {"events":[{"id":1},{"id":2}]}, two events flow out: one with events => {"id":1} and one with events => {"id":2}. Both retain @timestamp, host, and tags from the parent.

Split a multiline string into one event per line:

filter {
  split {
    field      => "raw_text"
    terminator => "\n"
  }
}

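Use target to write each element to a different field instead of overwriting the source field:

filter {
  split {
    field  => "[order][line_items]"
    target => "[order][line_item]"
  }
}

After this, each emitted event has order.line_item set to one element. In current plugin versions the source field is not removed when target is set, so every child still carries the full order.line_items array. Drop it with remove_field if it is not needed downstream.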
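If the source array should not travel with every child event, it can be dropped explicitly. The sketch below reuses the [order][line_items] field from the example above and the remove_field option listed under Parameters. It assumes that remove_field, like other common options, is applied to each emitted event after the element has been written to target.

```
filter {
  split {
    field        => "[order][line_items]"
    target       => "[order][line_item]"
    # Applied to every child event: drops the full source array,
    # leaving only the single element stored in target.
    remove_field => [ "[order][line_items]" ]
  }
}
```

Stating the removal explicitly keeps the behavior the same whether or not a given plugin version leaves the source array in place.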

Common Issues

The split filter is non-trivially expensive: one input event produces N output events, each of which is a deep copy. A 1000-element array produces 1000 events that all flow through the rest of the pipeline. If your downstream output is Elasticsearch, this means 1000 indexing operations per source event. Cap input size upstream or split selectively with a conditional.
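One way to split selectively is to measure the array first and skip oversized payloads. This is a minimal sketch, not a recommended limit: the items field, the threshold of 500, and the split_skipped_oversized tag are all hypothetical names chosen for illustration.

```
filter {
  ruby {
    # Tag events whose array exceeds a hypothetical cap of 500 elements.
    code => '
      items = event.get("items")
      if items.is_a?(Array) && items.length > 500
        event.tag("split_skipped_oversized")
      end
    '
  }

  # Only fan out events that stayed under the cap.
  if "split_skipped_oversized" not in [tags] {
    split {
      field => "items"
    }
  }
}
```

Tagged events continue through the pipeline unsplit, so they can be routed to a separate output or index for inspection instead of being dropped silently.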

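When the source field is missing or is neither an array nor a string, current plugin versions do not split the event: it passes through with its fields unchanged, a warning is logged, and the event is tagged _split_type_failure. A string that does not contain the terminator passes through untouched. An empty array is different: it yields no child events while the parent is still cancelled, so the event disappears from the pipeline. Check your config with a small sample before deploying.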
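A conditional makes the missing-field case explicit instead of relying on the filter's own handling. The sketch below assumes the [events] field from the earlier example; the split_field_missing tag is a hypothetical name.

```
filter {
  if [events] {
    split {
      field => "[events]"
    }
  } else {
    # Mark events that arrived without the field so they can be found later.
    mutate {
      add_tag => [ "split_field_missing" ]
    }
  }
}
```

The condition only checks that the field exists and is not null or false; it does not verify that the value is an array or a string.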

All split events share the same @timestamp, which is fine for batch documents but loses ordering precision when timestamps need to be unique. To preserve order, follow split with a ruby filter that adds a microsecond offset per event.
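An alternative to adjusting @timestamp is to record each element's position before splitting, which avoids shared counters and does not depend on timestamp precision. This is a different technique from the timestamp offset described above, shown as a sketch under assumptions: the items field and the value and seq keys are hypothetical names.

```
filter {
  ruby {
    # Wrap every element with its index in the original array.
    code => '
      items = event.get("items")
      if items.is_a?(Array)
        wrapped = items.each_with_index.map { |v, i| { "value" => v, "seq" => i } }
        event.set("items", wrapped)
      end
    '
  }

  # Each child now carries [items][value] and [items][seq].
  split {
    field => "items"
  }
}
```

Downstream consumers can sort on @timestamp first and [items][seq] second to recover the original order.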

Metadata fields ([@metadata]) are inherited by every split event. If downstream filters use metadata to route events, the routing applies to every split child. Clear or modify metadata before splitting if that is a problem.
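Clearing a metadata field before the split might look like the following sketch. The [@metadata][route] field name is hypothetical; substitute whichever metadata key your routing logic reads.

```
filter {
  # Remove the routing hint so that split children do not inherit it.
  mutate {
    remove_field => [ "[@metadata][route]" ]
  }

  split {
    field => "items"
  }
}
```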

Performance Notes

Split is one of the few Logstash filters that can multiply throughput requirements. A pipeline that handles 1000 input events/sec and splits each into 10 children produces 10,000 events/sec for downstream filters and outputs. The persistent queue sits between inputs and filters, so it stores pre-split events, but it backs up whenever the post-split stages cannot keep pace. Size your queue, batch settings, and Elasticsearch bulk request size for the post-split rate, not the input rate.

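For very large arrays (10,000+ elements per event), consider whether per-element events are needed at all. Elasticsearch ingest pipelines have a foreach processor that applies a processor to each element of an array inside a single document; it does not create one document per element. If per-element processing within one document is enough, moving that work to an ingest pipeline avoids the event explosion in Logstash and keeps its throughput stable.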

Monitoring Logstash Split Pipelines with Pulse

Pulse is the only tool built specifically for monitoring and optimizing Logstash pipelines. The split filter is the easiest way to accidentally 100x downstream load - a config change that splits a previously-flat field can fill the persistent queue and saturate Elasticsearch in minutes. Pulse tracks the input-vs-output event ratio per pipeline, alerts on sudden multiplier changes, and correlates the spike with the upstream payload size so you can see exactly what changed and roll back.

Frequently Asked Questions

Q: How does the Logstash split filter handle multi-character delimiters?
A: The terminator parameter accepts any literal string, including multi-character delimiters. The split filter does not support regex patterns. For regex-based splitting, first run a ruby filter that splits the string with event.get("field").split(/regex/) and stores the resulting array with event.set, then run split on that array field.
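A minimal sketch of that two-step approach, assuming a string field named raw_text and a hypothetical intermediate field raw_parts. The pattern shown splits on semicolons or pipes with optional surrounding whitespace and is only an illustration.

```
filter {
  ruby {
    # Split the string with a regex and store the pieces as an array.
    code => '
      raw = event.get("raw_text")
      event.set("raw_parts", raw.split(/\s*[;|]\s*/)) if raw.is_a?(String)
    '
  }

  # Only fan out when the ruby filter actually produced the array.
  if [raw_parts] {
    split {
      field => "raw_parts"
    }
  }
}
```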

Q: Does the Logstash split filter preserve the original event timestamp?
A: Yes - all emitted child events inherit @timestamp from the parent. If you need unique per-child timestamps, set them in a ruby filter after split.

Q: Can the Logstash split filter create empty events?
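A: Not for null or empty-string elements. In current plugin versions these are skipped and produce no child event. If every element is skipped, or the array itself is empty, no children are emitted and the parent is still cancelled, so the event is dropped. Other "empty" values, such as an empty object or a whitespace-only string, are not skipped and do produce an event; remove those beforehand, for example in a ruby filter.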

Q: What is the difference between the split filter and a line-splitting codec?
A: The standard Logstash codec for this job is line rather than split. The line codec runs at input time and splits a byte stream into events by newline (or a configured delimiter). The split filter runs per event and produces multiple output events from one input. Use the codec for line-oriented inputs (file, tcp) and the filter for structured payloads (JSON arrays).

Q: How do I handle nested JSON arrays with the Logstash split filter?
A: Run split twice: first on the outer array, then on the inner array. Each split is a separate filter block. Be aware that two levels of split multiply downstream load: 10 outer x 10 inner = 100 events per source event.
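As a sketch, assuming a hypothetical payload shaped like {"orders":[{"line_items":[...]}, ...]}:

```
filter {
  # First pass: one event per order. [orders] now holds a single object.
  split {
    field => "[orders]"
  }

  # Second pass: one event per line item within that order.
  split {
    field => "[orders][line_items]"
  }
}
```

After the first split the outer field holds one object rather than an array, which is why the second split can address the inner array as a nested field.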

Q: Does the Logstash split filter affect downstream throughput?
A: Yes, substantially. The post-split event rate is the rate the rest of the pipeline must handle. Size your pipeline.batch.size, queue.max_bytes, and Elasticsearch bulk settings against the post-split rate.