The Logstash split filter takes one event and emits multiple events: one per element of an array field, or one per substring when splitting a string by a delimiter. The original event is consumed. Every emitted event inherits all other fields from the parent, then has the split field replaced with a single element. Use it after a json filter to turn {"items":[a,b,c]} into three separate events, one per item.
Syntax
filter {
  split {
    field => "items"
    target => "item"
    terminator => "\n"
    add_tag => [ "split_event" ]
  }
}
When field is an array, the filter emits one event per element and terminator is ignored. When field is a string, it is split on terminator (a newline by default) and one event is emitted per substring. The two modes are mutually exclusive in practice - pick one based on the input shape.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| field | string | no | message | Source field to split. Can be an array or a string. |
| target | string | no | same as field | Destination field for each individual element. When splitting an array, the array field is overwritten with the single element unless target is set. |
| terminator | string | no | \n | Delimiter for string mode. Ignored when field is an array. |
| add_field | hash | no | {} | Fields to add to each split event. |
| add_tag | array | no | [] | Tags to add to each split event. |
| remove_field | array | no | [] | Fields to remove from each split event. |
Examples
Explode a JSON array into one event per element. The json filter populates the events array, and split emits one event per item:
filter {
  json {
    source => "message"
  }
  split {
    field => "[events]"
  }
}
If the input is {"events":[{"id":1},{"id":2}]}, two events flow out: one with events => {"id":1} and one with events => {"id":2}. Both retain @timestamp, host, and tags from the parent.
Split a multiline string into one event per line:
filter {
  split {
    field => "raw_text"
    terminator => "\n"
  }
}
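Use target to write each element to a different field. On its own, target leaves the full original array on every child event, so pair it with remove_field to free the original field:
filter {
  split {
    field => "[order][line_items]"
    target => "[order][line_item]"
    # Without this, every child still carries the whole line_items array.
    remove_field => [ "[order][line_items]" ]
  }
}
After this, each emitted event has order.line_item set to one element, and order.line_items removed.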
Common Issues
The split filter is non-trivially expensive: one input event produces N output events, each of which is a deep copy. A 1000-element array produces 1000 events that all flow through the rest of the pipeline. If your downstream output is Elasticsearch, this means 1000 indexing operations per source event. Cap input size upstream or split selectively with a conditional.
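One way to split selectively is to flag oversized arrays first and skip them in a conditional. The following is a minimal sketch, not a recommended limit: the events field, the 500-element threshold, and the split_oversized tag are all assumptions chosen for illustration.

```logstash
filter {
  # Flag events whose array exceeds a hypothetical limit of 500 elements.
  ruby {
    code => '
      items = event.get("[events]")
      event.tag("split_oversized") if items.is_a?(Array) && items.length > 500
    '
  }

  # Split only the events that were not flagged as oversized.
  if "split_oversized" not in [tags] {
    split {
      field => "[events]"
    }
  }
}
```

Flagged events continue through the pipeline unsplit, so they can be routed to a separate output or index for inspection.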
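When the source field is missing or is neither an array nor a string, current plugin versions do not split: they log a warning, add the _split_type_failure tag, and pass the original event through. An empty array or empty string produces no children, and because the parent is cancelled, nothing is emitted for that event. Check your config with a small sample before deploying.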
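A conditional can make the missing-field case explicit rather than leaving it to the filter's own handling. This is a sketch under assumptions: the events field and the split_skipped tag are hypothetical names.

```logstash
filter {
  if [events] {
    split {
      field => "[events]"
    }
  } else {
    # Mark events that had nothing to split so they are easy to find later.
    mutate {
      add_tag => [ "split_skipped" ]
    }
  }
}
```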
All split events share the same @timestamp, which is fine for batch documents but loses ordering precision when timestamps need to be unique. To preserve order, follow split with a ruby filter that adds a microsecond offset per event.
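As an alternative to offsetting timestamps, the element order can be recorded before the split, which avoids keeping any state across events. The sketch below swaps in that technique: a ruby filter wraps each element with its index, so every child carries its original position. The events field and the position and value keys are assumptions.

```logstash
filter {
  ruby {
    code => '
      items = event.get("[events]")
      if items.is_a?(Array)
        # Wrap each element with its index in the original array.
        wrapped = items.each_with_index.map { |item, i| { "position" => i, "value" => item } }
        event.set("[events]", wrapped)
      end
    '
  }
  split {
    field => "[events]"
  }
}
```

After the split, each child has [events][position] and [events][value], and downstream consumers can sort on the position within a shared @timestamp.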
Metadata fields ([@metadata]) are inherited by every split event. If downstream filters use metadata to route events, the routing applies to every split child. Clear or modify metadata before splitting if that is a problem.
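If a routing key should not apply unchanged to every child, it can be rewritten once before the split. A minimal sketch, assuming a hypothetical [@metadata][route] key set earlier in the pipeline and a hypothetical split_children value:

```logstash
filter {
  # Overwrite the routing key once on the parent; every child inherits the new value.
  mutate {
    replace => { "[@metadata][route]" => "split_children" }
  }
  split {
    field => "[events]"
  }
}
```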
Performance Notes
Split is one of the few Logstash filters that can multiply throughput requirements. A pipeline that handles 1000 input events/sec and splits each into 10 children produces 10,000 events/sec for downstream filters and outputs - including the persistent queue. Size your queue, batch settings, and Elasticsearch bulk request size for the post-split rate, not the input rate.
For very large arrays (10,000+ elements per event), consider doing the split server-side: Elasticsearch ingest pipelines have a foreach processor that iterates over an array without exploding to per-element documents in Logstash. This keeps Logstash throughput stable.
Monitoring Logstash Split Pipelines with Pulse
Pulse is the only tool built specifically for monitoring and optimizing Logstash pipelines. The split filter is the easiest way to accidentally 100x downstream load - a config change that splits a previously-flat field can fill the persistent queue and saturate Elasticsearch in minutes. Pulse tracks the input-vs-output event ratio per pipeline, alerts on sudden multiplier changes, and correlates the spike with the upstream payload size so you can see exactly what changed and roll back.
Frequently Asked Questions
Q: How does the Logstash split filter handle multi-character delimiters?
A: The terminator parameter accepts any string, including multi-character delimiters. For regex-based splitting, the split filter does not support patterns - run a ruby filter that does event.get("field").split(/regex/) first, then split on the resulting array.
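The ruby-then-split approach might look like the following sketch. The raw_text and parts field names and the pattern (one or more ; or | characters) are assumptions for illustration.

```logstash
filter {
  ruby {
    code => '
      raw = event.get("raw_text")
      # Split on a regex and store the result as an array.
      event.set("parts", raw.split(/[;|]+/)) if raw.is_a?(String)
    '
  }
  # Only split when the ruby filter actually produced the array.
  if [parts] {
    split {
      field => "parts"
    }
  }
}
```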
Q: Does the Logstash split filter preserve the original event timestamp?
A: Yes - all emitted child events inherit @timestamp from the parent. If you need unique per-child timestamps, set them in a ruby filter after split.
Q: Can the Logstash split filter create empty events?
A: In current plugin versions, no: null elements and empty strings are skipped and produce no event. Other blank-looking elements, such as whitespace-only strings or empty objects, still produce an event; filter them out beforehand with a conditional, a mutate strip, or a ruby filter that cleans the array.
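To make the handling of empty elements explicit regardless of plugin version, the array can be cleaned before the split. A minimal sketch, assuming the array lives in a hypothetical events field; it drops null, empty, and whitespace-only string elements:

```logstash
filter {
  ruby {
    code => '
      items = event.get("[events]")
      if items.is_a?(Array)
        # Keep everything except nulls and blank strings.
        kept = items.reject { |v| v.nil? || (v.is_a?(String) && v.strip.empty?) }
        event.set("[events]", kept)
      end
    '
  }
  split {
    field => "[events]"
  }
}
```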
Q: What is the difference between the split filter and the split codec?
A: Logstash does not ship a codec named split; the closest equivalent is the line codec, which runs at input time and splits a stream into events on a delimiter (newline by default). The split filter runs per event and produces multiple output events from one input. Use the line codec for line-oriented inputs (file, tcp) and the filter for structured payloads (JSON arrays).
Q: How do I handle nested JSON arrays with the Logstash split filter?
A: Run split twice: first on the outer array, then on the inner array. Each split is a separate filter block. Be aware that two levels of split multiply downstream load: 10 outer x 10 inner = 100 events per source event.
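A two-level split can be sketched as below, assuming a hypothetical payload shaped like {"orders":[{"items":[...]}]}:

```logstash
filter {
  # First pass: one event per order.
  split {
    field => "[orders]"
  }
  # Second pass: [orders] now holds a single order, so split its inner array.
  split {
    field => "[orders][items]"
  }
}
```

Each final event holds one order under [orders] with a single element under [orders][items].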
Q: Does the Logstash split filter affect downstream throughput?
A: Yes, substantially. The post-split event rate is the rate the rest of the pipeline must handle. Size your pipeline.batch.size, queue.max_bytes, and Elasticsearch bulk settings against the post-split rate.
Related Reading
- Logstash JSON Filter Plugin: typical upstream filter that produces the array to split.
- Logstash Aggregate Filter Plugin: the inverse - combine multiple events into one.
- Logstash Clone Filter Plugin: duplicate whole events within a pipeline without splitting an array.
- Logstash Persistent Queue is Full: split can fill the queue if oversized.
- Logstash Pipeline is Blocked Error: split multiplies downstream load.