The Logstash Elasticsearch filter performs a per-event query against an Elasticsearch index and copies fields from the matched document into the current event. It is the standard way to enrich streaming events with reference data already stored in Elasticsearch - user profiles, IP allowlists, asset inventories. The filter is read-only; use the elasticsearch output plugin if you need to write. Each event triggers an HTTP request, and the plugin does not cache results, so repeated lookups pay the full round trip every time. That makes it the slowest enrichment option in Logstash, best suited to low-volume pipelines or data too large to hold in memory.
Syntax
filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "user_profiles"
    query => "user_id:%{[user][id]}"
    fields => { "department" => "[user][department]" }
    user => "logstash_lookup"
    password => "${ES_LOOKUP_PASSWORD}"
    ssl => true
    result_size => 1
    tag_on_failure => [ "_elasticsearch_lookup_failure" ]
  }
}
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| hosts | array | yes | none | Elasticsearch hosts to query. |
| index | string | no | "" (all) | Index pattern to search. |
| query | string | no | none | Lucene query string with %{field} substitutions. Mutually exclusive with query_template. |
| query_template | string | no | none | Path to a JSON template with full DSL query. |
| fields | hash | no | {} | Map of source_field => "target_field" to copy from the matched doc. |
| sort | string | no | "@timestamp:desc" | Sort applied when more than one doc matches. |
| result_size | number | no | 1 | Number of results to fetch per query. |
| enable_sort | boolean | no | true | Disable sort to allow _doc ordering, faster on large indices. |
| user / password | string | no | none | Basic auth credentials. |
| api_key | string | no | none | API key alternative to user/password. |
| ssl | boolean | no | false | Use HTTPS. |
| ca_file | string | no | none | CA cert for TLS verification. |
| tag_on_failure | array | no | ["_elasticsearch_lookup_failure"] | Tags on connection error or zero results. |
Examples
Enrich application events with the user's department, looked up from a profile index:
filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "user_profiles"
    query => "user_id:%{[user][id]}"
    fields => {
      "department" => "[user][department]"
      "manager_id" => "[user][manager_id]"
    }
  }
}
Use a DSL query template for richer matching (term is faster than the Lucene parser for exact lookups):
/etc/logstash/templates/asset_lookup.json:
{
  "size": 1,
  "query": {
    "term": { "hostname.keyword": "%{[host][name]}" }
  }
}

filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "asset_inventory"
    query_template => "/etc/logstash/templates/asset_lookup.json"
    fields => { "owner_team" => "[asset][owner_team]" }
  }
}
Look up the most recent successful login for an event and tag if none found:
filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "auth-logs-*"
    query => "user_id:%{[user][id]} AND status:success"
    sort => "@timestamp:desc"
    fields => { "@timestamp" => "[user][last_login]" }
    tag_on_failure => [ "_no_prior_login" ]
  }
}
Common Issues
Every event triggers a synchronous HTTP request, which makes this filter the slowest enrichment in Logstash. A pipeline with 5,000 events/sec and a 10 ms lookup latency needs 50 worker threads just to keep up - and most pipelines have 4-8 workers per host. Profile carefully before deploying this filter at scale.
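The worker estimate is an application of Little's law. The worked form below is a sketch that assumes each worker handles one lookup at a time and that lookup latency is constant; the symbols are introduced here for illustration only. The second expression reverses the calculation for a host with 8 workers.

```latex
\[
N_{\text{workers}} \;\ge\; \lambda \cdot t_{\text{lookup}}
  = 5000\ \tfrac{\text{events}}{\text{s}} \times 0.010\ \text{s} = 50,
\qquad
\lambda_{\max} = \frac{N_{\text{workers}}}{t_{\text{lookup}}}
  = \frac{8}{0.010\ \text{s}} = 800\ \tfrac{\text{events}}{\text{s}}.
\]
```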
The filter does not cache results across events. If you are looking up a small, slow-changing reference dataset (under a few thousand entries), pre-load it into memory with the translate filter instead. Translate is 100-1000x faster.
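A minimal sketch of the translate alternative, assuming the reference data has been exported to a dictionary file. The file path, the fallback value, and the refresh interval are hypothetical, and the source/target option names assume a recent translate plugin version (older releases used field/destination).

```logstash
filter {
  translate {
    # Field whose value is used as the dictionary key
    source => "[user][id]"
    # Field that receives the looked-up value
    target => "[user][department]"
    # Hypothetical export of the reference data (YAML, JSON, or CSV)
    dictionary_path => "/etc/logstash/lookups/user_departments.yml"
    # Re-read the file every 300 seconds to pick up changes
    refresh_interval => 300
    # Value written when the key is not in the dictionary
    fallback => "unknown"
  }
}
```

The dictionary is held in memory, so each lookup avoids the network entirely; the trade-off is that updates only appear after the next refresh.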
When the query returns multiple documents, only the first is used (sorted by @timestamp:desc by default). For deterministic enrichment, narrow the query with term filters and set result_size => 1.
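A network blip or Elasticsearch slowdown causes lookup failures, which add the _elasticsearch_lookup_failure tag and let the event pass through unenriched, so downstream consumers see partial data. Route tagged events to a separate destination for retry. Logstash's built-in dead letter queue only captures events rejected by the elasticsearch output, so it will not pick up failed lookups on its own.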
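One way to set failed lookups aside is a conditional in the output section. The sketch below assumes a hypothetical file path and index name, and writes tagged events to a newline-delimited JSON file that a separate pipeline could replay later.

```logstash
output {
  if "_elasticsearch_lookup_failure" in [tags] {
    # Park unenriched events for later replay (path is hypothetical)
    file {
      path => "/var/log/logstash/lookup_failures-%{+YYYY-MM-dd}.ndjson"
      codec => json_lines
    }
  } else {
    # Enriched events continue to the normal destination
    elasticsearch {
      hosts => [ "https://es.example.com:9200" ]
      index => "app-events-%{+YYYY.MM.dd}"
    }
  }
}
```

If a custom tag_on_failure value is configured on the filter, the conditional needs to test for that tag instead.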
Performance Notes
The elasticsearch filter is single-threaded per pipeline worker - increasing pipeline.workers is the main scaling lever. Connection pooling is built in; the plugin reuses connections per worker. For hot lookups, set enable_sort => false to skip the sort cost, and use term queries via query_template rather than Lucene query strings to avoid query parsing overhead.
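A sketch of a lookup with sorting switched off, assuming the query matches at most one document so result order does not matter. It reuses the user_profiles index and field names from the earlier examples.

```logstash
filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "user_profiles"
    query => "user_id:%{[user][id]}"
    # Skip the default @timestamp:desc sort; safe only when order is irrelevant
    enable_sort => false
    result_size => 1
    fields => { "department" => "[user][department]" }
  }
}
```

Leave sorting on for lookups such as "most recent login", where the first result depends on the order.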
The most effective optimization is avoiding the filter: if the dataset fits in memory, translate is faster; if the data lives in a CSV file, translate can load it directly through dictionary_path; if you need one reference doc per event, an enrich processor in an Elasticsearch ingest pipeline runs server-side and avoids the round trip.
Monitoring Logstash Elasticsearch Filter Lookups with Pulse
Pulse is the only tool built specifically for monitoring and optimizing Logstash pipelines. The elasticsearch filter quietly becomes the slowest part of a pipeline whenever the target index grows, gets rebalanced, or starts being hammered by another consumer - and the symptom is usually "pipeline throughput dropped overnight." Pulse tracks per-filter latency and failure rates, correlates them with the Elasticsearch cluster's search latency on the target index, and tells you whether to scale Logstash workers or fix the index, instead of guessing.
Frequently Asked Questions
Q: Can the Logstash Elasticsearch filter update documents in Elasticsearch?
A: No. The elasticsearch filter only reads. To update or upsert documents, use the elasticsearch output plugin in the same pipeline with action => "update", adding doc_as_upsert => true if documents that do not exist yet should be created.
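A minimal sketch of the write path through the elasticsearch output. It assumes the profile documents use the user ID as their _id; the writer account and keystore key are hypothetical. The output's action setting takes index, create, update, or delete, and upsert behavior comes from doc_as_upsert.

```logstash
output {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "user_profiles"
    # Partial update of an existing document
    action => "update"
    # Assumes documents are keyed by user ID
    document_id => "%{[user][id]}"
    # Create the document from the event if it does not exist yet
    doc_as_upsert => true
    user => "logstash_writer"
    password => "${ES_WRITER_PASSWORD}"
  }
}
```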
Q: How can I cache lookups in the Logstash Elasticsearch filter?
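A: The plugin does not cache results. For small reference datasets, swap to the translate filter, which loads the dataset into memory and resolves each lookup in microseconds. The memcached filter is an option when a shared external cache is acceptable; it still makes a network call per event, but to a store that is typically much faster than a search request. For larger datasets, run the lookup in an Elasticsearch ingest pipeline's enrich processor instead.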
Q: What is the performance impact of the Logstash Elasticsearch filter?
A: Every event makes one HTTP request to Elasticsearch. At 10 ms per lookup and a pipeline running 8 workers, max throughput is ~800 events/sec across the filter, regardless of how fast the rest of the pipeline is. Use it for low-volume enrichment or pre-load the data with translate.
Q: How do I handle authentication with the Logstash Elasticsearch filter?
A: Use user and password for basic auth, or api_key for token auth. Store credentials in the Logstash keystore (bin/logstash-keystore add ES_LOOKUP_PASSWORD) and reference them as ${ES_LOOKUP_PASSWORD} in the config.
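A sketch of the API key variant, assuming a key has been stored in the keystore under the hypothetical name ES_LOOKUP_API_KEY in id:api_key form.

```logstash
filter {
  elasticsearch {
    hosts => [ "https://es.example.com:9200" ]
    index => "user_profiles"
    query => "user_id:%{[user][id]}"
    fields => { "department" => "[user][department]" }
    # Resolved from the Logstash keystore at startup; value format is "id:api_key"
    api_key => "${ES_LOOKUP_API_KEY}"
    # API key auth is sent over HTTPS
    ssl => true
  }
}
```

Use either api_key or user/password in a given filter block, not both.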
Q: What happens when the Logstash Elasticsearch filter query returns no results?
A: The event is tagged with _elasticsearch_lookup_failure (configurable via tag_on_failure) and passes through without enrichment. The same tag is added on connection errors, so check the Logstash logs to distinguish "no match" from "Elasticsearch down."
Q: Can the Logstash Elasticsearch filter use multiple hosts?
A: Yes. The hosts array supports multiple endpoints, and the plugin distributes requests across them. Unlike the elasticsearch output, the filter does not appear to expose a sniffing option, so list each node (or a load balancer address) explicitly rather than relying on auto-discovery.
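A short sketch with several endpoints; the hostnames are hypothetical.

```logstash
filter {
  elasticsearch {
    # Requests are spread across the listed nodes
    hosts => [
      "https://es1.example.com:9200",
      "https://es2.example.com:9200",
      "https://es3.example.com:9200"
    ]
    index => "user_profiles"
    query => "user_id:%{[user][id]}"
    fields => { "department" => "[user][department]" }
  }
}
```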
Related Reading
- Logstash Translate Filter Plugin: in-memory lookups, far faster for static reference data.
- Logstash GeoIP Filter Plugin: IP enrichment without hitting Elasticsearch.
- Logstash JDBC Streaming Filter: same pattern against a SQL database.
- Logstash Pipeline is Blocked Error: the elasticsearch filter is a frequent cause.
- Logstash Cannot Connect to Redis: similar connectivity issues for cache-backed lookups.