Logstash Elasticsearch Filter Plugin

The Logstash Elasticsearch filter performs a per-event query against an Elasticsearch index and copies fields from the matched document into the current event. It is the standard way to enrich streaming events with reference data already stored in Elasticsearch - user profiles, IP allowlists, asset inventories. The filter is read-only; use the elasticsearch output plugin if you need to write. Each event triggers an HTTP request and the filter keeps no result cache of its own, so it is usually the slowest enrichment option in Logstash; high-volume pipelines generally need an in-memory alternative such as translate.

Syntax

filter {
  elasticsearch {
    hosts          => [ "https://es.example.com:9200" ]
    index          => "user_profiles"
    query          => "user_id:%{[user][id]}"
    fields         => { "department" => "[user][department]" }
    user           => "logstash_lookup"
    password       => "${ES_LOOKUP_PASSWORD}"
    ssl            => true
    result_size    => 1
    tag_on_failure => [ "_elasticsearch_lookup_failure" ]
  }
}

Parameters

Name              Type     Required  Default                             Description
hosts             array    no        ["localhost:9200"]                  Elasticsearch hosts to query.
index             string   no        "" (all)                            Index pattern to search.
query             string   no        none                                Lucene query string with %{field} substitutions. Mutually exclusive with query_template.
query_template    string   no        none                                Path to a JSON template containing a full DSL query.
fields            hash     no        {}                                  Map of "source_field" => "target_field" to copy from the matched doc.
sort              string   no        "@timestamp:desc"                   Sort applied when more than one doc matches.
result_size       number   no        1                                   Number of results to fetch per query.
enable_sort       boolean  no        true                                Whether to sort results. Set to false to skip sorting, which is faster on large indices.
user / password   string   no        none                                Basic auth credentials.
api_key           string   no        none                                API key alternative to user/password.
ssl               boolean  no        false                               Use HTTPS.
ca_file           string   no        none                                CA cert for TLS verification.
tag_on_failure    array    no        ["_elasticsearch_lookup_failure"]   Tags added on connection error or zero results.

Examples

Enrich application events with the user's department, looked up from a profile index:

filter {
  elasticsearch {
    hosts  => [ "https://es.example.com:9200" ]
    index  => "user_profiles"
    query  => "user_id:%{[user][id]}"
    fields => {
      "department"  => "[user][department]"
      "manager_id"  => "[user][manager_id]"
    }
  }
}

Use a DSL query template for richer matching (term is faster than the Lucene parser for exact lookups):

/etc/logstash/templates/asset_lookup.json:

{
  "size": 1,
  "query": {
    "term": { "hostname.keyword": "%{[host][name]}" }
  }
}

filter {
  elasticsearch {
    hosts          => [ "https://es.example.com:9200" ]
    index          => "asset_inventory"
    query_template => "/etc/logstash/templates/asset_lookup.json"
    fields         => { "owner_team" => "[asset][owner_team]" }
  }
}

Look up the most recent successful login for an event and tag if none found:

filter {
  elasticsearch {
    hosts          => [ "https://es.example.com:9200" ]
    index          => "auth-logs-*"
    query          => "user_id:%{[user][id]} AND status:success"
    sort           => "@timestamp:desc"
    fields         => { "@timestamp" => "[user][last_login]" }
    tag_on_failure => [ "_no_prior_login" ]
  }
}

Common Issues

Every event triggers a synchronous HTTP request, which makes this filter the slowest enrichment in Logstash. A pipeline handling 5,000 events/sec with a 10 ms lookup latency must keep 50 lookups in flight at once (5,000 × 0.010 s), which means 50 worker threads just to keep up - and most pipelines run 4-8 workers per host. Profile carefully before deploying this filter at scale.

The filter does not cache results across events. If you are looking up a small, slow-changing reference dataset (under a few thousand entries), pre-load it into memory with the translate filter instead. Translate resolves each lookup in-process without a network round trip, so it is typically orders of magnitude faster.
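
A minimal sketch of the translate alternative, assuming the user-to-department mapping has been exported to a YAML dictionary file. The dictionary path and the fallback value are hypothetical, and the source/target option names apply to recent versions of the translate plugin (older releases used field/destination), so check the version you run.

```logstash
filter {
  translate {
    # Event field whose value is used as the dictionary key
    source           => "[user][id]"
    # Event field that receives the matched value
    target           => "[user][department]"
    # Hypothetical path: a YAML file of "user_id: department" pairs
    dictionary_path  => "/etc/logstash/dictionaries/departments.yml"
    # Re-read the file every 300 seconds to pick up changes
    refresh_interval => 300
    # Value written when the key is not in the dictionary (assumed placeholder)
    fallback         => "unknown"
  }
}
```

The trade-off is freshness: the dictionary is only as current as the last export and reload, whereas the elasticsearch filter always reads live data.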

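When the query matches multiple documents, only the first one is used at the default result_size of 1, with "first" decided by the sort order (@timestamp:desc by default). For deterministic enrichment, narrow the query with term filters so that it matches a single document and keep result_size => 1.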

A network blip or Elasticsearch slowdown causes lookup failures, which add the _elasticsearch_lookup_failure tag and let the event pass through unenriched. Downstream consumers then see partial data. Route tagged events to a separate dead-letter destination (a file or a dedicated index) so they can be retried later.
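
One way to route failed lookups is a conditional in the output section, sketched here. Logstash's built-in dead letter queue collects events rejected by the elasticsearch output rather than events tagged by filters, so this sketch uses a file output as a stand-in; the file path and the index name are hypothetical.

```logstash
output {
  if "_elasticsearch_lookup_failure" in [tags] {
    # Park unenriched events for later replay (hypothetical path)
    file {
      path  => "/var/log/logstash/lookup_failures-%{+YYYY-MM-dd}.ndjson"
      codec => json_lines
    }
  } else {
    # Enriched events continue to the normal destination (hypothetical index)
    elasticsearch {
      hosts => [ "https://es.example.com:9200" ]
      index => "app-events-%{+YYYY.MM.dd}"
    }
  }
}
```

A separate pipeline can later read the parked file, retry the lookup, and index the result.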

Performance Notes

The elasticsearch filter is single-threaded per pipeline worker - increasing pipeline.workers is the main scaling lever. Connection pooling is built in; the plugin reuses connections per worker. For hot lookups, set enable_sort => false to skip the sort cost, and use term queries via query_template rather than Lucene query strings to avoid query parsing overhead.
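
As a sketch of the enable_sort tuning, reusing the user_profiles lookup from the examples above. It assumes user_id is unique in the index, since without a sort the choice of document is undefined when several match.

```logstash
filter {
  elasticsearch {
    hosts       => [ "https://es.example.com:9200" ]
    index       => "user_profiles"
    query       => "user_id:%{[user][id]}"
    # Skip the sort; safe only when at most one document can match
    enable_sort => false
    result_size => 1
    fields      => { "department" => "[user][department]" }
  }
}
```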

The most effective optimization is avoiding the filter: if the dataset fits in memory, translate is faster; if the data lives in a CSV file, translate can load it directly as a dictionary through dictionary_path; if you have one reference doc per event, an enrich processor in an Elasticsearch ingest pipeline runs server-side and avoids the round trip.

Monitoring Logstash Elasticsearch Filter Lookups with Pulse

Pulse is the only tool built specifically for monitoring and optimizing Logstash pipelines. The elasticsearch filter quietly becomes the slowest part of a pipeline whenever the target index grows, gets rebalanced, or starts being hammered by another consumer - and the symptom is usually "pipeline throughput dropped overnight." Pulse tracks per-filter latency and failure rates, correlates them with the Elasticsearch cluster's search latency on the target index, and tells you whether to scale Logstash workers or fix the index, instead of guessing.

Frequently Asked Questions

Q: Can the Logstash Elasticsearch filter update documents in Elasticsearch?
A: No. The elasticsearch filter only reads. To update or upsert documents, use the elasticsearch output plugin in the same pipeline with action => "update", and add doc_as_upsert => true if missing documents should be created.
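
A sketch of the write path using the elasticsearch output plugin. It assumes the event's [user][id] value is also the _id of the profile document, which may not hold in your index.

```logstash
output {
  elasticsearch {
    hosts         => [ "https://es.example.com:9200" ]
    index         => "user_profiles"
    # Assumption: the user id doubles as the document _id
    document_id   => "%{[user][id]}"
    # Partial update of an existing document
    action        => "update"
    # Create the document from the event if it does not exist yet
    doc_as_upsert => true
  }
}
```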

Q: How can I cache lookups in the Logstash Elasticsearch filter?
A: The plugin does not cache results. For small reference datasets, swap to the translate filter, which loads the dataset into memory and resolves each lookup in microseconds, or to the memcached filter, which reads values from an external Memcached cache. For larger datasets, run the lookup in an Elasticsearch ingest pipeline's enrich processor instead.

Q: What is the performance impact of the Logstash Elasticsearch filter?
A: Every event makes one HTTP request to Elasticsearch. At 10 ms per lookup and a pipeline running 8 workers, max throughput is ~800 events/sec across the filter, regardless of how fast the rest of the pipeline is. Use it for low-volume enrichment or pre-load the data with translate.

Q: How do I handle authentication with the Logstash Elasticsearch filter?
A: Use user and password for basic auth, or api_key for token auth. Store credentials in the Logstash keystore (bin/logstash-keystore add ES_LOOKUP_PASSWORD) and reference them as ${ES_LOOKUP_PASSWORD} in the config.
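
A sketch of the api_key variant. The keystore entry name ES_LOOKUP_API_KEY is hypothetical and would be created the same way as the password entry; the stored value is expected in "id:api_key" form.

```logstash
filter {
  elasticsearch {
    hosts   => [ "https://es.example.com:9200" ]
    index   => "user_profiles"
    query   => "user_id:%{[user][id]}"
    fields  => { "department" => "[user][department]" }
    # Resolved from the Logstash keystore at startup (hypothetical key name)
    api_key => "${ES_LOOKUP_API_KEY}"
    # API key credentials should only travel over HTTPS
    ssl     => true
  }
}
```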

Q: What happens when the Logstash Elasticsearch filter query returns no results?
A: The event is tagged with _elasticsearch_lookup_failure (configurable via tag_on_failure) and passes through without enrichment. The same tag is added on connection errors, so check the Logstash logs to distinguish "no match" from "Elasticsearch down."

Q: Can the Logstash Elasticsearch filter use multiple hosts?
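A: Yes. The hosts array accepts multiple endpoints, and the plugin round-robins requests between them. Node discovery (sniffing) is an option of the elasticsearch output plugin; confirm in the plugin documentation for your version that the filter supports it before setting sniffing => true.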
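
A sketch of a multi-host lookup. The node names es1 and es2 are hypothetical; the remaining settings repeat the earlier user_profiles example.

```logstash
filter {
  elasticsearch {
    # Hypothetical node names; list every node that should serve lookups
    hosts  => [ "https://es1.example.com:9200", "https://es2.example.com:9200" ]
    index  => "user_profiles"
    query  => "user_id:%{[user][id]}"
    fields => { "department" => "[user][department]" }
  }
}
```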
