Elasticsearch percolator Field Data Type

The Elasticsearch percolator field type stores Elasticsearch Query DSL bodies as indexed documents. At search time, a percolate query takes an incoming document and returns every stored query that matches it - the reverse of normal search, where one query matches many documents. The pattern fits alerting, content classification, saved-search notifications, and any workload where a relatively stable set of queries is evaluated against an ever-changing stream of documents.

How the Percolator Field Works

A percolator-typed field accepts an Elasticsearch query JSON object. When you index a document with such a field, Elasticsearch parses the query, extracts the unique terms it would match (a "query terms" index), and stores the original query bytes alongside. A percolate query at search time first uses the terms index to prune the candidate stored queries (only queries whose terms can possibly match the incoming document survive), then evaluates each surviving query against the incoming document to confirm.

The percolator's efficiency depends on the term-extraction phase eliminating most stored queries before per-document re-evaluation. For straightforward match, term, and bool queries the candidate set shrinks dramatically. Queries with wildcards, regular expressions, or scripts cannot have their terms extracted and are checked against every incoming document - the cost is closer to running all stored queries serially.
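The two phases described above can be sketched as a toy model in Python. This is not how Elasticsearch or Lucene implements the percolator; the `ToyPercolator` class, the `match_all_terms` and `regex` query kinds, and the whitespace-style tokenizer are all hypothetical simplifications chosen to make the pruning step visible.

```python
import re

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a real analyzer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class ToyPercolator:
    """Illustrative only: a term index for pruning plus a verification pass."""

    def __init__(self):
        self.queries = {}        # query id -> (kind, field, value)
        self.terms_index = {}    # (field, term) -> ids of queries containing that term
        self.unprunable = set()  # queries with no extractable terms

    def store(self, qid, kind, field, value):
        self.queries[qid] = (kind, field, value)
        if kind == "match_all_terms":
            for term in tokenize(value):
                self.terms_index.setdefault((field, term), set()).add(qid)
        else:
            # e.g. "regex": no terms to extract, so it is always a candidate
            self.unprunable.add(qid)

    def percolate(self, doc):
        # Phase 1: prune. Keep queries that share a term with the document,
        # plus every query that could not be indexed by term.
        candidates = set(self.unprunable)
        for field, text in doc.items():
            for term in tokenize(text):
                candidates |= self.terms_index.get((field, term), set())
        # Phase 2: verify each surviving candidate against the document.
        matches = [qid for qid in sorted(candidates) if self._verify(qid, doc)]
        return matches, len(candidates)

    def _verify(self, qid, doc):
        kind, field, value = self.queries[qid]
        text = doc.get(field, "")
        if kind == "match_all_terms":
            return tokenize(value) <= tokenize(text)
        return re.search(value, text) is not None

p = ToyPercolator()
p.store("q1", "match_all_terms", "body", "kubernetes outage")
p.store("q2", "match_all_terms", "body", "database latency")
p.store("q3", "regex", "headline", r"down(time)?")

matches, n_candidates = p.percolate({
    "headline": "Major cluster downtime reported",
    "body": "Engineers report a kubernetes outage affecting two regions",
})
print(matches, n_candidates)  # ['q1', 'q3'] 2
```

In this sketch `q2` never reaches verification because none of its terms occur in the document, while the regex query `q3` is a candidate for every document. That is the cost difference the paragraph above describes.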

Example

PUT /alerts
{
  "mappings": {
    "properties": {
      "query":    { "type": "percolator" },
      "headline": { "type": "text" },
      "body":     { "type": "text" }
    }
  }
}

POST /alerts/_doc/1
{
  "query": {
    "match": { "body": "kubernetes outage" }
  }
}

GET /alerts/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document": {
        "headline": "Major cluster downtime reported",
        "body": "Engineers report a kubernetes outage affecting two regions"
      }
    }
  }
}

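The search returns stored query 1 because its match clause matches the document's body field. A match query defaults to OR between its analyzed terms, so either "kubernetes" or "outage" in body would have been enough.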

Percolator Field Configuration and Variants

| Variant | Use case | Notes |
|---|---|---|
| percolate query with inline document | One-shot matching | Send the document directly in the query body |
| percolate query with index/id | Match a document already in another index | Avoids resending document bytes |
| Batched percolate (documents array) | Match many docs against the same stored queries | Reduces per-doc overhead |
| Highlighting on percolator matches | Show which parts of the incoming document each stored query matched | In a batch, use _percolator_document_slot to identify which input docs each stored query matched |
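As a rough illustration of the first three variants, the request bodies can be built as Python dictionaries that mirror the JSON sent to `_search`. The helper names, the `news` index, and the document id `42` are hypothetical. The sample hits are hand-written to show the shape of the `_percolator_document_slot` field, not captured from a real cluster.

```python
import json

def percolate_inline(field, document):
    """Body for percolating one document sent inline."""
    return {"query": {"percolate": {"field": field, "document": document}}}

def percolate_stored(field, index, doc_id):
    """Body for percolating a document that already lives in another index."""
    return {"query": {"percolate": {"field": field, "index": index, "id": doc_id}}}

def percolate_batch(field, documents):
    """Body for percolating several documents in one request."""
    return {"query": {"percolate": {"field": field, "documents": documents}}}

def matches_by_slot(hits):
    """Map each input document's position to the ids of stored queries that matched it."""
    by_slot = {}
    for hit in hits:
        for slot in hit.get("fields", {}).get("_percolator_document_slot", []):
            by_slot.setdefault(slot, []).append(hit["_id"])
    return by_slot

# "news" and "42" are placeholder values for an existing index and document id.
print(json.dumps(percolate_stored("query", "news", "42"), indent=2))

# Illustrative hits only: stored query "1" matched input docs 0 and 2,
# stored query "7" matched input doc 2.
sample_hits = [
    {"_id": "1", "fields": {"_percolator_document_slot": [0, 2]}},
    {"_id": "7", "fields": {"_percolator_document_slot": [2]}},
]
print(matches_by_slot(sample_hits))  # {0: ['1'], 2: ['1', '7']}
```

Each dictionary would be sent as the body of `GET /alerts/_search`, the same endpoint used in the example above.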

Field-level config: the percolator field accepts no parameters beyond the mapping type. An index that holds a percolator field should mirror the mapping of the documents you intend to match. Stored queries are parsed against the percolator index's own mapping when they are indexed, so every field a stored query references must be defined there with the same type and analyzer as in the source index.

Common Pitfalls with the Percolator

  1. Unbounded stored-query growth. Each saved query is a document; tens of thousands of them in a single index work, but hundreds of thousands begin to push merge and search budgets. Plan rollover or tiered storage.
  2. Wildcard/regex/script queries bypass term extraction, forcing per-document evaluation against every stored query. Avoid them in high-throughput percolator workloads.
  3. Mapping drift between the percolator index and the source index. The stored queries must reference fields with the types and analyzers they assumed at storage time. Changing those after the fact silently breaks matches.
  4. Missing field mappings in the percolator index. The percolator index usually holds no documents of the source type, yet it still needs the mapping for every field that stored queries reference in order to interpret them. Copy the mappings of those fields into the percolator index's mapping.
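  5. Highlighting expectations. Percolator highlights are applied to the incoming document: each hit shows which fragments of the percolated document that stored query matched. They do not mark which clause inside the stored query fired; working that out requires inspecting the stored query yourself.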
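Pitfalls 3 and 4 can be caught with a simple comparison of the two mappings. The sketch below is a minimal, hypothetical checker: `find_mapping_drift` is not an Elasticsearch API, it only compares top-level fields, and it assumes you pass in the `properties` objects taken from each index's `GET /<index>/_mapping` response.

```python
def find_mapping_drift(source_props, percolator_props, fields):
    """Return {field: problem} for referenced fields whose type or analyzer differ.

    source_props / percolator_props: the "properties" dicts of the two mappings.
    fields: names of the fields that stored queries reference.
    """
    problems = {}
    for name in fields:
        src = source_props.get(name)
        dst = percolator_props.get(name)
        if src is None:
            problems[name] = "missing in source mapping"
        elif dst is None:
            problems[name] = "missing in percolator mapping"
        else:
            for key in ("type", "analyzer"):
                if src.get(key) != dst.get(key):
                    problems[name] = f"{key} differs: {src.get(key)!r} vs {dst.get(key)!r}"
                    break
    return problems

# Example mappings; the "english" analyzer on the source side is an assumption
# made up for this illustration.
source_props = {
    "headline": {"type": "text"},
    "body": {"type": "text", "analyzer": "english"},
}
percolator_props = {
    "query": {"type": "percolator"},
    "body": {"type": "text"},
}

drift = find_mapping_drift(source_props, percolator_props, ["body", "headline"])
for field, problem in drift.items():
    print(f"{field}: {problem}")
# body: analyzer differs: 'english' vs None
# headline: missing in percolator mapping
```

Running a check like this in CI or before each mapping change is one way to notice drift before stored queries start missing documents.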

Operating a Percolator Workload

Percolator performance is sensitive to the same factors as normal search: heap pressure, segment count, and shard sizing. With stored-query indices of moderate size (tens of thousands), normal Elasticsearch monitoring metrics (search latency, heap, segment count) cover it. Beyond that, watch the percolate query's took time and the ratio of candidate queries (selected by term extraction) to actual matches.

For Elasticsearch and OpenSearch clusters running percolator workloads, Pulse tracks per-percolator-index query latency, candidate-set growth, and stored-query count over time. Pulse's monitoring flags slowdowns from mapping drift, wildcard-query buildup, or oversized percolator indices before they affect downstream alerting pipelines.

Frequently Asked Questions

Q: What is the percolator field type used for in Elasticsearch?
A: The percolator field stores Elasticsearch queries as indexed documents so you can match incoming documents against a saved query set - the inverse of normal search. Typical uses are real-time alerting, content classification, saved-search notifications, and rule engines where a large, fairly stable set of rules (queries) is checked against each incoming event.

Q: How does the percolate query find matching saved queries efficiently?
A: When you index a query into a percolator field, Elasticsearch extracts the terms it would match and stores them as a normal Lucene terms index. At percolate time, the incoming document's terms are used to prune the candidate stored-query set - only queries whose required terms appear in the document are re-evaluated. Wildcard, regex, and script queries cannot be pruned this way.

Q: How many stored queries can I keep in a percolator index?
A: Tens of thousands work without special tuning on a standard hot-tier shard. Beyond ~100,000, plan to split the stored queries across multiple shards or indices, review index.queries.cache.enabled, and keep segment counts low. Avoid wildcard/regex stored queries at high scale: each one is evaluated against every incoming document, so percolate latency grows roughly linearly with their number.

Q: Can I combine percolate with other queries in a single search?
A: Yes. Wrap percolate inside bool.must or bool.filter and add other clauses normally. This is useful for narrowing the candidate stored queries by metadata (e.g., only "active" rules) before percolating.
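A minimal sketch of such a combined request body, written as a Python dictionary that mirrors the JSON. The `status` field and its `active` value are hypothetical metadata that would have to be mapped (for example as a keyword field) and indexed alongside each stored query.

```python
import json

def percolate_active_rules(document, field="query"):
    """Body that percolates `document` only against stored queries marked active."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Hypothetical metadata field stored next to each query.
                    {"term": {"status": "active"}},
                    {"percolate": {"field": field, "document": document}},
                ]
            }
        }
    }

body = percolate_active_rules({"body": "Engineers report a kubernetes outage"})
print(json.dumps(body, indent=2))
```

Both clauses sit in filter context here, so hits are not scored. Moving the percolate clause to bool.must is an option if relevance ordering of the matching stored queries matters.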

Q: Do I need to copy the source document's mapping into the percolator index?
A: For any field referenced by a stored query, yes. The percolator must know the field's type and analyzer to evaluate the stored query against an incoming document. Use a component template or shared mapping source to keep them in sync.

Q: How do I delete or update a stored query?
A: A stored query is a normal document in the percolator index. Update it via the _update API or delete by _id. There is no special API for stored queries beyond standard CRUD.
