What is the Elasticsearch Query Cache? Node-Level Filter Caching Explained

The Elasticsearch query cache (formally the node query cache) is a per-node LRU cache of bitset results for filter clauses, keyed by the filter and the Lucene segment it ran against. When a query reuses the same filter against the same segment, Elasticsearch returns the cached docID bitset instead of re-evaluating. The cache is enabled by default, sized at 10% of JVM heap, and lives entirely on the heap.

How the Query Cache Works

When a search arrives, Elasticsearch decomposes it into clauses. Filter clauses (those that don't contribute to relevance scoring - typically inside filter or must_not of a bool query) are candidates for caching. The cache key is the combination of the filter definition and the segment ID; the cached value is the docID bitset that the filter produced against that segment.
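The key/value structure described above can be pictured with a toy model. This is a sketch of the idea only, not Elasticsearch's or Lucene's implementation: the class name `ToyQueryCache`, the entry-count limit, the serialized filter string used as a key, and the use of a `frozenset` of doc IDs in place of a real bitset are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyQueryCache:
    """Toy LRU cache keyed by (filter, segment). Illustrative only."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # (filter_key, segment_id) -> frozenset of doc IDs
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get_or_compute(self, filter_key, segment_id, evaluate):
        key = (filter_key, segment_id)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        doc_ids = frozenset(evaluate())  # run the filter against the segment
        self.entries[key] = doc_ids
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
            self.evictions += 1
        return doc_ids


cache = ToyQueryCache(max_entries=2)
range_filter = '{"range": {"price": {"gte": 50}}}'  # hypothetical filter, serialized as the key
cache.get_or_compute(range_filter, "seg_0", lambda: [1, 4, 7])  # miss: evaluated and stored
cache.get_or_compute(range_filter, "seg_0", lambda: [1, 4, 7])  # hit: same filter, same segment
cache.get_or_compute(range_filter, "seg_1", lambda: [2, 3])     # miss: new segment, new key
print(cache.hits, cache.misses, cache.evictions)
```

The same filter against a different segment is a separate entry, which is why new segments always start cold.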

Caching has heuristics built in. Filters under a certain cost threshold are not cached (the overhead exceeds the benefit). Segments are cached only if they hold at least 10,000 documents and at least 3% of the total documents in the shard; smaller segments are skipped. These thresholds keep the cache focused on filters that are both expensive and reused.
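The segment-size thresholds can be expressed as a small predicate. This is a simplified model rather than the actual caching policy code: the function and constant names are hypothetical, and it assumes the 3% is measured against the shard's total document count, as the Elasticsearch documentation describes it.

```python
MIN_SEGMENT_DOCS = 10_000   # minimum documents in the segment
MIN_SHARD_FRACTION = 0.03   # segment must hold at least 3% of the shard's documents


def segment_is_cacheable(segment_docs, shard_docs):
    """Return True if a segment passes both size thresholds (simplified model)."""
    if shard_docs <= 0:
        return False
    return (segment_docs >= MIN_SEGMENT_DOCS
            and segment_docs / shard_docs >= MIN_SHARD_FRACTION)


# Hypothetical shard with 1,000,000 documents and segments of varying size.
shard_docs = 1_000_000
for segment_docs in (5_000, 20_000, 50_000):
    print(segment_docs, segment_is_cacheable(segment_docs, shard_docs))
```

In this example only the 50,000-document segment qualifies: the 5,000-document one is below the absolute minimum, and the 20,000-document one is only 2% of the shard.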

Crucially, the query cache only caches filters, not full query results. That role belongs to the shard request cache, which caches the aggregation/hits result for the entire request.
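To show which part of a request the query cache can act on, here is a sketch that builds a search body in Python. The field names (`title`, `status`, `price`) and the helper `build_search_body` are hypothetical; the clause types are standard query DSL. Whether a given filter actually gets cached still depends on the heuristics described above.

```python
import json


def build_search_body(text, statuses, min_price):
    """Build a bool query that separates scoring clauses from filter clauses."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to _score, so it is not a
                # candidate for the query cache.
                "must": [{"match": {"title": text}}],
                # Filter context: yes/no matching only, so these clauses
                # are candidates for the query cache.
                "filter": [
                    {"terms": {"status": statuses}},
                    {"range": {"price": {"gte": min_price}}},
                ],
            }
        }
    }


body = build_search_body("wireless headphones", ["active", "backorder"], 50)
print(json.dumps(body, indent=2))
```

Two searches with different `match` text but the same `filter` clauses can reuse the same cached bitsets, even though their overall results differ.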

Query Cache vs Request Cache vs Fielddata Cache

| Cache | Caches | Scope | Default | Key |
|---|---|---|---|---|
| Node query cache | Filter bitsets | Per node, per segment | 10% of heap | Filter + segment |
| Shard request cache | Aggregation results, hits.total (size: 0 queries) | Per shard | 1% of heap | Full request body |
| Fielddata cache | Field values for sorting/aggs on text fields | Per node | 40% breaker limit | Field + segment |

The three caches interact: a size: 0 aggregation query may hit the request cache (full result) or fall through to the query cache (filter bitsets) on a miss. Most production tuning focuses on the request cache for dashboard queries and the query cache for high-volume filtered searches.

Configuration

| Setting | Default | Description |
|---|---|---|
| indices.queries.cache.size | 10% | Node-wide cache size (heap %) |
| indices.queries.cache.count | 10000 | Max number of cached entries |
| index.queries.cache.enabled | true | Per-index enable/disable |

Change the per-node size in elasticsearch.yml:

indices.queries.cache.size: 15%

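This is a static setting - it requires a node restart. Disabling caching for a specific index is rarely needed. index.queries.cache.enabled is a static index setting, so it can only be set at index creation or on a closed index:

POST /my-index/_close
PUT /my-index/_settings
{
  "index.queries.cache.enabled": false
}
POST /my-index/_open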

There is no per-request switch for the query cache. The ?request_cache=false parameter disables only the shard request cache for that search; to keep a clause out of the query cache, avoid placing it in a cacheable filter context.

When to Tune the Query Cache

The default 10% of heap fits most workloads. Cases where tuning helps:

  • Hit ratio is consistently low (<20%). Either the filters aren't being reused, or the cache is too small. Check indices.query_cache stats per node.
  • High eviction rate with high traffic suggests the working set is bigger than the cache. Bumping to 15-20% can help, at the cost of less heap for queries.
  • Frequent refresh on a hot index. Each refresh creates new segments, and cached bitsets for old segments become useless as merges retire them. Short refresh intervals plus large indices = low cache effectiveness. See refresh interval.
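Because the counters in the cache stats are cumulative, judging hit ratio or eviction rate over a recent window means comparing two samples. The sketch below assumes two dicts shaped like the `query_cache` object in the node stats response, taken from the same node with no restart or shard relocation in between; the helper name `cache_deltas` and all numbers are made up for illustration.

```python
def cache_deltas(earlier, later):
    """Compare two query_cache stats samples from the same node.

    The counters are cumulative, so the activity in the window is the
    difference between the later and the earlier sample.
    """
    hits = later["hit_count"] - earlier["hit_count"]
    misses = later["miss_count"] - earlier["miss_count"]
    evictions = later["evictions"] - earlier["evictions"]
    lookups = hits + misses
    return {
        "hit_ratio": hits / lookups if lookups else None,
        "evictions": evictions,
        "evictions_per_lookup": evictions / lookups if lookups else None,
    }


# Made-up samples taken some minutes apart; the numbers are illustrative only.
sample_a = {"hit_count": 1_000, "miss_count": 4_000, "evictions": 200}
sample_b = {"hit_count": 1_300, "miss_count": 5_700, "evictions": 900}
print(cache_deltas(sample_a, sample_b))
```

In this made-up window the hit ratio is 15% with many evictions per lookup, the kind of pattern that points to a working set larger than the cache or to filters that are not being reused.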

Don't tune the cache to "fix" slow queries that don't benefit from filter caching (sorting, aggregations on high-cardinality fields, queries that re-score with different scoring functions). Use the shard request cache and proper mapping instead.

Monitoring Query Cache Performance

Check stats via the nodes stats API (per node) or the indices stats API (per index):

GET /_nodes/stats/indices/query_cache
GET /_stats/query_cache?human

Key metrics:

  • hit_count / miss_count - hit ratio = hit / (hit + miss). >50% indicates the cache is earning its heap.
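  • cache_size and cache_count - cache_size is the number of entries currently in the cache; cache_count is the cumulative number of entries ever added, including those since evicted.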
  • evictions - if growing rapidly, cache is undersized for the workload.
  • memory_size_in_bytes - actual memory used.
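As a sketch of how the hit ratio can be derived from the nodes stats response, the function below walks a parsed JSON response and computes hit / (hit + miss) per node. The helper name and the sample data are hypothetical, and the sample is heavily trimmed; a real response carries many more fields.

```python
def hit_ratios_by_node(nodes_stats):
    """Map node name -> query cache hit ratio from a parsed nodes stats response."""
    ratios = {}
    for node_id, node in nodes_stats["nodes"].items():
        stats = node["indices"]["query_cache"]
        lookups = stats["hit_count"] + stats["miss_count"]
        name = node.get("name", node_id)  # fall back to the node ID if no name is present
        ratios[name] = stats["hit_count"] / lookups if lookups else None
    return ratios


# Trimmed, made-up response for illustration only.
sample_response = {
    "nodes": {
        "node-id-1": {
            "name": "es-data-1",
            "indices": {"query_cache": {"hit_count": 600, "miss_count": 400, "evictions": 10}},
        },
        "node-id-2": {
            "name": "es-data-2",
            "indices": {"query_cache": {"hit_count": 0, "miss_count": 0, "evictions": 0}},
        },
    }
}
print(hit_ratios_by_node(sample_response))
```

A node with no lookups yet reports `None` rather than a misleading 0%.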

Pulse tracks query cache hit ratios, evictions, and memory pressure per node, and correlates cache misses with the queries that drive them. Pulse's automated analysis spots patterns where a workload is paying for cache space it doesn't benefit from - and surfaces the specific filter clauses that would gain from better caching or rewriting.

Common Query Cache Pitfalls

  1. Assuming the cache speeds up every query. It only helps reusable filter clauses on segments that meet the size thresholds.
  2. Tuning cache size up to compensate for hot-shard imbalance. The fix is shard placement, not bigger caches.
  3. Conflating query cache with shard request cache. They cache different things; tuning the wrong one wastes effort.
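  4. Forgetting that cached entries are tied to segments. Each refresh can create a new segment that starts with nothing cached, and merges discard the entries for the segments they replace. Longer refresh intervals make the cache more effective on append-only workloads.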
  5. Putting now (or anything time-dependent) inside a filter. range with now is non-deterministic and won't cache effectively; round to a fixed boundary (now/m, now/h) so the same key recurs.
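The effect of rounding in pitfall 5 can be illustrated client-side. The sketch below emulates hour rounding in Python purely to show why the key recurs; in a real request you would write date math such as now-24h/h and let Elasticsearch do the rounding. The `@timestamp` field, the 24-hour window, and both helper names are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone


def range_filter_exact(now):
    """Filter whose lower bound changes with every request (poor cache key)."""
    start = now - timedelta(hours=24)
    return {"range": {"@timestamp": {"gte": start.isoformat()}}}


def range_filter_rounded(now):
    """Filter whose lower bound is rounded down to the hour, similar in spirit to now-24h/h."""
    start = (now - timedelta(hours=24)).replace(minute=0, second=0, microsecond=0)
    return {"range": {"@timestamp": {"gte": start.isoformat()}}}


# Two hypothetical requests arriving 90 seconds apart within the same hour.
first = datetime(2024, 5, 1, 10, 15, 0, tzinfo=timezone.utc)
second = datetime(2024, 5, 1, 10, 16, 30, tzinfo=timezone.utc)

print(range_filter_exact(first) == range_filter_exact(second))
print(range_filter_rounded(first) == range_filter_rounded(second))
```

The exact filters differ between the two requests, so each would need its own cache entry; the rounded filters are identical until the hour rolls over.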

Frequently Asked Questions

Q: Is the Elasticsearch query cache enabled by default?
A: Yes. The node query cache is enabled by default, sized at 10% of JVM heap. Individual indices can disable it via index.queries.cache.enabled: false, but this is rarely necessary.

Q: What is the difference between the query cache and the request cache?
A: The query cache stores filter clause bitsets per segment, reusable across different queries that share the same filter. The shard request cache stores entire query results for size: 0 and aggregation requests. They're complementary, not interchangeable.

Q: How do I check Elasticsearch query cache hit ratio?
A: GET /_nodes/stats/indices/query_cache returns hit_count and miss_count per node. Hit ratio = hit / (hit + miss). A healthy ratio is typically 30-70% depending on workload; <20% suggests the cache isn't helping much.

Q: How much memory does the Elasticsearch query cache use?
A: Up to indices.queries.cache.size (default 10% of JVM heap), capped by indices.queries.cache.count (default 10,000 entries). The cache uses heap memory, so size matters when the heap is also serving fielddata, aggregations, and active queries.

Q: Why do my filters not get cached in Elasticsearch?
A: Three common reasons: (1) the segment holds fewer than 10,000 docs or less than 3% of the shard's documents, (2) the filter is too cheap (Elasticsearch's heuristics skip cheap filters), or (3) the filter is non-deterministic (e.g., range with now). Round time filters to a coarser boundary like now/m to make them cacheable.

Q: When should I disable the Elasticsearch query cache?
A: Almost never. The default heuristics avoid caching unprofitable filters. Disable per-index only if you've measured that the cache is consuming heap without delivering hits - usually a sign of low filter reuse, not a cache problem to fix by disabling.
