The Elasticsearch query cache (formally the node query cache) is a per-node LRU cache of bitset results for filter clauses, keyed by the filter and the Lucene segment it ran against. When a query reuses the same filter against the same segment, Elasticsearch returns the cached docID bitset instead of re-evaluating. The cache is enabled by default, sized at 10% of JVM heap, and lives entirely on the heap.
How the Query Cache Works
When a search arrives, Elasticsearch decomposes it into clauses. Filter clauses (those that don't contribute to relevance scoring - typically inside filter or must_not of a bool query) are candidates for caching. The cache key is the combination of the filter definition and the segment ID; the cached value is the docID bitset that the filter produced against that segment.
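The keying scheme can be illustrated with a toy model. The sketch below is not Elasticsearch code: `ToyQueryCache`, the filter string, and the segment names are hypothetical, and a `frozenset` of doc IDs stands in for the real bitset. It only shows how a (filter, segment) key combined with LRU eviction behaves.

```python
from collections import OrderedDict


class ToyQueryCache:
    """Illustrative LRU cache keyed by (filter, segment); not Elasticsearch code."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # (filter_key, segment_id) -> frozenset of doc IDs
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get_or_compute(self, filter_key, segment_id, evaluate):
        key = (filter_key, segment_id)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        doc_ids = frozenset(evaluate())  # stand-in for the docID bitset
        self._entries[key] = doc_ids
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1
        return doc_ids


cache = ToyQueryCache(max_entries=2)
status_filter = '{"term": {"status": "active"}}'  # hypothetical filter definition

cache.get_or_compute(status_filter, "seg_0", lambda: [1, 4, 7])  # miss
cache.get_or_compute(status_filter, "seg_0", lambda: [1, 4, 7])  # hit: same filter, same segment
cache.get_or_compute(status_filter, "seg_1", lambda: [2, 3])     # miss: same filter, new segment
print(cache.hits, cache.misses, cache.evictions)  # prints: 1 2 0
```

The same filter against a different segment is a separate entry, which is why newly created segments start cold even when the filter itself is heavily reused.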
Caching has heuristics built in. Filters under a certain cost threshold are not cached (the overhead exceeds the benefit). Segments that hold fewer than 10,000 documents, or less than 3% of the index's documents, are skipped. These thresholds keep the cache focused on filters that are both expensive and reused.
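As a rough illustration of the segment-size thresholds, the check could be sketched as follows. The function and constant names are hypothetical, the numbers are the ones quoted above, and the real decision is made inside Elasticsearch and Lucene rather than by code like this.

```python
MIN_SEGMENT_DOCS = 10_000     # document-count threshold quoted above
MIN_SEGMENT_FRACTION = 0.03   # 3% threshold quoted above


def segment_is_cache_eligible(segment_docs, index_docs):
    """Return True if a segment clears both size thresholds described above."""
    if segment_docs < MIN_SEGMENT_DOCS:
        return False
    if index_docs > 0 and segment_docs / index_docs < MIN_SEGMENT_FRACTION:
        return False
    return True


print(segment_is_cache_eligible(5_000, 100_000))     # False: under 10,000 docs
print(segment_is_cache_eligible(20_000, 1_000_000))  # False: only 2% of the index
print(segment_is_cache_eligible(50_000, 1_000_000))  # True: 5% and over 10,000 docs
```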
Crucially, the query cache only caches filters, not full query results. That role belongs to the shard request cache, which caches the aggregation/hits result for the entire request.
Query Cache vs Request Cache vs Fielddata Cache
| Cache | Caches | Scope | Default | Key |
|---|---|---|---|---|
| Node query cache | Filter bitsets | Per node, per segment | 10% of heap | Filter + segment |
| Shard request cache | Aggregation results, hits.total, size: 0 queries | Per shard | 1% of heap | Full request body |
| Fielddata cache | Field values for sorting/aggs on text fields | Per node | 40% breaker limit | Field + segment |
The three caches interact: a size: 0 aggregation query may hit the request cache (full result) or fall through to the query cache (filter bitsets) on a miss. Most production tuning focuses on the request cache for dashboard queries and the query cache for high-volume filtered searches.
Configuration
| Setting | Default | Description |
|---|---|---|
| indices.queries.cache.size | 10% | Node-wide cache size (heap %) |
| indices.queries.cache.count | 10000 | Max number of cached entries |
| index.queries.cache.enabled | true | Per-index enable/disable |
Change the per-node size in elasticsearch.yml:
indices.queries.cache.size: 15%
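This is a static setting - it requires a node restart. Disabling caching for a specific index is rarely needed. index.queries.cache.enabled is a static index setting as well, so set it at index creation or on a closed index:
POST /my-index/_close
PUT /my-index/_settings
{
  "index.queries.cache.enabled": false
}
POST /my-index/_open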
For a single search, ?request_cache=false skips caching, but it applies to the shard request cache only. To keep a particular search out of the query cache, avoid cacheable filter clauses in that request.
When to Tune the Query Cache
The default 10% of heap fits most workloads. Cases where tuning helps:
- Hit ratio is consistently low (<20%). Either the filters aren't being reused, or the cache is too small. Check indices.query_cache stats per node.
- High eviction rate with high traffic suggests the working set is bigger than the cache. Bumping to 15-20% can help, at the cost of less heap for queries.
- Frequent refresh on a hot index. Each refresh creates new segments, and cached bitsets for old segments become useless as merges retire them. Short refresh intervals plus large indices = low cache effectiveness. See refresh interval.
Don't tune the cache to "fix" slow queries that don't benefit from filter caching (sorting, aggregations on high-cardinality fields, queries that re-score with different scoring functions). Use the shard request cache and proper mapping instead.
Monitoring Query Cache Performance
Check stats via the nodes stats and indices stats APIs:
GET /_nodes/stats/indices/query_cache
GET /_stats/query_cache?human
Key metrics:
- hit_count / miss_count - hit ratio = hit / (hit + miss). >50% indicates the cache is earning its heap.
- cache_size and cache_count - entries currently in the cache, and the cumulative number of entries ever cached (including those since evicted).
- evictions - if growing rapidly, cache is undersized for the workload.
- memory_size_in_bytes - actual memory used.
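The hit-ratio formula can be applied per node to a parsed nodes stats response. The sketch below assumes the JSON has already been fetched and decoded into a dict; the helper name and the sample numbers are made up for illustration, and only the fields named above are read.

```python
def query_cache_hit_ratios(nodes_stats):
    """Map node name -> hit ratio, or None if the node has had no lookups yet."""
    ratios = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        qc = node["indices"]["query_cache"]
        lookups = qc["hit_count"] + qc["miss_count"]
        name = node.get("name", node_id)
        ratios[name] = qc["hit_count"] / lookups if lookups else None
    return ratios


# Illustrative sample shaped like GET /_nodes/stats/indices/query_cache (values invented).
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "indices": {"query_cache": {
                "memory_size_in_bytes": 52428800,
                "hit_count": 600, "miss_count": 400,
                "cache_size": 120, "cache_count": 150, "evictions": 30,
            }},
        },
        "def456": {
            "name": "node-2",
            "indices": {"query_cache": {
                "memory_size_in_bytes": 0,
                "hit_count": 0, "miss_count": 0,
                "cache_size": 0, "cache_count": 0, "evictions": 0,
            }},
        },
    }
}

ratios = query_cache_hit_ratios(sample)
print(ratios)  # {'node-1': 0.6, 'node-2': None}
```

Because the counters are cumulative, comparing two snapshots taken some minutes apart gives a more useful picture of current behavior than a single reading.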
Pulse tracks query cache hit ratios, evictions, and memory pressure per node, and correlates cache misses with the queries that drive them. Pulse's automated analysis spots patterns where a workload is paying for cache space it doesn't benefit from - and surfaces the specific filter clauses that would gain from better caching or rewriting.
Common Query Cache Pitfalls
- Assuming the cache speeds up every query. It only helps reusable filter clauses on segments that meet the size thresholds.
- Tuning cache size up to compensate for hot-shard imbalance. The fix is shard placement, not bigger caches.
- Conflating query cache with shard request cache. They cache different things; tuning the wrong one wastes effort.
- Forgetting that refresh invalidates the cache for segments that get merged away. Long refresh intervals make the cache more effective on append-only workloads.
- Putting now (or anything time-dependent) inside a filter. A range with now is non-deterministic and won't cache effectively; round to a fixed boundary (now/m, now/h) so the same key recurs.
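The last pitfall is easier to see with a small model of how a now-based bound resolves. This is a toy illustration, not Elasticsearch's date-math implementation: `resolved_range_key` is a hypothetical helper and `@timestamp` is an assumed field name.

```python
from datetime import datetime, timezone


def resolved_range_key(field, now, round_to_minute):
    """Toy stand-in for a `now`-based lower bound resolved to a concrete timestamp."""
    if round_to_minute:
        now = now.replace(second=0, microsecond=0)  # analogous to now/m
    return f"{field}>={now.isoformat()}"


# Two searches issued 36 seconds apart within the same minute.
first = datetime(2024, 1, 1, 12, 30, 5, 123000, tzinfo=timezone.utc)
second = datetime(2024, 1, 1, 12, 30, 41, 987000, tzinfo=timezone.utc)

unrounded_same = (resolved_range_key("@timestamp", first, False)
                  == resolved_range_key("@timestamp", second, False))
rounded_same = (resolved_range_key("@timestamp", first, True)
                == resolved_range_key("@timestamp", second, True))
print(unrounded_same, rounded_same)  # prints: False True

# A filter clause written with rounded date math, as a request-body fragment.
rounded_filter = {"range": {"@timestamp": {"gte": "now-1h/m", "lt": "now/m"}}}
```

With rounding, every search in the same minute resolves to the same bounds, so the filter key can recur; without it, each search produces a key that is never seen again.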
Frequently Asked Questions
Q: Is the Elasticsearch query cache enabled by default?
A: Yes. The node query cache is enabled by default, sized at 10% of JVM heap. Individual indices can disable it via index.queries.cache.enabled: false, but this is rarely necessary.
Q: What is the difference between the query cache and the request cache?
A: The query cache stores filter clause bitsets per segment, reusable across different queries that share the same filter. The shard request cache stores entire query results for size: 0 and aggregation requests. They're complementary, not interchangeable.
Q: How do I check Elasticsearch query cache hit ratio?
A: GET /_nodes/stats/indices/query_cache returns hit_count and miss_count per node. Hit ratio = hit / (hit + miss). A healthy ratio is typically 30-70% depending on workload; <20% suggests the cache isn't helping much.
Q: How much memory does the Elasticsearch query cache use?
A: Up to indices.queries.cache.size (default 10% of JVM heap), capped by indices.queries.cache.count (default 10,000 entries). The cache uses heap memory, so size matters when the heap is also serving fielddata, aggregations, and active queries.
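As a back-of-the-envelope check, the size ceiling is just a percentage of the configured heap. The helper below is a hypothetical sketch, and the 8 GiB heap is an assumed example rather than a recommendation.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def query_cache_budget_bytes(heap_bytes, cache_size_percent=10):
    """Upper bound on query cache memory for a given heap and size percentage."""
    return heap_bytes * cache_size_percent // 100


default_budget = query_cache_budget_bytes(8 * GIB)     # default 10%
raised_budget = query_cache_budget_bytes(8 * GIB, 15)  # after raising the setting to 15%

print(round(default_budget / MIB, 1))  # about 819.2 MiB
print(round(raised_budget / MIB, 1))   # about 1228.8 MiB
```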
Q: Why do my filters not get cached in Elasticsearch?
A: Three common reasons: (1) the segment is smaller than 10,000 docs or less than 3% of the index, (2) the filter is too cheap (Elasticsearch's heuristics skip cheap filters), or (3) the filter is non-deterministic (e.g., range with now). Round time filters to a coarser boundary like now/m to make them cacheable.
Q: When should I disable the Elasticsearch query cache?
A: Almost never. The default heuristics avoid caching unprofitable filters. Disable per-index only if you've measured that the cache is consuming heap without delivering hits - usually a sign of low filter reuse, not a cache problem to fix by disabling.
Related Reading
- What is Elasticsearch Refresh Interval: refresh invalidates segments and their cache entries
- What is Elasticsearch Fielddata: the other on-heap cache
- What is Elasticsearch Node: heap sizing affects cache capacity
- What is Elasticsearch Mapping: mapping shapes filter behavior
- Elasticsearch Aggregation Types: aggregations interact with the request cache