index.requests.cache.enable controls whether the shard request cache is active for an index. The shard request cache stores the local results of search requests on each shard - specifically the aggregation results, the total hit count, and related metadata - and returns the cached value when an identical request is repeated. By default, only requests with size: 0 (no document hits) are cached, which makes it primarily an accelerator for Kibana dashboards and other aggregation workloads.
- Default: true
- Scope: Per-index, dynamic
- Possible values: true, false
- Per-request override: ?request_cache=true|false
How the Shard Request Cache Works
When a search arrives at a shard, Elasticsearch hashes the request body and checks whether it has a cached result for that shard. On hit, the cached result is returned without re-executing the query. On miss, the query runs normally and the result is cached if eligible.
The cache is invalidated whenever the shard is refreshed and its contents have changed - i.e. when new data becomes searchable. With the default refresh interval of 1 second, entries on actively ingested indices are discarded almost as soon as they are created, which limits the cache's usefulness there. On indices that are rarely refreshed (cold tier, time-series indices that no longer receive writes), hit rates can exceed 90%.
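The lookup and invalidation behavior described above can be modeled in a few lines. This is a toy sketch, not Elasticsearch's implementation: the class name, the `generation` counter, the entry-count limit, and the choice to hash a key-sorted JSON serialization are all assumptions made for illustration.

```python
import hashlib
import json
from collections import OrderedDict


class ToyShardRequestCache:
    """Toy model of a per-shard request cache (illustrative only)."""

    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> cached result, oldest first
        self.generation = 0           # bumped when a refresh exposes new data
        self.hits = 0
        self.misses = 0

    def _key(self, body):
        # Sorting keys is a simplification so that equal bodies hash equally.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return (self.generation, hashlib.sha256(payload).hexdigest())

    def search(self, body, execute):
        key = self._key(body)
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key]
        self.misses += 1
        result = execute(body)                # run the query for real
        self.entries[key] = result
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used
        return result

    def refresh(self, data_changed=True):
        # A refresh that exposes new data invalidates every older entry.
        if data_changed:
            self.generation += 1
            self.entries.clear()
```

Calling `search` twice with the same body executes the query once; after `refresh()` the next identical call is a miss again, which is why frequent refreshes on a busy index keep the hit rate low.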
Eligibility rules:
- Request must have size: 0 (no hits) or be a _count request
- Request body must be deterministically hashable - dynamic dates (now) prevent caching unless rounded
- The request_cache=true URL parameter forces caching for requests with size > 0, but you are then responsible for confirming that the cached results are acceptable
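As a rough illustration, the rules listed above can be written as a single predicate. This is a sketch of the rules as stated here, not Elasticsearch's actual decision logic: the function name and parameters are hypothetical, the `now` detection is a naive string heuristic that would also flag a text value that happens to be "now", and _count requests are not modeled.

```python
import json
import re

# Matches "now" or "now-15m" style values that carry no "/unit" rounding.
UNROUNDED_NOW = re.compile(r'"now(?:[+-]\d+[a-zA-Z])*"')


def is_cacheable(body, index_cache_enabled=True, request_cache=None):
    """Apply the eligibility rules to one search body (a dict).

    request_cache mirrors the URL parameter: None means "not set".
    """
    if UNROUNDED_NOW.search(json.dumps(body)):
        return False  # dynamic date: the key would never repeat
    if request_cache is False:
        return False  # explicit per-request opt-out
    if request_cache is True:
        return True   # explicit opt-in, regardless of size
    # No per-request override: the index setting and the size: 0 rule apply.
    # A search body without "size" returns 10 hits by default.
    return index_cache_enabled and body.get("size", 10) == 0
```

For example, `is_cacheable({"size": 0, "aggs": {...}})` would be true under these rules, while the same body with a `"gte": "now-15m"` range would not.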
Configuring index.requests.cache.enable
Disable per-index:
PUT /my_index/_settings
{
"index.requests.cache.enable": false
}
Override per request:
GET /my_index/_search?request_cache=false
{ "size": 0, "aggs": { ... } }
Inspect cache stats:
GET /my_index/_stats/request_cache
Returns memory_size_in_bytes, evictions, hit_count, miss_count.
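A small helper can pull these counters out of the stats response once it has been parsed into a dict. The sketch below assumes the usual index stats layout (`indices.<name>.total.request_cache`); the function name is hypothetical and the sample numbers are invented purely for illustration.

```python
FIELDS = ("memory_size_in_bytes", "evictions", "hit_count", "miss_count")


def request_cache_counters(stats):
    """Return {index_name: {field: value}} from a parsed stats response."""
    counters = {}
    for name, index_stats in stats.get("indices", {}).items():
        cache = index_stats["total"]["request_cache"]
        counters[name] = {field: cache[field] for field in FIELDS}
    return counters


# Illustrative response fragment; the values are made up.
SAMPLE_STATS = {
    "indices": {
        "my_index": {
            "total": {
                "request_cache": {
                    "memory_size_in_bytes": 1048576,
                    "evictions": 12,
                    "hit_count": 900,
                    "miss_count": 100,
                }
            }
        }
    }
}
```

`request_cache_counters(SAMPLE_STATS)["my_index"]` then yields the four counters for that index, ready to be logged or compared over time.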
When to Disable the Cache
| Scenario | Recommendation |
|---|---|
| Dashboards over time-series indices | Keep enabled - high hit rates |
| Indices with rapidly-changing data and varied queries | Consider disabling - low hit rate, memory wasted |
| Heavy per-tenant query variation (no repetition) | Disable - cache memory better spent on field data |
| size > 0 searches with dynamic now-based ranges | Disable or use rounded dates |
The cache is bounded by indices.requests.cache.size (default 1% of heap). Once the cache fills, LRU eviction begins. A low hit rate with many evictions is a sign the cache should be disabled or sized larger.
Common Pitfalls
- Using now in date range queries. now resolves to the current millisecond, so every query hashes to a different key and never hits the cache. Round to now/m or now/h for caching to work.
- Forcing caching with request_cache=true on size > 0 queries without realizing the cached "hits" are frozen at refresh time.
- Expecting cache hits during heavy indexing. Default 1-second refresh invalidates the cache continuously.
- Disabling the cache to "save memory" without checking the actual hit rate. On dashboard-heavy clusters the cache often pays for itself many times over.
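An alternative to date-math rounding is to round in the client and send absolute timestamps, so the request body contains no now at all and stays byte-identical for the length of the rounding bucket. The following is a minimal sketch of that idea; the function name, its parameters, and the @timestamp field are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rounded_range(field, lookback, bucket, now=None):
    """Build a range clause whose bounds change only once per `bucket`."""
    now = now or datetime.now(timezone.utc)
    # Floor the current time to the start of its bucket.
    upper = now - ((now - EPOCH) % bucket)
    lower = upper - lookback
    return {"range": {field: {"gte": lower.isoformat(), "lt": upper.isoformat()}}}


# Every call made within the same hour produces an identical clause.
clause = rounded_range("@timestamp", timedelta(hours=24), timedelta(hours=1))
```

The trade-off is the same as with now/h: results lag by up to one bucket, in exchange for request bodies that repeat and can therefore be served from the cache.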
Monitoring Cache Effectiveness
GET /_cat/nodes?v&h=name,request_cache.memory_size,request_cache.evictions,request_cache.hit_count,request_cache.miss_count
Hit rate = hit_count / (hit_count + miss_count). The _cat/nodes output is aggregated per node; use the index stats API for per-index figures. A hit rate below 20% where you expected caching usually points to dynamic now-based queries or frequent refreshes.
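The hit-rate formula is simple enough to apply directly to the counters. The sketch below uses the 20% figure from this section as a default threshold; the function names are hypothetical, and returning None when there have been no lookups is a design choice of this example.

```python
def hit_rate(hit_count, miss_count):
    """Return hit_count / (hit_count + miss_count), or None with no lookups."""
    total = hit_count + miss_count
    if total == 0:
        return None
    return hit_count / total


def looks_ineffective(hit_count, miss_count, threshold=0.20):
    """Flag a cache whose hit rate falls below the threshold."""
    rate = hit_rate(hit_count, miss_count)
    return rate is not None and rate < threshold
```

With 10 hits and 90 misses the rate is 0.1 and the cache is flagged; with 30 hits and 70 misses the rate is 0.3 and it is not.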
Pulse tracks shard request cache hit rate per index, evictions, and the memory pressure on the cache, and surfaces indices where disabling caching would free heap for hot paths - or where rounding query date ranges would unlock a 10x latency win on dashboards.
Frequently Asked Questions
Q: What does the Elasticsearch shard request cache do?
A: The shard request cache stores the per-shard result of search requests - aggregations, hit counts, metadata - and returns the cached value when an identical request is repeated. It's enabled by default and primarily benefits dashboard-style aggregation workloads.
Q: Why is my request cache hit rate so low?
A: Three usual reasons: (1) queries use now without rounding, so every request hashes to a unique key, (2) the index refreshes often (default 1 second) and invalidates entries, or (3) the cache is undersized and entries evict before being reused.
Q: How do I disable the request cache for a specific query?
A: Add ?request_cache=false to the request URL. This overrides the index-level setting for that one call, which is useful for test queries or for queries where stale data is unacceptable.
Q: Does the request cache work with size > 0 queries?
A: Not by default. The cache only stores aggregation and count results from size: 0 (or _count) requests. You can force caching of size > 0 queries with ?request_cache=true, but cached hits become stale at the next refresh.
Q: How large is the Elasticsearch request cache?
A: It defaults to 1% of the JVM heap. The limit applies per node and is shared by all shards on that node. Adjust it via indices.requests.cache.size in elasticsearch.yml, and monitor evictions to confirm the sizing is adequate.
Q: Is the request cache invalidated on every refresh?
A: Yes, whenever the refresh exposes changed data. When a shard refreshes (every 1 second by default on an actively written index), its cache entries are invalidated. Indices that refresh less often, or that no longer receive writes, get more value from the cache.