Elasticsearch index.requests.cache.enable Setting

index.requests.cache.enable controls whether the shard request cache is active for an index. The shard request cache stores the local results of search requests on each shard - specifically, the aggregation result, the hit count, and other metadata - and returns the cached value when an identical request is repeated. By default, only requests with size: 0 (no document hits) are cached, which makes the cache primarily an accelerator for aggregation-heavy dashboards such as Kibana's.

  • Default: true
  • Scope: Per-index, dynamic
  • Possible values: true, false
  • Per-request override: ?request_cache=true|false

How the Shard Request Cache Works

When a search arrives at a shard, Elasticsearch hashes the request body and checks whether it has a cached result for that shard. On hit, the cached result is returned without re-executing the query. On miss, the query runs normally and the result is cached if eligible.

The cache is invalidated whenever the shard is refreshed and the refresh exposes changed data - i.e. when new data becomes searchable. With the default refresh interval of 1 second, entries on actively-ingested indices are discarded almost as soon as they are stored, which limits the cache's usefulness there. On indices that are rarely refreshed (cold tier, older time-series indices that no longer receive writes), entries live much longer and hit rates can exceed 90%.
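The lookup and invalidation behaviour described above can be sketched as a toy model. This is not Elasticsearch code: the class name, the SHA-256 key, and the run_query placeholder are assumptions made for illustration only.

```python
import hashlib
import json


class ToyShardRequestCache:
    """Simplified stand-in for one shard's request cache (illustrative only)."""

    def __init__(self):
        self._entries = {}
        self.hit_count = 0
        self.miss_count = 0

    @staticmethod
    def _key(request_body):
        # Hash the serialized body; no attempt is made to canonicalize key order.
        serialized = json.dumps(request_body)
        return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    def search(self, request_body, execute):
        key = self._key(request_body)
        if key in self._entries:
            self.hit_count += 1
            return self._entries[key]
        self.miss_count += 1
        result = execute(request_body)  # run the query on a miss
        self._entries[key] = result
        return result

    def refresh(self):
        # Model a refresh that makes new data searchable: drop every entry.
        self._entries.clear()


def run_query(_body):
    # Hypothetical placeholder for real query execution on the shard.
    return {"doc_count": 42}


cache = ToyShardRequestCache()
body = {"size": 0, "aggs": {"by_status": {"terms": {"field": "status"}}}}

cache.search(body, run_query)  # miss: the query executes
cache.search(body, run_query)  # hit: served from the cache
cache.refresh()                # new data became searchable
cache.search(body, run_query)  # miss again after invalidation
print(cache.hit_count, cache.miss_count)
```

Only the middle request is served from the cache; the request after the refresh has to run the query again.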

Eligibility rules:

  • Request must have size: 0 (no hits) or be a _count request
  • Request body must be deterministically hashable - dynamic dates (now) prevent caching unless rounded
  • The request_cache=true URL parameter forces caching for requests with size > 0, but the caller is then responsible for the correctness of the results
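As a rough sketch, the size and override rules in this list can be written as a small decision function. The function name and parameters are hypothetical, and the now-based rule is deliberately left out because it depends on how the query body is written.

```python
def would_use_request_cache(size, request_cache=None, index_cache_enabled=True):
    """Apply the size and override rules listed above (illustrative only).

    size: the request's "size" value (0 means no hits are returned).
    request_cache: value of the ?request_cache URL parameter, or None if omitted.
    index_cache_enabled: value of index.requests.cache.enable for the index.
    """
    if request_cache is not None:
        # The per-request parameter overrides the index-level setting.
        return request_cache
    # Without an override, only size: 0 requests on a cache-enabled index qualify.
    return index_cache_enabled and size == 0


print(would_use_request_cache(size=0))                       # default dashboard-style request
print(would_use_request_cache(size=10))                      # hits requested, no override
print(would_use_request_cache(size=10, request_cache=True))  # caching forced per request
```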

Configuring index.requests.cache.enable

Disable per-index:

PUT /my_index/_settings
{
  "index.requests.cache.enable": false
}

Override per request:

GET /my_index/_search?request_cache=false
{ "size": 0, "aggs": { ... } }

Inspect cache stats:

GET /my_index/_stats/request_cache

The response reports memory_size_in_bytes, evictions, hit_count, and miss_count for the request cache.
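A minimal sketch of pulling those four fields out of the parsed JSON response. It assumes the response nests per-index figures under indices.<index>.total.request_cache; the helper name and the sample numbers are made up for illustration.

```python
FIELDS = ("memory_size_in_bytes", "evictions", "hit_count", "miss_count")


def request_cache_stats(stats_response, index_name):
    """Extract the request cache counters for one index (assumed response shape)."""
    section = stats_response["indices"][index_name]["total"]["request_cache"]
    return {field: section[field] for field in FIELDS}


# Made-up sample standing in for a parsed GET /my_index/_stats/request_cache response.
sample_response = {
    "indices": {
        "my_index": {
            "total": {
                "request_cache": {
                    "memory_size_in_bytes": 1048576,
                    "evictions": 3,
                    "hit_count": 900,
                    "miss_count": 100,
                }
            }
        }
    }
}

stats = request_cache_stats(sample_response, "my_index")
print(stats)
```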

When to Disable the Cache

Scenario | Recommendation
Dashboards over time-series indices | Keep enabled - high hit rates
Indices with rapidly-changing data and varied queries | Consider disabling - low hit rate, memory wasted
Heavy per-tenant query variation (no repetition) | Disable - cache memory better spent on field data
size > 0 searches with dynamic now-based ranges | Disable or use rounded dates

The cache is bounded by indices.requests.cache.size (default 1% of heap). Once the cache fills, LRU eviction begins. A low hit rate with many evictions is a sign the cache should be disabled or sized larger.
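To put the 1% default in concrete terms, the following sketch computes the cache budget for a given heap size. The function name and the 8 GiB heap are illustrative assumptions, not recommendations.

```python
def request_cache_budget_bytes(heap_bytes, percent=1):
    """Cache budget as a whole-number percentage of the heap (integer arithmetic)."""
    return heap_bytes * percent // 100


heap_bytes = 8 * 1024**3  # hypothetical 8 GiB JVM heap
budget = request_cache_budget_bytes(heap_bytes)
print(budget)                       # bytes available to the request cache
print(round(budget / 1024**2, 1))   # the same budget in MiB
```

On an 8 GiB heap the default works out to roughly 82 MiB, which is the figure to compare against memory_size_in_bytes and the eviction count.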

Common Pitfalls

  1. Using now in date range queries. An unrounded now resolves to the current millisecond, so no two requests are ever identical and the cache is never hit. Round to now/m or now/h for caching to work.
  2. Forcing caching with request_cache=true on size > 0 queries without realizing the cached "hits" are frozen at refresh time.
  3. Expecting cache hits during heavy indexing. Default 1-second refresh invalidates the cache continuously.
  4. Disabling the cache to "save memory" without checking the actual hit rate. On dashboard-heavy clusters the cache often pays for itself many times over.
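The first pitfall can be illustrated with a simplified model in which the client resolves the time range itself and the cache key is a hash of the resulting body. The @timestamp field name, the helper names, and the SHA-256 key are assumptions for illustration; Elasticsearch's internal handling of now is not reproduced here.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def cache_key(range_start, range_end):
    """Hash a size: 0 range query over the given window (toy cache key)."""
    body = {
        "size": 0,
        "query": {"range": {"@timestamp": {
            "gte": range_start.isoformat(),
            "lt": range_end.isoformat(),
        }}},
    }
    return hashlib.sha256(json.dumps(body).encode("utf-8")).hexdigest()


def last_hour(now, round_to_minute):
    """Return the (start, end) of the last hour, optionally rounding now down."""
    if round_to_minute:
        now = now.replace(second=0, microsecond=0)  # analogous to now/m
    return now - timedelta(hours=1), now


first = datetime(2024, 1, 1, 12, 0, 5, 123000, tzinfo=timezone.utc)
second = first + timedelta(seconds=20)  # same dashboard query, 20 seconds later

unrounded_match = cache_key(*last_hour(first, False)) == cache_key(*last_hour(second, False))
rounded_match = cache_key(*last_hour(first, True)) == cache_key(*last_hour(second, True))
print(unrounded_match, rounded_match)
```

With rounding, both requests fall in the same minute and produce the same key; without it, the two bodies differ and the second request cannot reuse the first result.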

Monitoring Cache Effectiveness

GET /_cat/nodes?v&h=name,request_cache.memory_size,request_cache.evictions,request_cache.hit_count,request_cache.miss_count

Hit rate = hit_count / (hit_count + miss_count). The _cat/nodes figures are per node; use GET /my_index/_stats/request_cache for per-index numbers. A hit rate below 20% where you expected caching often points to unrounded now-based queries or frequent refreshes.
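The hit-rate formula can be applied to the node-level output with a short script. This sketch assumes the request above was made with format=json added, that each row is keyed by the requested column names, and that counters arrive as strings; the node names and numbers are made up.

```python
def hit_rate(hit_count, miss_count):
    """hit_count / (hit_count + miss_count), or None when there is no traffic."""
    total = hit_count + miss_count
    return hit_count / total if total else None


def low_hit_rate_nodes(cat_rows, threshold=0.20):
    """Return (node name, hit rate) pairs whose hit rate is below the threshold."""
    flagged = []
    for row in cat_rows:
        rate = hit_rate(
            int(row["request_cache.hit_count"]),
            int(row["request_cache.miss_count"]),
        )
        if rate is not None and rate < threshold:
            flagged.append((row["name"], rate))
    return flagged


# Made-up rows in the assumed shape of the JSON cat output.
rows = [
    {"name": "node-a", "request_cache.hit_count": "900", "request_cache.miss_count": "100"},
    {"name": "node-b", "request_cache.hit_count": "10", "request_cache.miss_count": "90"},
    {"name": "node-c", "request_cache.hit_count": "0", "request_cache.miss_count": "0"},
]

print(low_hit_rate_nodes(rows))
```

Nodes with no cache traffic are skipped rather than reported as 0%, since an idle cache says nothing about query patterns.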

Pulse tracks shard request cache hit rate per index, evictions, and the memory pressure on the cache, and surfaces indices where disabling caching would free heap for hot paths - or where rounding query date ranges would unlock a 10x latency win on dashboards.

Frequently Asked Questions

Q: What does the Elasticsearch shard request cache do?
A: The shard request cache stores the per-shard result of search requests - aggregations, hit counts, metadata - and returns the cached value when an identical request is repeated. It's enabled by default and primarily benefits dashboard-style aggregation workloads.

Q: Why is my request cache hit rate so low?
A: Three usual reasons: (1) queries use now without rounding, so every request hashes to a unique key, (2) the index refreshes often (default 1 second) and invalidates entries, or (3) the cache is undersized and entries evict before being reused.

Q: How do I disable the request cache for a specific query?
A: Add ?request_cache=false to the request URL. This overrides the index-level setting for that one call. Useful when running test queries or queries where stale data is unacceptable.

Q: Does the request cache work with size > 0 queries?
A: Not by default. The cache only stores aggregation and count results from size: 0 (or _count) requests. You can force caching of size > 0 queries with ?request_cache=true; like any other entry, the cached hits are only valid until the next refresh invalidates them.

Q: How large is the Elasticsearch request cache?
A: It defaults to 1% of the JVM heap on each node and is shared by all shards on that node. Adjust via indices.requests.cache.size in elasticsearch.yml. Monitor evictions to confirm sizing is adequate.

Q: Is the request cache invalidated on every refresh?
A: Yes, on every refresh that makes changed data searchable. When such a refresh happens (by default every 1 second on an actively-written index), the shard's cache entries are invalidated. Indices with longer refresh intervals, or that no longer receive writes, get more value from the cache.
