Elasticsearch JVM Heap Pressure High (Above 85%)

High JVM heap pressure in Elasticsearch occurs when a node's heap usage stays above safe thresholds for a sustained period. When heap usage rises above 85%, immediate action is required to prevent cluster instability, degraded indexing and search performance, and eventual OutOfMemoryError crashes.

Understanding Heap Pressure Thresholds

Heap Usage   Status      Action Required
< 75%        Healthy     Normal operation
75-85%       Warning     Monitor closely, consider optimization
> 85%        Critical    Immediate action required
> 95%        Emergency   Circuit breakers will trigger

Checking Current Heap Pressure

Using the Nodes Stats API

GET /_nodes/stats/jvm

Look for:

  • jvm.mem.heap_used_percent - current heap usage percentage
  • jvm.mem.heap_used_in_bytes - actual bytes used
  • jvm.gc - garbage collection statistics
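
To pull just those fields for every node, a filter_path keeps the response compact:

GET /_nodes/stats/jvm?filter_path=nodes.*.name,nodes.*.jvm.mem.heap_used_percent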

Using the Cat Nodes API

Quick overview of heap usage across all nodes:

GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max,cpu,load_1m
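
Illustrative output (values invented for the example):

name   heap.percent heap.current heap.max cpu load_1m
node-1           78       12.4gb     16gb  42    3.21
node-2           91       14.5gb     16gb  67    5.80

Any node reporting heap.percent above 85 needs attention.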

Calculating Memory Pressure

For detailed old generation pool analysis:

GET /_nodes/stats?filter_path=nodes.*.jvm.mem.pools.old

Memory pressure is calculated as: (used_in_bytes / max_in_bytes) * 100
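
For example, a trimmed response for a node with a 16 GB old pool (numbers invented for the example):

{
  "nodes": {
    "aBcD1234": {
      "jvm": {
        "mem": {
          "pools": {
            "old": {
              "used_in_bytes": 13743895347,
              "max_in_bytes": 17179869184
            }
          }
        }
      }
    }
  }
}

gives (13743895347 / 17179869184) * 100 ≈ 80%, already in the warning band.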

Root Causes of High Heap Pressure

1. Too Many Shards

Every shard consumes heap memory for metadata and segment information. Excessive shards are the most common cause of sustained high heap pressure.

Diagnosis (the shard count is easiest to take from a shell, since console syntax cannot pipe into wc; cluster stats reports totals directly):

curl -s "localhost:9200/_cat/shards" | wc -l
GET /_cluster/stats?filter_path=indices.shards

Solution: Aim for fewer, larger shards (10-50 GB each).
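
One way to consolidate over-sharded indices is the shrink API. A sketch, assuming a hypothetical index logs-2024-01 and data node data-node-1 (the index must first be made read-only with all shard copies relocated to one node):

PUT /logs-2024-01/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "data-node-1",
    "index.blocks.write": true
  }
}

POST /logs-2024-01/_shrink/logs-2024-01-single
{
  "settings": {
    "index.number_of_shards": 1
  }
}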

2. Large Aggregations

Aggregations with high cardinality or large bucket sizes consume significant heap.

Solution:

  • Reduce size parameter in aggregations
  • Use composite aggregation for high-cardinality fields (sketch after this list)
  • Avoid aggregating on text fields (use keyword instead)
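
A minimal composite aggregation sketch, with a hypothetical index and field; it pages through buckets in fixed-size chunks instead of building them all in heap at once:

GET /logs-2024-01/_search
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "composite": {
        "size": 1000,
        "sources": [
          { "user": { "terms": { "field": "user.id" } } }
        ]
      }
    }
  }
}

Each response carries an after_key; pass it back as after in the next request to fetch the following page.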

3. Fielddata Usage

Fielddata for text fields is very memory-intensive.

Diagnosis:

GET /_cat/fielddata?v

Solution:

  • Avoid fielddata: true on text fields
  • Use keyword fields for sorting and aggregations (mapping sketch below)
  • Clear fielddata cache if needed:
POST /_cache/clear?fielddata=true
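
A multi-field mapping sketch (hypothetical index and field names) that keeps full-text search on the text field while sorting and aggregating on a keyword sub-field backed by on-disk doc values:

PUT /logs-2024-01
{
  "mappings": {
    "properties": {
      "message": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}

Aggregations then target message.raw and never touch heap-resident fielddata.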

4. Large Bulk Requests

Oversized bulk requests create temporary memory pressure.

Solution:

  • Keep bulk requests between 5-15 MB
  • Reduce concurrent bulk indexing clients
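
To confirm bulk traffic is the culprit, rejections on the write thread pool are a useful signal:

GET /_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected

A steadily climbing rejected count means clients are pushing more than the node can absorb.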

5. Expensive Queries

Queries with large result sets or complex operations spike memory usage.

Solution:

  • Limit size parameter in searches
  • Use search_after for pagination (sketch after this list)
  • Set query timeouts
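
A sketch combining these points, with a hypothetical index, fields, and sort values: a bounded page size, a query timeout, and a deterministic sort whose values from the last hit feed search_after on the next page (the first page simply omits search_after):

GET /logs-2024-01/_search
{
  "size": 100,
  "timeout": "5s",
  "sort": [
    { "@timestamp": "asc" },
    { "event.id": "asc" }
  ],
  "search_after": [1705312800000, "evt-000123"]
}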

Immediate Actions for High Heap Pressure

Step 1: Identify the Cause

Check what's consuming memory:

GET /_nodes/stats/indices/fielddata?fields=*
GET /_nodes/stats/indices/query_cache
GET /_nodes/stats/indices/request_cache

Step 2: Clear Caches (Temporary Relief)

POST /_cache/clear

For specific caches:

POST /_cache/clear?fielddata=true
POST /_cache/clear?query=true
POST /_cache/clear?request=true

Step 3: Cancel Resource-Intensive Tasks

Identify and cancel problematic operations:

GET /_tasks?detailed=true&group_by=parents
POST /_tasks/{task_id}/_cancel
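
Tasks can also be filtered and cancelled by action name rather than individual IDs, which is handy when many searches are implicated:

GET /_tasks?detailed=true&actions=*search*
POST /_tasks/_cancel?actions=*search*

Use the cancel-by-action form with care, as it affects every matching task.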

Step 4: Reduce Load

  • Temporarily reduce indexing rate
  • Add query throttling
  • Redirect traffic from affected nodes

Long-Term Solutions

Optimize Shard Configuration

Reduce total shard count by:

  • Increasing shard size (target 10-50 GB)
  • Using ILM to roll over and delete old indices (policy sketch below)
  • Merging small indices
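
A sketch of such an ILM policy (name, sizes, and ages are illustrative; max_primary_shard_size requires Elasticsearch 7.12+, older versions can use max_size):

PUT /_ilm/policy/logs-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "7d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}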

Scale the Cluster

  • Vertical scaling: Increase heap size (up to 32 GB max)
  • Horizontal scaling: Add more data nodes

Heap Sizing Best Practices

Important: Set heap to no more than 50% of available RAM, leaving the rest for the operating system's filesystem cache that Lucene depends on, and never above 32 GB, so the JVM can keep using compressed object pointers. Always set -Xms and -Xmx to the same value.

# In jvm.options.d/custom.options
-Xms16g
-Xmx16g
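
After a restart, verify the settings took effect and that compressed ordinary object pointers are still in use:

GET /_nodes/jvm?filter_path=nodes.*.jvm.mem.heap_max_in_bytes,nodes.*.jvm.using_compressed_ordinary_object_pointers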

Monitor and Alert

Set up alerts for:

  • Heap usage > 75% (warning)
  • Heap usage > 85% (critical)
  • GC time > 5% of total time

Monitoring Garbage Collection

High heap pressure causes frequent and long GC pauses:

GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.gc

Watch for:

  • collection_count - increasing rapidly indicates pressure
  • collection_time_in_millis - long GC times impact performance
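
Both counters are cumulative, so the useful signal is their rate of change. For example, if old-generation collection_time_in_millis rises from 812,000 to 816,000 over a 60-second window, the node spent 4,000 ms of 60,000 ms in garbage collection, roughly 6.7%, above the 5% alert threshold suggested earlier.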