Elasticsearch Heap, CPU, and Shard Tuning Best Practices

Optimizing heap, CPU, and shard settings together is essential for a well-performing Elasticsearch cluster. This guide provides integrated best practices for these three critical areas.

Heap Tuning

The Golden Rules

Rule 1: Set the heap to about half of system RAM, and never above ~31 GB so the JVM keeps using compressed object pointers.

Rule 2: Set minimum and maximum heap to the same value.

Heap Sizing Table

Server RAM    Recommended Heap    Filesystem Cache
16 GB         8 GB                8 GB
32 GB         16 GB               16 GB
64 GB         31 GB               33 GB
128 GB        31 GB               97 GB

Configuration

# /etc/elasticsearch/jvm.options.d/heap.options
-Xms16g
-Xmx16g

Heap Tuning Checklist

  • Heap ≤ 50% of RAM
  • Heap ≤ 31 GB (for compressed oops)
  • Min heap = Max heap
  • Remaining RAM available for filesystem cache
  • Memory lock enabled (bootstrap.memory_lock: true)
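
To confirm the memory lock actually took effect after a restart, check the mlockall flag reported by the nodes info API (standard API, no extra setup assumed):

GET /_nodes?filter_path=**.mlockall

# Every node should report "mlockall": true; false usually means the
# elasticsearch user is missing the memlock ulimit.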

Monitor Heap Health

GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max
GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.gc

Healthy indicators:

  • Heap usage < 75% normally
  • Heap usage < 85% during peaks
  • GC time < 5% of total time
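
One way to estimate GC overhead (an illustrative check, not part of the checklist above) is to compare cumulative collection time against JVM uptime from the nodes stats API:

GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.uptime_in_millis,nodes.*.jvm.gc.collectors.*.collection_time_in_millis

# GC overhead ≈ total collection_time_in_millis / uptime_in_millis;
# anything above roughly 0.05 (5%) points to heap pressure.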

CPU Tuning

CPU Requirements

Workload    CPU per Node    Notes
Light       4-8 cores       Basic search/indexing
Medium      8-16 cores      Mixed workloads
Heavy       16-32 cores     Complex queries, aggregations

Thread Pool Configuration

For search-heavy workloads:

# elasticsearch.yml
thread_pool.search.size: 25  # Default: (# of processors * 3) / 2 + 1
thread_pool.search.queue_size: 1000

For write-heavy workloads:

thread_pool.write.size: 16
thread_pool.write.queue_size: 2000

Monitor CPU Usage

GET /_cat/nodes?v&h=name,cpu,load_1m,load_5m,load_15m
GET /_nodes/hot_threads
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected

CPU Optimization Strategies

For high CPU usage:

  1. Optimize queries (avoid wildcards, deep pagination)
  2. Reduce aggregation complexity
  3. Add more data nodes
  4. Implement query caching
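
As a quick sketch of point 4, the shard request cache can be enabled explicitly for aggregation-only searches (the index and field names below are placeholders; only size: 0 requests are eligible for this cache):

GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "status_counts": {
      "terms": { "field": "status.keyword" }
    }
  }
}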

For search-heavy loads:

  1. Add replicas to distribute queries
  2. Add coordinating nodes
  3. Implement application-level caching

For indexing-heavy loads:

  1. Increase refresh interval
  2. Reduce replica count during bulk loading
  3. Use bulk API effectively
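
The first two adjustments can be applied per index with the update settings API; the index name here is a placeholder, and the values should be reverted once the bulk load completes:

# Before bulk loading
PUT /my-logs-index/_settings
{
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": 0
  }
}

# After bulk loading
PUT /my-logs-index/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": 1
  }
}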

Shard Tuning

Optimal Shard Size

Guideline: 10-50 GB per shard is usually optimal.

Workload Type     Recommended Shard Size
Search-focused    10-30 GB
Logging           30-50 GB
Analytics         30-50 GB

Shard Count Guidelines

Primary Shards = Total Data / Target Shard Size
Total Shards = Primary Shards × (1 + Replicas)
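
For example, 600 GB of data with a 30 GB target shard size works out to 20 primary shards, or 40 total shards with one replica.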

Limits:

  • < 1000 shards per node (soft limit)
  • Each shard uses ~1-10 MB heap for metadata
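
The per-node soft limit is enforced through the cluster.max_shards_per_node setting (default 1000); you can inspect it, and raise it only as a last resort:

GET /_cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node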

Check Shard Health

GET /_cluster/stats?filter_path=indices.shards
GET /_cat/shards?v&h=index,shard,prirep,store&s=store:desc
GET /_cat/allocation?v

Shard Optimization Strategies

For too many shards:

  1. Shrink existing indices
  2. Adjust index templates for fewer shards
  3. Use ILM to rollover based on size
  4. Delete old indices
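
A sketch of option 1 using the shrink API (index and node names are placeholders; the source index must be read-only with a copy of every shard on one node before shrinking):

# Prepare the source index
PUT /my-old-index/_settings
{
  "index.routing.allocation.require._name": "data-node-1",
  "index.blocks.write": true
}

# Shrink into a new single-shard index and clear the temporary settings
POST /my-old-index/_shrink/my-old-index-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}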

For unbalanced shards:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "all",
    "cluster.routing.allocation.balance.shard": 0.45
  }
}

Integrated Tuning Approach

Step 1: Assess Current State

# Comprehensive health check
GET /_cluster/health?pretty
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards
GET /_cluster/stats?filter_path=indices.shards.total

Step 2: Identify Bottleneck

Symptom            Likely Bottleneck               Solution Focus
High heap usage    Too many shards or fielddata    Shard reduction
High CPU           Complex queries or merging      Query optimization
Slow searches      Shard size or count             Shard tuning
Slow indexing      CPU or disk I/O                 Thread pools, refresh

Step 3: Apply Changes

Order of changes:

  1. Heap configuration (requires restart)
  2. Shard strategy (ILM policies, templates)
  3. Thread pool settings (elasticsearch.yml; requires restart)
  4. Query optimization (application changes)

Step 4: Validate

# Before and after comparison
GET /_nodes/stats/jvm
GET /_cat/thread_pool?v
GET /_cat/shards?v
# (to count shards from a shell: curl -s "localhost:9200/_cat/shards" | wc -l)

Configuration Reference

Production Node Configuration

# elasticsearch.yml

# Heap is configured in jvm.options.d/
# Set to 50% of RAM, max 31 GB

# Memory lock
bootstrap.memory_lock: true

# Thread pools (adjust based on cores)
thread_pool.search.size: 13
thread_pool.search.queue_size: 1000
thread_pool.write.size: 8
thread_pool.write.queue_size: 500

# Index settings via templates
# Use ILM for shard management

ILM Policy for Shard Management

PUT _ilm/policy/optimized_shards
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "40gb",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      }
    }
  }
}
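
For the policy to apply, reference it from an index template; a minimal sketch assuming alias-based rollover with a hypothetical logs-* pattern and logs write alias (with data streams, the rollover_alias setting is unnecessary):

PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "optimized_shards",
      "index.lifecycle.rollover_alias": "logs",
      "index.number_of_shards": 1,
      "index.number_of_replicas": 1
    }
  }
}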

Quick Reference

Heap Quick Facts

  • Max: 31 GB (compressed oops)
  • Target: 50% of RAM
  • Monitor: Keep < 85%
  • GC overhead: Keep < 5%

CPU Quick Facts

  • Baseline: 8+ cores per data node
  • Monitor: Keep < 80% average
  • Hot threads: Check during high CPU
  • Thread pools: Size based on cores

Shard Quick Facts

  • Size: 10-50 GB optimal
  • Per node: < 1000 shards
  • Replicas: Usually 1
  • Management: Use ILM

Monitoring Dashboard Queries

// Key metrics to track
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards

// Shard distribution
GET /_cat/allocation?v

// Thread pool health
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected&s=rejected:desc