Elasticsearch Heap, CPU, and Shard Tuning Best Practices

Optimizing heap, CPU, and shard settings together is essential for a well-performing Elasticsearch cluster. This guide provides integrated best practices for these three critical areas.

Heap Tuning

The Golden Rules

Rule 1: Set the heap to about half of system RAM, and never above ~31 GB so the JVM keeps using compressed object pointers.

Rule 2: Set minimum and maximum heap to the same value.

Heap Sizing Table

Server RAM    Recommended Heap    Filesystem Cache
16 GB         8 GB                8 GB
32 GB         16 GB               16 GB
64 GB         31 GB               33 GB
128 GB        31 GB               97 GB

Configuration

# /etc/elasticsearch/jvm.options.d/heap.options
-Xms16g
-Xmx16g

Heap Tuning Checklist

  • Heap ≤ 50% of RAM
  • Heap ≤ 31 GB (for compressed oops)
  • Min heap = Max heap
  • Remaining RAM available for filesystem cache
  • Memory lock enabled (bootstrap.memory_lock: true)
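
To confirm the memory lock actually took effect after a restart, check the mlockall flag reported by the nodes info API (standard API, no extra setup assumed):

GET /_nodes?filter_path=**.mlockall

# Every node should report "mlockall": true; false usually means the
# elasticsearch user is missing the memlock ulimit.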

Monitor Heap Health

GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max
GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.gc

Healthy indicators:

  • Heap usage < 75% normally
  • Heap usage < 85% during peaks
  • GC time < 5% of total time
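
One way to estimate GC overhead (an illustrative check, not part of the checklist above) is to compare cumulative collection time against JVM uptime from the nodes stats API:

GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.uptime_in_millis,nodes.*.jvm.gc.collectors.*.collection_time_in_millis

# GC overhead ≈ total collection_time_in_millis / uptime_in_millis;
# anything above roughly 0.05 (5%) points to heap pressure.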

CPU Tuning

CPU Requirements

Workload    CPU per Node    Notes
Light       4-8 cores       Basic search/indexing
Medium      8-16 cores      Mixed workloads
Heavy       16-32 cores     Complex queries, aggregations

Thread Pool Configuration

For search-heavy workloads:

# elasticsearch.yml
thread_pool.search.size: 25  # Default: (# of processors * 3) / 2 + 1
thread_pool.search.queue_size: 1000

For write-heavy workloads:

thread_pool.write.size: 16
thread_pool.write.queue_size: 2000

Monitor CPU Usage

GET /_cat/nodes?v&h=name,cpu,load_1m,load_5m,load_15m
GET /_nodes/hot_threads
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected

CPU Optimization Strategies

For high CPU usage:

  1. Optimize queries (avoid wildcards, deep pagination)
  2. Reduce aggregation complexity
  3. Add more data nodes
  4. Implement query caching
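
As a quick sketch of point 4, the shard request cache can be enabled explicitly for aggregation-only searches (the index and field names below are placeholders; only size: 0 requests are eligible for this cache):

GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "status_counts": {
      "terms": { "field": "status.keyword" }
    }
  }
}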

For search-heavy loads:

  1. Add replicas to distribute queries
  2. Add coordinating nodes
  3. Implement application-level caching

For indexing-heavy loads:

  1. Increase refresh interval
  2. Reduce replica count during bulk loading
  3. Use bulk API effectively
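
The first two adjustments can be applied per index with the update settings API; the index name here is a placeholder, and the values should be reverted once the bulk load completes:

# Before bulk loading
PUT /my-logs-index/_settings
{
  "index": {
    "refresh_interval": "30s",
    "number_of_replicas": 0
  }
}

# After bulk loading
PUT /my-logs-index/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": 1
  }
}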

Shard Tuning

Optimal Shard Size

Guideline: 10-50 GB per shard is usually optimal.

Workload Type     Recommended Shard Size
Search-focused    10-30 GB
Logging           30-50 GB
Analytics         30-50 GB

Shard Count Guidelines

Primary Shards = Total Data / Target Shard Size
Total Shards = Primary Shards × (1 + Replicas)
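
For example, 600 GB of data with a 30 GB target shard size works out to 20 primary shards, or 40 total shards with one replica.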

Limits:

  • < 1000 shards per node (soft limit)
  • Each shard uses ~1-10 MB heap for metadata
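
The per-node soft limit is enforced through the cluster.max_shards_per_node setting (default 1000); you can inspect it, and raise it only as a last resort:

GET /_cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node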

Check Shard Health

GET /_cluster/stats?filter_path=indices.shards
GET /_cat/shards?v&h=index,shard,prirep,store&s=store:desc
GET /_cat/allocation?v

Shard Optimization Strategies

For too many shards:

  1. Shrink existing indices
  2. Adjust index templates for fewer shards
  3. Use ILM to rollover based on size
  4. Delete old indices
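
A sketch of option 1 using the shrink API (index and node names are placeholders; the source index must be read-only with a copy of every shard on one node before shrinking):

# Prepare the source index
PUT /my-old-index/_settings
{
  "index.routing.allocation.require._name": "data-node-1",
  "index.blocks.write": true
}

# Shrink into a new single-shard index and clear the temporary settings
POST /my-old-index/_shrink/my-old-index-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}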

For unbalanced shards:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "all",
    "cluster.routing.allocation.balance.shard": 0.45
  }
}

Integrated Tuning Approach

Step 1: Assess Current State

# Comprehensive health check
GET /_cluster/health?pretty
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards
GET /_cluster/stats?filter_path=indices.shards.total

Step 2: Identify Bottleneck

Symptom            Likely Bottleneck               Solution Focus
High heap usage    Too many shards or fielddata    Shard reduction
High CPU           Complex queries or merging      Query optimization
Slow searches      Shard size or count             Shard tuning
Slow indexing      CPU or disk I/O                 Thread pools, refresh

Step 3: Apply Changes

Order of changes:

  1. Heap configuration (requires restart)
  2. Shard strategy (ILM policies, templates)
  3. Thread pool settings (elasticsearch.yml; requires restart)
  4. Query optimization (application changes)

Step 4: Validate

# Before and after comparison
GET /_nodes/stats/jvm
GET /_cat/thread_pool?v
GET /_cat/shards?v
# (to count shards from a shell: curl -s "localhost:9200/_cat/shards" | wc -l)

Configuration Reference

Production Node Configuration

# elasticsearch.yml

# Heap is configured in jvm.options.d/
# Set to 50% of RAM, max 31 GB

# Memory lock
bootstrap.memory_lock: true

# Thread pools (adjust based on cores)
thread_pool.search.size: 13
thread_pool.search.queue_size: 1000
thread_pool.write.size: 8
thread_pool.write.queue_size: 500

# Index settings via templates
# Use ILM for shard management

ILM Policy for Shard Management

PUT _ilm/policy/optimized_shards
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "40gb",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      }
    }
  }
}
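
For the policy to apply, reference it from an index template; a minimal sketch assuming alias-based rollover with a hypothetical logs-* pattern and logs write alias (with data streams, the rollover_alias setting is unnecessary):

PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "optimized_shards",
      "index.lifecycle.rollover_alias": "logs",
      "index.number_of_shards": 1,
      "index.number_of_replicas": 1
    }
  }
}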

Quick Reference

Heap Quick Facts

  • Max: 31 GB (compressed oops)
  • Target: 50% of RAM
  • Monitor: Keep < 85%
  • GC overhead: Keep < 5%

CPU Quick Facts

  • Baseline: 8+ cores per data node
  • Monitor: Keep < 80% average
  • Hot threads: Check during high CPU
  • Thread pools: Size based on cores

Shard Quick Facts

  • Size: 10-50 GB optimal
  • Per node: < 1000 shards
  • Replicas: Usually 1
  • Management: Use ILM

Monitoring Dashboard Queries

// Key metrics to track
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards

// Shard distribution
GET /_cat/allocation?v

// Thread pool health
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected&s=rejected:desc