Optimizing heap, CPU, and shard settings together is essential for a well-performing Elasticsearch cluster. This guide provides integrated best practices for these three critical areas.
Heap Tuning
The Golden Rules
Rule 1: Heap should be about half of RAM, and never above ~31 GB (the compressed-oops cutoff is just under 32 GB, so 31 GB is the practical ceiling).
Rule 2: Set minimum and maximum heap to the same value.
Heap Sizing Table
| Server RAM | Recommended Heap | Filesystem Cache |
|---|---|---|
| 16 GB | 8 GB | 8 GB |
| 32 GB | 16 GB | 16 GB |
| 64 GB | 31 GB | 33 GB |
| 128 GB | 31 GB | 97 GB |
Configuration
# /etc/elasticsearch/jvm.options.d/heap.options
-Xms16g
-Xmx16g
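To confirm a restarted node actually picked up the new heap and is still using compressed object pointers, the node info API reports both. A quick check (field names as in recent Elasticsearch versions):
GET /_nodes/jvm?filter_path=nodes.*.jvm.mem.heap_max_in_bytes,nodes.*.jvm.using_compressed_ordinary_object_pointers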
Heap Tuning Checklist
- Heap ≤ 50% of RAM
- Heap ≤ 31 GB (for compressed oops)
- Min heap = Max heap
- Remaining RAM available for filesystem cache
- Memory lock enabled (bootstrap.memory_lock: true)
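With memory lock enabled, verify that the lock actually succeeded; on systemd-based installs the service also needs permission to lock memory. A minimal sketch (the override path assumes the standard packaged unit):
# Should report "mlockall": true for every node
GET /_nodes?filter_path=**.mlockall
# /etc/systemd/system/elasticsearch.service.d/override.conf
# (run systemctl daemon-reload and restart the service afterwards)
[Service]
LimitMEMLOCK=infinity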
Monitor Heap Health
GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max
GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.gc
Healthy indicators:
- Heap usage < 75% normally
- Heap usage < 85% during peaks
- GC time < 5% of total time
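GC overhead is not reported as a single number; a rough approximation is cumulative collection time divided by JVM uptime from the stats above. A sketch using curl and jq, assuming an unauthenticated node on localhost:9200 (the result is an average since JVM start, not a point-in-time value):
curl -s 'localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.name,nodes.*.jvm.uptime_in_millis,nodes.*.jvm.gc.collectors' \
  | jq '.nodes[] | {node: .name, gc_pct: (100 * (.jvm.gc.collectors.young.collection_time_in_millis + .jvm.gc.collectors.old.collection_time_in_millis) / .jvm.uptime_in_millis)}'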
CPU Tuning
CPU Requirements
| Workload | CPU per Node | Notes |
|---|---|---|
| Light | 4-8 cores | Basic search/indexing |
| Medium | 8-16 cores | Mixed workloads |
| Heavy | 16-32 cores | Complex queries, aggregations |
Thread Pool Configuration
For search-heavy workloads:
# elasticsearch.yml
thread_pool.search.size: 25 # Default: (# of processors * 3) / 2 + 1
thread_pool.search.queue_size: 1000
For write-heavy workloads:
thread_pool.write.size: 16
thread_pool.write.queue_size: 2000
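Thread pool sizes and queue sizes are static node settings, so they only take effect after a restart. To confirm what each node is actually running with:
GET /_nodes/thread_pool?filter_path=nodes.*.thread_pool.search,nodes.*.thread_pool.write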
Monitor CPU Usage
GET /_cat/nodes?v&h=name,cpu,load_1m,load_5m,load_15m
GET /_nodes/hot_threads
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected
CPU Optimization Strategies
For high CPU usage:
- Optimize queries (avoid leading wildcards and deep pagination)
- Reduce aggregation complexity
- Add more data nodes
- Implement query caching (see the request-cache sketch after this list)
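For the caching item above, the shard request cache is usually the easiest win for repetitive aggregations; it is a dynamic index setting and can also be requested explicitly per search. A sketch with a placeholder index and field name (only size: 0, aggregation-style requests are cached):
# Enable the shard request cache on an index
PUT /my-index/_settings
{
  "index.requests.cache.enable": true
}
# Force caching for an aggregation-only request
GET /my-index/_search?request_cache=true
{
  "size": 0,
  "aggs": {
    "by_status": { "terms": { "field": "status.keyword" } }
  }
}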
For search-heavy loads:
- Add replicas to distribute queries
- Add coordinating nodes
- Implement application-level caching
For indexing-heavy loads:
- Increase the refresh interval during bulk loads
- Temporarily reduce the replica count (both changes are shown in the sketch after this list)
- Use bulk API effectively
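The refresh and replica items are typically applied together as a temporary settings change before a large bulk load and reverted once it finishes. A sketch with a placeholder index name:
# Before bulk loading
PUT /my-index/_settings
{
  "index": { "refresh_interval": "30s", "number_of_replicas": 0 }
}
# After bulk loading, restore the defaults
PUT /my-index/_settings
{
  "index": { "refresh_interval": "1s", "number_of_replicas": 1 }
}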
Shard Tuning
Optimal Shard Size
Guideline: 10-50 GB per shard is usually optimal.
| Workload Type | Recommended Shard Size |
|---|---|
| Search-focused | 10-30 GB |
| Logging | 30-50 GB |
| Analytics | 30-50 GB |
Shard Count Guidelines
Primary Shards = Total Data / Target Shard Size
Total Shards = Primary Shards × (1 + Replicas)
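For example, roughly 1 TB (1,000 GB) of data with a 40 GB target and one replica works out to:
Primary Shards = 1,000 GB / 40 GB = 25
Total Shards = 25 × (1 + 1) = 50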
Limits:
- < 1000 shards per node (soft limit)
- Each shard uses ~1-10 MB heap for metadata
Check Shard Health
GET /_cluster/stats?filter_path=indices.shards
GET /_cat/shards?v&h=index,shard,prirep,store&s=store:desc
GET /_cat/allocation?v
Shard Optimization Strategies
For too many shards:
- Shrink existing indices
- Adjust index templates for fewer shards (see the template sketch after this list)
- Use ILM to rollover based on size
- Delete old indices
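For the template item above, new indices can be created with fewer primaries via a composable index template; this only affects indices created after the change. A sketch with placeholder names:
PUT _index_template/logs-fewer-shards
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}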
For unbalanced shards:
PUT /_cluster/settings
{
"persistent": {
"cluster.routing.rebalance.enable": "all",
"cluster.routing.allocation.balance.shard": 0.45
}
}
Integrated Tuning Approach
Step 1: Assess Current State
# Comprehensive health check
GET /_cluster/health?pretty
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards
GET /_cluster/stats?filter_path=indices.shards.total
Step 2: Identify Bottleneck
| Symptom | Likely Bottleneck | Solution Focus |
|---|---|---|
| High heap usage | Too many shards or fielddata | Shard reduction |
| High CPU | Complex queries or merging | Query optimization |
| Slow searches | Shard size or count | Shard tuning |
| Slow indexing | CPU or disk I/O | Thread pools, refresh |
Step 3: Apply Changes
Order of changes:
1. Heap configuration (requires restart)
2. Shard strategy (ILM policies, templates)
3. Thread pool settings (static node settings; require a restart)
4. Query optimization (application changes)
Step 4: Validate
# Before and after comparison
GET /_nodes/stats/jvm
GET /_cat/thread_pool?v
# Count total shards (the shell pipe needs curl; adjust host and auth as needed)
curl -s 'http://localhost:9200/_cat/shards' | wc -l
Configuration Reference
Production Node Configuration
# elasticsearch.yml
# Heap is configured in jvm.options.d/
# Set to 50% of RAM, max 31 GB
# Memory lock
bootstrap.memory_lock: true
# Thread pools (adjust based on cores)
thread_pool.search.size: 13
thread_pool.search.queue_size: 1000
thread_pool.write.size: 8
thread_pool.write.queue_size: 500
# Index settings via templates
# Use ILM for shard management
ILM Policy for Shard Management
PUT _ilm/policy/optimized_shards
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_primary_shard_size": "40gb",
"max_age": "7d"
}
}
},
"warm": {
"min_age": "30d",
"actions": {
"shrink": {
"number_of_shards": 1
},
"forcemerge": {
"max_num_segments": 1
}
}
}
}
}
}
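The policy takes effect only once an index template points new indices at it; for classic (non-data-stream) indices, rollover also needs a write alias on the first index. A sketch with placeholder names:
# Attach the policy to new indices
PUT _index_template/optimized-logs
{
  "index_patterns": ["optimized-logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "optimized_shards",
      "index.lifecycle.rollover_alias": "optimized-logs"
    }
  }
}
# Bootstrap the first managed index with the write alias
PUT /optimized-logs-000001
{
  "aliases": {
    "optimized-logs": { "is_write_index": true }
  }
}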
Quick Reference
Heap Quick Facts
- Max: 31 GB (compressed oops)
- Target: 50% of RAM
- Monitor: Keep < 85%
- GC overhead: Keep < 5%
CPU Quick Facts
- Baseline: 8+ cores per data node
- Monitor: Keep < 80% average
- Hot threads: Check during high CPU
- Thread pools: Size based on cores
Shard Quick Facts
- Size: 10-50 GB optimal
- Per node: < 1000 shards
- Replicas: Usually 1
- Management: Use ILM
Monitoring Dashboard Queries
# Key metrics to track
GET /_cat/nodes?v&h=name,heap.percent,cpu,disk.used_percent,shards
# Shard distribution
GET /_cat/allocation?v
# Thread pool health
GET /_cat/thread_pool?v&h=node_name,name,active,queue,rejected&s=rejected:desc