High I/O wait (iowait) occurs when the CPU is idle but waiting for disk operations to complete. In Elasticsearch, this leads to degraded performance, slow queries, and potential cluster instability. This guide provides solutions to reduce iowait.
Understanding IOWait
What is IOWait?
IOWait represents the percentage of time the CPU spends waiting for I/O operations. High iowait indicates:
- Disk is the bottleneck
- The CPU could do more work if the disk were faster
- Operations are queuing for disk access
Healthy vs. Problematic Levels
| IOWait % | Status | Action |
|---|---|---|
| < 5% | Healthy | Normal operation |
| 5-20% | Warning | Monitor and investigate |
| > 20% | Critical | Immediate action needed |
Diagnosing High IOWait
Check Current IOWait
# Quick check
top
# Look for "%wa" in the CPU line
# Detailed view
vmstat 1 10
# Check "wa" column
# Per-CPU breakdown
mpstat -P ALL 1 5
Identify I/O-Heavy Processes
# Show I/O by process
iotop -o
# Show disk utilization
iostat -x 1 5
Elasticsearch-Specific Checks
GET /_nodes/stats/fs
GET /_cat/thread_pool?v&h=node_name,name,active,queue&s=queue:desc
GET /_nodes/hot_threads?type=wait
Causes and Fixes
Fix 1: Upgrade to SSDs
The most impactful change for high iowait:
Before (HDD):
- Random I/O: ~100-200 IOPS
- Latency: 5-15ms
After (SSD):
- Random I/O: 10,000-100,000+ IOPS
- Latency: <1ms
# Verify disk type
cat /sys/block/sda/queue/rotational
# 1 = HDD, 0 = SSD
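To check every block device at once, lsblk's ROTA column mirrors the same rotational flag:
# 1 = rotational (HDD), 0 = non-rotational (SSD)
lsblk -d -o NAME,ROTA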
Fix 2: Reduce Merge Activity
Segment merging causes significant I/O:
PUT /my-index/_settings
{
  "index.merge.scheduler.max_thread_count": 1,
  "index.merge.policy.max_merged_segment": "5gb",
  "index.merge.policy.segments_per_tier": 10
}
Fix 3: Increase Refresh Interval
Refreshing less often creates new segments less frequently, which reduces write I/O:
PUT /my-index/_settings
{
  "index.refresh_interval": "30s"
}
For bulk indexing, disable refresh temporarily:
PUT /my-index/_settings
{
  "index.refresh_interval": "-1"
}
// After bulk indexing
PUT /my-index/_settings
{
  "index.refresh_interval": "1s"
}
Fix 4: Optimize Translog Settings
Asynchronous translog durability reduces fsync frequency at the cost of potentially losing up to sync_interval worth of acknowledged writes if a node crashes:
PUT /my-index/_settings
{
  "index.translog.durability": "async",
  "index.translog.sync_interval": "30s",
  "index.translog.flush_threshold_size": "1gb"
}
Fix 5: Ensure Adequate Filesystem Cache
When the filesystem cache is too small, reads miss the cache and go straight to disk:
# Check cache usage
free -h
# Look at "buff/cache" column
Solution: Keep heap ≤ 50% of RAM to leave memory for OS cache.
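As a rough illustration (the 31 GB figure and the jvm.options.d path assume a 64 GB host with a package install and Elasticsearch 7.7 or later), the heap can be pinned well below half of RAM so the remainder stays available to the page cache:
# Assumption: 64 GB host, config under /etc/elasticsearch.
# A 31 GB heap leaves roughly half of RAM for the filesystem cache.
printf -- '-Xms31g\n-Xmx31g\n' > /etc/elasticsearch/jvm.options.d/heap.options
systemctl restart elasticsearch
# Confirm how much memory remains available to the page cache
free -h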
Fix 6: Limit Recovery Bandwidth
Cap the disk bandwidth that shard recoveries can consume:
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "40mb"
  }
}
Fix 7: Schedule Force Merges
For indices that are no longer being written to, an explicit force merge during off-peak hours reduces segment count without competing with peak traffic:
POST /my-index/_forcemerge?max_num_segments=1
Run this during maintenance windows, not during peak traffic.
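One way to schedule this is a cron entry that calls the force merge API overnight; the index name, host, and lack of authentication below are placeholders for illustration:
# Hypothetical cron entry: force merge a read-only index at 03:00, outside peak hours
0 3 * * * curl -s -X POST "http://localhost:9200/my-index/_forcemerge?max_num_segments=1"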
Fix 8: Use Multiple Data Paths
Distribute I/O across disks (note that multiple data paths are deprecated in recent Elasticsearch versions, so prefer OS-level striping such as RAID 0 where possible):
# elasticsearch.yml
path.data:
- /mnt/disk1/elasticsearch
- /mnt/disk2/elasticsearch
Fix 9: Disable Swap
Swap causes massive I/O issues:
# Disable swap
swapoff -a
# Or set swappiness very low
echo 1 > /proc/sys/vm/swappiness
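The echo above does not survive a reboot; a minimal way to persist the setting (the file name is just a convention) is:
# Persist the low swappiness value across reboots
echo "vm.swappiness = 1" > /etc/sysctl.d/99-elasticsearch.conf
sysctl --system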
Enable memory locking in Elasticsearch:
# elasticsearch.yml
bootstrap.memory_lock: true
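memory_lock only takes effect if the process is allowed to lock RAM; on a systemd-based package install (an assumption here) that usually means raising the memlock limit:
# Assumes a systemd-managed Elasticsearch service
systemctl edit elasticsearch
# Add the following to the override file:
#   [Service]
#   LimitMEMLOCK=infinity
systemctl daemon-reload
systemctl restart elasticsearch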
Fix 10: Filesystem Optimization
# Mount with optimized options
# /etc/fstab
/dev/sda1 /data ext4 defaults,noatime,nodiratime 0 0
# For SSDs, ensure TRIM is enabled
fstrim -v /data
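On most systemd distributions, periodic TRIM can be left to the bundled timer rather than running fstrim by hand:
# Enable weekly TRIM (systemd distributions)
systemctl enable --now fstrim.timer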
Index-Level Optimization
Write-Heavy Indices
PUT /logs-write-heavy
{
  "settings": {
    "index.refresh_interval": "60s",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "60s",
    "index.merge.scheduler.max_thread_count": 1,
    "index.number_of_replicas": 0
  }
}
Note: Increase replicas after bulk loading.
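For example (host and replica count are placeholders; assumes no authentication), replicas can be restored with a single settings update once the bulk load finishes:
# Restore one replica after the bulk load completes
curl -s -X PUT "http://localhost:9200/logs-write-heavy/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.number_of_replicas": 1}'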
Read-Heavy Indices
PUT /search-index
{
  "settings": {
    "index.store.type": "hybridfs",
    "index.queries.cache.enabled": true
  }
}
Monitoring IOWait
Set Up Alerts
Alert when any of the following holds (a minimal check script is sketched after this list):
- IOWait > 15% for 5+ minutes
- Disk utilization > 80%
- I/O latency > 20ms average
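A minimal check along these lines (the threshold and sample length are examples) can be wired into cron or an existing alerting agent:
# Exit non-zero when the 60-second average "wa" (column 16 of vmstat) exceeds 15%
WA=$(vmstat 60 2 | tail -1 | awk '{print $16}')
if [ "$WA" -gt 15 ]; then
  echo "$(date): high iowait ${WA}%" >&2
  exit 1
fi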
Continuous Monitoring
# Log iowait every minute
while true; do
echo "$(date): $(vmstat 1 2 | tail -1 | awk '{print $16}')" >> /var/log/iowait.log
sleep 60
done
Elasticsearch Monitoring
GET /_cat/nodes?v&h=name,disk.used_percent,load_1m,cpu
Prevention Checklist
- SSDs in use (NVMe preferred)
- No swap or swap disabled
- Heap ≤ 50% of RAM
- Filesystem mounted with noatime
- Merge threads limited
- Refresh interval appropriate
- Recovery bandwidth limited
- Monitoring and alerting configured
Quick Fixes During High IOWait
If you are experiencing high iowait right now:
// 1. Reduce indexing rate (inform clients)
// 2. Disable refresh temporarily
PUT /*/_settings
{
  "index.refresh_interval": "-1"
}
// 3. Stop non-essential recoveries
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}
// 4. After stabilization, re-enable
PUT /*/_settings
{
  "index.refresh_interval": "30s"
}
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}