Shard imbalance occurs when shards are unevenly distributed across nodes, leading to some nodes being overloaded while others remain underutilized. This imbalance degrades overall cluster performance and can cause cascading failures.
Identifying Shard Imbalance
Check Shard Distribution
GET /_cat/allocation?v&s=shards:desc
This shows:
- Number of shards per node
- Disk space used per node
- Disk used by index data (disk.indices) and overall disk usage percentage (disk.percent)
Compare Node Resources
GET /_cat/nodes?v&h=name,cpu,heap.percent,disk.used_percent,shards
Look for nodes with significantly higher values.
Check Shard Sizes
GET /_cat/shards?v&s=store:desc
Identify oversized shards or uneven shard sizes.
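To make outliers easier to spot, you can trim the output to just the relevant columns (all standard _cat/shards column names):
GET /_cat/shards?v&h=index,shard,prirep,store,node&s=store:desc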
Types of Shard Imbalance
1. Count Imbalance
Some nodes have significantly more shards than others.
Cause: Allocation settings, node attributes, or failed rebalancing.
Solution:
// Trigger rebalancing
PUT /_cluster/settings
{
"transient": {
"cluster.routing.rebalance.enable": "all",
"cluster.routing.allocation.cluster_concurrent_rebalance": 5
}
}
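Rebalancing runs in the background; a quick way to watch its progress is the relocating-shard count in _cat/health, or the list of active recoveries (shard relocations show up there too):
GET /_cat/health?v&h=status,relo,init,unassign
GET /_cat/recovery?v&active_only=true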
2. Size Imbalance
Some nodes store much more data despite similar shard counts.
Cause: Uneven shard sizes, hot indices routed to specific nodes.
Solution: Enable disk-based shard allocation so the allocator respects disk watermarks:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.threshold_enabled": true,
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
On Elasticsearch 8.6+, the balancer additionally weighs actual disk usage through cluster.routing.allocation.balance.disk_usage; its default is byte-scaled, so leave it alone unless you have a specific reason to change it.
3. Hot Spot Imbalance
Certain shards receive disproportionate traffic.
Cause: Poor routing, time-based indices with uneven access patterns.
Solution: Implement proper index routing or use rollover indices.
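As a sketch of the rollover approach: bootstrap a write alias, then roll over on size or age so no single index (and its shards) stays hot indefinitely. The index and alias names here are hypothetical:
PUT /logs-000001
{
  "aliases": {
    "logs": { "is_write_index": true }
  }
}

POST /logs/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_primary_shard_size": "40gb"
  }
}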
Root Causes and Fixes
Cause 1: Allocation Filtering
Nodes may be excluded from allocation:
GET /_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.*
Fix: Review and adjust allocation filters, clearing each one in the same section (transient or persistent) where it was originally set:
PUT /_cluster/settings
{
"transient": {
"cluster.routing.allocation.exclude._name": null
}
}
Cause 2: Awareness Attributes
Zone or rack awareness may cause imbalance:
Fix: Give each zone equal node capacity, and configure forced awareness so shards are spread evenly across the zones:
PUT /_cluster/settings
{
"persistent": {
"cluster.routing.allocation.awareness.attributes": "zone",
"cluster.routing.allocation.awareness.force.zone.values": "zone1,zone2,zone3"
}
}
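Forced awareness only balances well when zones are symmetric. Listing node attributes makes it easy to verify that each zone contains the same number of comparable nodes:
GET /_cat/nodeattrs?v&h=node,attr,value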
Cause 3: Index-Level Settings
Some indices may have routing requirements:
GET /my-index/_settings?filter_path=*.settings.index.routing
Fix: Remove or adjust index-level allocation settings:
PUT /my-index/_settings
{
"index.routing.allocation.require._name": null
}
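The same filter works across all indices at once, which helps when you don't know which index carries the requirement:
GET /_all/_settings?filter_path=*.settings.index.routing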
Cause 4: Insufficient Rebalancing
The cluster may not be rebalancing aggressively enough:
Fix: Tune the balancing weights. Note that the defaults are threshold 1.0, shard 0.45, and index 0.55, so re-applying those values changes nothing; raising the shard weight relative to the index weight (the values below are illustrative) makes per-node shard counts even out faster:
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.threshold": 1.0,
    "cluster.routing.allocation.balance.shard": 0.60,
    "cluster.routing.allocation.balance.index": 0.40
  }
}
Cause 5: Primary Shard Concentration
Primary shards concentrated on fewer nodes:
Fix: Bias the balancer toward even primary distribution. Note this setting exists in OpenSearch 2.x; stock Elasticsearch has no direct equivalent and instead factors indexing write load into balancing from 8.6 onward:
PUT /_cluster/settings
{
"persistent": {
"cluster.routing.allocation.balance.prefer_primary": true
}
}
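To eyeball the current primary distribution, sort shards by node and scan the prirep column (p = primary, r = replica):
GET /_cat/shards?v&h=node,prirep,index,shard&s=node,prirep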
Manual Rebalancing
Move Specific Shards
For critical imbalances, manually move shards:
POST /_cluster/reroute
{
"commands": [
{
"move": {
"index": "my-index",
"shard": 0,
"from_node": "overloaded-node",
"to_node": "underutilized-node"
}
}
]
}
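Before applying a move, you can validate it with dry_run, optionally adding explain to see the reasoning behind each decision; the command body stays the same:
POST /_cluster/reroute?dry_run=true&explain=true
{
  "commands": [
    {
      "move": {
        "index": "my-index",
        "shard": 0,
        "from_node": "overloaded-node",
        "to_node": "underutilized-node"
      }
    }
  ]
}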
Explain Allocation Decisions
Understand why a shard is on a particular node:
GET /_cluster/allocation/explain
{
"index": "my-index",
"shard": 0,
"primary": true
}
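Called without a body, the API explains the first unassigned shard it finds (and returns an error if every shard is assigned), which is useful when you don't know which shard is stuck:
GET /_cluster/allocation/explain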
Preventing Future Imbalance
1. Use Consistent Shard Sizing
Target 10-50 GB per shard with ILM:
PUT _ilm/policy/balanced_shards
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_primary_shard_size": "40gb",
"max_age": "7d"
}
}
}
}
}
}
2. Configure Index Templates
Set appropriate shard counts based on data volume:
PUT _index_template/balanced_template
{
"index_patterns": ["logs-*"],
"template": {
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1
}
}
}
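Note that the ILM policy from step 1 only takes effect if the template references it. Assuming alias-based rollover with a write alias named logs (a hypothetical name), the template would also set the lifecycle settings:
PUT _index_template/balanced_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.lifecycle.name": "balanced_shards",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}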
3. Monitor Regularly
Set up alerts for imbalance:
- Maximum variance in shard count > 20%
- Maximum variance in disk usage > 30%
- Any node consistently at higher CPU/memory
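The raw numbers behind these alerts all come from the cat APIs, for example:
GET /_cat/allocation?v&h=node,shards,disk.percent&s=shards:desc
GET /_cat/nodes?v&h=name,cpu,heap.percent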
Metrics to Track
| Metric | Healthy Range | Action Threshold |
|---|---|---|
| Shard count variance | < 10% | > 20% |
| Disk usage variance | < 15% | > 30% |
| CPU usage variance | < 20% | > 40% |
| Query latency variance | < 25% | > 50% |