`validation_exception: Validation Failed: 1: this action would add [N] total shards, but this cluster currently has [M]/[max] maximum shards open` is Elasticsearch's way of telling you the cluster has hit its `cluster.max_shards_per_node` ceiling (default 1000 per non-frozen data node since 7.0). The error blocks new index creation and shard opening. Even before the hard limit, an oversharded cluster suffers from slow cluster state updates, long node-join times, GC pressure on the master, and tail-latency spikes during recovery. Elastic's own guidance is to target no more than 20 shards per GB of JVM heap.
What This Error Means
Every open shard consumes JVM heap on its host node: roughly 1-2 MB for cluster state metadata, plus heap for filter caches, segment metadata, and fielddata that accumulates with use. Multiply by tens of thousands and the master spends most of its time propagating cluster-state diffs and the data nodes spend their heap on overhead rather than serving queries. The shard-limit error is the safety net; the symptoms hit well before it.
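As a rough illustration of the two numbers in play, the sketch below compares the 20-shards-per-GB-of-heap guideline with the hard cap derived from `cluster.max_shards_per_node`. The function names are hypothetical helpers, not part of any Elasticsearch client, and the heap size and node count are example inputs.

```python
def heap_guideline_shards(heap_gb: float, shards_per_gb: int = 20) -> int:
    """Per-node shard count suggested by the shards-per-GB-of-heap guideline."""
    return int(heap_gb * shards_per_gb)


def cluster_shard_cap(data_nodes: int, max_shards_per_node: int = 1000) -> int:
    """Cluster-wide hard cap: cluster.max_shards_per_node times non-frozen data nodes."""
    return data_nodes * max_shards_per_node


# Example inputs (assumptions): 30 GB heap per node, 3 data nodes.
print(heap_guideline_shards(30))  # 600
print(cluster_shard_cap(3))       # 3000
```

With these inputs the guideline (600 shards per node) is well below the hard cap (1000 per node), which is why symptoms tend to appear before the error does.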
Common Causes
- Time-series indices with no delete phase in ILM. Daily indices accumulate forever.
- Default `number_of_shards: 5` from old templates. This was the Elasticsearch 6.x default and is preserved in many production templates. Since 7.0 the default is 1.
- One index per tenant or per dimension. "Multi-tenant" architectures that create a new index per customer hit the limit fast.
- Daily indices for low-volume data. A 100 MB-per-day index with 1 primary and 1 replica burns 2 shards every day for 100 MB of useful data.
- High replica count on lots of small indices. Total shards = primaries x (1 + replicas).
- Cluster scaling not aligned with data growth. Stable node count, growing data and indices.
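To see how quickly the causes above add up, here is a small sketch of the arithmetic: total shards per index is primaries x (1 + replicas), and a daily index with no delete phase adds that many shards every day. The helper names and the example cluster size are assumptions for illustration.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Shards created by one index: each primary plus its replica copies."""
    return primaries * (1 + replicas)


def days_until_cap(primaries: int, replicas: int, data_nodes: int,
                   max_shards_per_node: int = 1000, existing_shards: int = 0) -> int:
    """Whole days of daily-index creation that fit under the cluster shard cap."""
    per_day = total_shards(primaries, replicas)
    headroom = data_nodes * max_shards_per_node - existing_shards
    return max(headroom, 0) // per_day


# Old 6.x-style template (5 primaries, 1 replica) on a 3-node cluster.
print(total_shards(5, 1))       # 10 shards per daily index
print(days_until_cap(5, 1, 3))  # 300 days until the cap
# Same cluster with 1 primary and 1 replica per daily index.
print(days_until_cap(1, 1, 3))  # 1500 days until the cap
```

This counts a single daily index stream; several streams or per-tenant indices shorten the runway proportionally.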
Impact
| Symptom | Trigger |
|---|---|
| `validation_exception` on index create | Hit `cluster.max_shards_per_node` ceiling |
| Slow `GET /_cluster/state` | High shard count inflates state size |
| Long node-join time | Each rejoining node must process the full cluster state |
| Master node GC pressure | Cluster state allocations dominate heap |
| Slow recovery after node restart | Many small shards to recover in serial |
| Hot or unbalanced shard counts | Allocator decisions degrade with shard count |
How to Fix
The recipe depends on which cause applies. Run the diagnostic queries first.
Diagnose
# Total shard count, per-node distribution, disk usage
GET /_cat/allocation?v
# Indices sorted by primary store size, largest first (pri and rep columns show shard counts)
GET /_cat/indices?v&s=pri.store.size:desc
# Sort indices by age
GET /_cat/indices?v&h=index,creation.date.string,pri,rep,docs.count,pri.store.size&s=creation.date.string
# Per-shard store size (one row per shard copy)
GET /_cat/shards?v&h=index,shard,prirep,store
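The per-shard output can be summarized with a short script. The sketch below assumes the rows come from `GET /_cat/shards?format=json&bytes=b&h=index,shard,prirep,store`, parsed into a list of dicts; the function names and sample rows are made up for illustration, and the 1 GiB threshold is only a starting point.

```python
from collections import defaultdict

GIB = 1024 ** 3


def summarize_shards(rows):
    """Count shard copies and sum primary store bytes per index."""
    stats = defaultdict(lambda: {"shards": 0, "primaries": 0, "primary_bytes": 0})
    for row in rows:
        entry = stats[row["index"]]
        entry["shards"] += 1
        if row["prirep"] == "p":
            entry["primaries"] += 1
            # store is a string of bytes with bytes=b, or null for unassigned shards
            entry["primary_bytes"] += int(row.get("store") or 0)
    return dict(stats)


def undersized_indices(rows, min_avg_primary_bytes=GIB):
    """Indices whose average primary is below the threshold, most shards first."""
    flagged = []
    for index, s in summarize_shards(rows).items():
        if s["primaries"] == 0:
            continue
        avg = s["primary_bytes"] / s["primaries"]
        if avg < min_avg_primary_bytes:
            flagged.append((index, s["shards"], avg))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample rows in the shape the cat API returns as JSON.
sample = [
    {"index": "logs-2024.01.01", "shard": "0", "prirep": "p", "store": "104857600"},
    {"index": "logs-2024.01.01", "shard": "0", "prirep": "r", "store": "104857600"},
    {"index": "products", "shard": "0", "prirep": "p", "store": str(30 * GIB)},
    {"index": "products", "shard": "0", "prirep": "r", "store": None},
]
for index, shards, avg in undersized_indices(sample):
    print(f"{index}: {shards} shards, avg primary {avg / GIB:.2f} GiB")
```

Indices at the top of this list are the first candidates for deletion, consolidation, or shrinking.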
Fix patterns
Delete old time-series indices. If you have not already, set up an ILM policy with a delete phase so retention is enforced automatically. Data streams backed by an ILM policy that rolls over and deletes are the canonical pattern.
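A minimal sketch of such a policy body, built as a Python dict so it can be sent with any HTTP client. The policy name `logs-retention`, the 30-day `max_age`, and the 90-day retention are assumptions for illustration; the 40 GB rollover size is the target used in this article.

```python
import json


def build_ilm_policy(max_primary_shard_size="40gb", max_age="30d", delete_after="90d"):
    """Body for PUT /_ilm/policy/<name>: size- or age-based rollover, then delete."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": max_primary_shard_size,
                            "max_age": max_age,
                        }
                    }
                },
                # For rolled-over indices, min_age is measured from rollover time.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


# Hypothetical policy name; print the request so it can be pasted into a console.
print("PUT /_ilm/policy/logs-retention")
print(json.dumps(build_ilm_policy(), indent=2))
```

The policy only takes effect once an index template or data stream references it, so check which templates point at it after creating it.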
Consolidate small indices. Replace per-tenant indices with a single index keyed by `tenant_id`, queried via filtered aliases or routing. One 50 GB shared index is far cheaper than 500 indices of 100 MB.
Shrink oversharded indices. For a static index with too many small primaries:
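# Step 1: block writes and relocate a copy of every shard onto one node
PUT /old-index/_settings
{ "settings": { "index.blocks.write": true, "index.routing.allocation.require._name": "node-1" } }
# Wait for shard relocation to finish before running step 2
# Step 2: shrink to fewer primaries (the target count must be a factor of the original count)
# Clear the write block and the node pin on the target so its replica can be allocated
POST /old-index/_shrink/old-index-shrunk
{ "settings": { "index.number_of_shards": 1, "index.number_of_replicas": 1, "index.blocks.write": null, "index.routing.allocation.require._name": null } }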
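Because the target primary count must divide the original count evenly, it can help to list the valid targets before picking one. A small sketch (the function name is a hypothetical helper):

```python
def valid_shrink_targets(current_primaries: int) -> list[int]:
    """Primary counts smaller than the current one that divide it evenly."""
    return [n for n in range(1, current_primaries) if current_primaries % n == 0]


print(valid_shrink_targets(5))   # [1]
print(valid_shrink_targets(8))   # [1, 2, 4]
print(valid_shrink_targets(12))  # [1, 2, 3, 4, 6]
```

An index created with the old default of 5 primaries can therefore only be shrunk to 1; any other count requires a reindex.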
Fix the template defaults. Drop `number_of_shards` to 1 for low-volume indices in your index templates. Use size-based rollover via `max_primary_shard_size` (40 GB target) for time-series workloads.
Force-merge old indices. This does not reduce shard count, but it reduces segment count per shard, which reduces heap overhead and speeds recovery:
POST /old-index/_forcemerge?max_num_segments=1
Only force-merge indices that will not receive further writes.
Raise the cap as a last resort. If shard sizes are already within the 10-50 GB target and you genuinely need more shards:
PUT /_cluster/settings
{ "persistent": { "cluster.max_shards_per_node": 1500 } }
This buys time; it does not fix structural oversharding.
Preventive Measures
- Set `number_of_shards: 1` as the default in index templates unless an index is known to need more.
- Adopt size-based rollover. `max_primary_shard_size: 40gb` plus a `max_age` cap. Avoid pure `max_age` rollover for variable-volume streams.
- Use data streams for new time-series workloads. They roll over and delete on a schedule with less ceremony than manual aliases.
- Alert on cluster-wide shard count, not just disk. Page when shards reach 70% of the cluster cap (`max_shards_per_node x data_nodes`).
- Audit templates quarterly. Templates set years ago for older data volumes are a frequent cause of silent oversharding.
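The 70% alert rule reduces to a few lines. The sketch below assumes the open shard count is taken from a source such as the `active_shards` field of `GET /_cluster/health`, which is an approximation because unassigned shards of open indices also count toward the cap. The function name and inputs are hypothetical.

```python
def shard_alert(open_shards: int, data_nodes: int,
                max_shards_per_node: int = 1000, threshold: float = 0.70) -> dict:
    """Compare the open shard count against a fraction of the cluster-wide cap."""
    cap = data_nodes * max_shards_per_node
    usage = open_shards / cap
    return {"cap": cap, "usage": usage, "alert": usage >= threshold}


# Example inputs (assumptions): 3 data nodes with the default per-node cap.
print(shard_alert(2000, 3)["alert"])  # False
print(shard_alert(2200, 3)["alert"])  # True
```

Wiring this into a scheduled check gives warning while there is still room to create the indices that tomorrow's ingestion needs.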
Catch Shard Sprawl Before It Triggers validation_exception with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch. When your cluster approaches `cluster.max_shards_per_node` (default 1000) and `validation_exception: this action would add [N] total shards, but this cluster currently has [M]/[max] maximum shards open` is about to fire, Pulse:
- Continuously tracks cluster-wide shard count, per-node shard distribution, and the ratio of shards to JVM heap against the 20-shards-per-GB-heap guideline
- Correlates the shard growth curve with index templates setting `number_of_shards`, ILM policies missing a delete phase, per-tenant index sprawl, and oversharded time-series indices below 1 GB primary store
- Identifies which of the six causes above is responsible by naming the indices contributing the most shards relative to data and the templates supplying the wrong default
- Recommends the precise fix: apply `_shrink` to consolidate primaries, add an ILM delete phase, drop template `number_of_shards` to 1, switch to `max_primary_shard_size: 40gb` size-based rollover, or `_forcemerge` static indices
- Applies low-risk fixes automatically with your approval (deleting orphaned indices past the intended retention, fixing replicas left at 0), or generates a one-click template PR
Pulse turns the manual audit above into an agentic SRE workflow. Start a free trial.
Frequently Asked Questions
Q: What tool can automatically detect oversharding before it causes a validation_exception?
A: Pulse is an AI DBA for Elasticsearch and OpenSearch that continuously tracks shard count against cluster.max_shards_per_node, the 20-shards-per-GB-heap guideline, template defaults, and ILM coverage. It pages when shards reach 70% of the cluster cap, names the indices and templates driving the growth, and recommends shrink, rollover, or ILM changes before the cluster hits the hard limit.
Q: How many shards is too many in Elasticsearch?
A: Elastic recommends no more than 20 shards per GB of JVM heap, which on a 30 GB heap is ~600 shards per node. The hard cluster cap is `cluster.max_shards_per_node`, default 1000 since 7.0. Stay well below the cap.
Q: How do I see how many shards each node has in Elasticsearch?
A: Use `GET /_cat/allocation?v`. The `shards` column shows the per-node shard count, alongside disk usage and node names. Use `GET /_cat/shards?v` for the per-shard view.
Q: Can I change the number of primary shards in an existing index?
A: Not directly. Use the Shrink API to reduce primary count (the new count must be a divisor of the old), the Split API to increase (new count must be a multiple of old), or reindex into a new index with the desired shard count.
Q: Why is my Elasticsearch cluster slow even with plenty of CPU and disk?
A: A common cause is oversharding. Each shard adds heap and cluster-state overhead independent of how much data it holds. Cluster-state propagation, GC pressure, and per-shard query coordination all scale with shard count, not data size.
Q: Should I raise `cluster.max_shards_per_node` when I hit the limit?
A: Usually no. Raising the cap hides the problem and accumulates heap pressure. Fix the structural cause first (consolidate small indices, set up ILM delete, shrink oversharded indices, tighten template defaults). Raise the cap only when shards are appropriately sized (10-50 GB each).
Q: What is the ideal shard size in Elasticsearch?
A: 10-50 GB for most workloads, with 20-40 GB the sweet spot. Logging and analytics tolerate the upper end (40-50 GB); search-heavy workloads benefit from 10-30 GB to keep search latency low. Avoid shards under 1 GB in production.
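To turn the size guidance into a primary count, divide the expected primary data by the target shard size and round up. A rough sketch, with a hypothetical helper name and the 40 GB target used elsewhere in this article as the default:

```python
import math


def primaries_for(total_primary_gb: float, target_shard_gb: float = 40.0) -> int:
    """Primary shards needed so each stays at or under the target size (minimum 1)."""
    return max(1, math.ceil(total_primary_gb / target_shard_gb))


print(primaries_for(500))  # 13
print(primaries_for(120))  # 3
print(primaries_for(0.1))  # 1
```

For time-series data, size-based rollover reaches the same result without having to predict volume up front.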