NoShardAvailableActionException: No shard available for [<index>][<shard>] is logged when an operation targets a shard that has no usable copy in the cluster: for writes, the primary; for reads, the primary and every replica. The shard is unassigned or still initializing, or all of its copies are on unreachable nodes. The request fails with a 503 and the affected index is partially or fully unavailable until allocation succeeds.
What This Error Means
Elasticsearch routes every search and index request to a specific shard copy. If no copy of the targeted shard is STARTED, the request cannot be served. This is almost always a symptom, not a root cause - the underlying issue is unassigned shards, and Elasticsearch ships the _cluster/allocation/explain API specifically to tell you why.
Common underlying conditions: a node hosting a primary shard left the cluster, the disk watermark was breached, replicas have not yet recovered, or an allocation filter excludes every eligible node.
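To make the "no STARTED copy" condition concrete, here is a minimal Python sketch. It assumes the rows come from GET /_cat/shards?format=json&h=index,shard,prirep,state already parsed into dictionaries; the function name and the sample rows are hypothetical. It only covers the read case (any STARTED copy can serve a search); a write additionally needs the primary itself to be STARTED.

```python
from collections import defaultdict


def shards_without_started_copy(rows):
    """Return sorted (index, shard) pairs that have no copy in STARTED state.

    `rows` is assumed to be parsed JSON from
    GET /_cat/shards?format=json&h=index,shard,prirep,state
    (the cat APIs return every value as a string).
    """
    has_started = defaultdict(bool)
    for row in rows:
        key = (row["index"], row["shard"])
        has_started[key] = has_started[key] or row["state"] == "STARTED"
    return sorted(key for key, ok in has_started.items() if not ok)


# Hypothetical sample: shard 0 has no usable copy, shard 1 still has its primary.
sample_rows = [
    {"index": "my-index", "shard": "0", "prirep": "p", "state": "UNASSIGNED"},
    {"index": "my-index", "shard": "0", "prirep": "r", "state": "UNASSIGNED"},
    {"index": "my-index", "shard": "1", "prirep": "p", "state": "STARTED"},
    {"index": "my-index", "shard": "1", "prirep": "r", "state": "INITIALIZING"},
]

print(shards_without_started_copy(sample_rows))  # [('my-index', '0')]
```

Any pair this returns is a shard for which a search would currently raise the exception.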
Common Causes
- Node failure or network partition removed the shard's only copy from the cluster. How to confirm: GET _cluster/health shows reduced number_of_nodes; GET _cat/nodes?v shows missing entries.
- Disk high watermark exceeded; allocation paused. How to confirm: GET _cat/allocation?v - any node above 90% disk usage is the cause (high watermark default).
- Index replicas set to a count that exceeds available data nodes. How to confirm: GET <index>/_settings for number_of_replicas; with one data node and number_of_replicas: 1, replicas never assign.
- Allocation filters (include/exclude/require) match no node. How to confirm: GET _cluster/settings and the index settings for routing.allocation.*.
- Maximum shards per node reached (default cluster.max_shards_per_node is 1000). How to confirm: cluster log shows Validation Failed: this action would add [N] shards, but this cluster currently has [X]/[1000] maximum shards open.
- Shard corruption preventing recovery. How to confirm: GET _cluster/allocation/explain reports corrupted_index_uuid or checksum_failure.
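The disk check in the list above can be scripted. The following is a rough Python sketch, assuming the rows come from GET /_cat/allocation?format=json and that each node row carries the node and disk.percent columns as strings. The function name, the 90 default, and the sample rows are illustrative assumptions; a cluster with customised watermarks needs a different threshold.

```python
def nodes_over_high_watermark(rows, high_watermark_pct=90):
    """Return (node, disk_percent) for nodes above the given watermark.

    `rows` is assumed to be parsed JSON from GET /_cat/allocation?format=json.
    The summary row for unassigned shards has no disk figures and is skipped.
    """
    over = []
    for row in rows:
        pct = row.get("disk.percent")
        if pct is None:
            continue
        if int(pct) > high_watermark_pct:
            over.append((row["node"], int(pct)))
    return over


# Hypothetical sample: one node above the default high watermark.
sample_allocation = [
    {"node": "data-1", "shards": "120", "disk.percent": "93"},
    {"node": "data-2", "shards": "118", "disk.percent": "71"},
    {"node": "UNASSIGNED", "shards": "4", "disk.percent": None},
]

print(nodes_over_high_watermark(sample_allocation))  # [('data-1', 93)]
```

The cat API reports a rounded percentage, so treat a node sitting exactly at the threshold as worth a closer look rather than as cleared.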
How to Fix NoShardAvailableActionException
Get a definitive explanation for one unassigned shard:
GET /_cluster/allocation/explain
{ "index": "my-index", "shard": 0, "primary": true }
The response's allocate_explanation and node_allocation_decisions name the exact blocker.
Check cluster health and unassigned-shard counts:
GET /_cluster/health?level=indices
GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason
If the disk watermark is the cause, free disk, expand storage, or temporarily relax the watermarks:
PUT /_cluster/settings
{ "transient": { "cluster.routing.allocation.disk.watermark.low": "92%", "cluster.routing.allocation.disk.watermark.high": "95%" } }
These are a stopgap - resize storage rather than leave them raised.
If a node is missing, bring it back online. Check Elasticsearch logs on the absent node; restart with systemctl start elasticsearch. Elasticsearch will recover from replica or translog on its own.
If allocation filters are wrong, clear or correct them:
PUT /my-index/_settings
{ "index.routing.allocation.include._tier_preference": null }
Retry failed allocation explicitly (e.g., after fixing disk space):
POST /_cluster/reroute?retry_failed=true
If the shard's only copy is corrupted, restore from snapshot or accept data loss by force-allocating an empty primary (last resort - data is lost):
POST /_cluster/reroute
{ "commands": [ { "allocate_empty_primary": { "index": "my-index", "shard": 0, "node": "data-1", "accept_data_loss": true } } ] }
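Reading an allocation explain response by eye gets tedious on clusters with many nodes. The sketch below is one way to pull out only the deciders that said NO. It assumes the response has already been fetched and parsed into a dictionary; the function name is hypothetical and the sample response is an abbreviated, made-up payload, not output from a real cluster.

```python
def blocking_deciders(explain_response):
    """List (node_name, decider, explanation) for every decider that voted NO.

    `explain_response` is assumed to be the parsed JSON body returned by
    GET /_cluster/allocation/explain for one shard.
    """
    blockers = []
    for node in explain_response.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.append(
                    (
                        node.get("node_name"),
                        decider.get("decider"),
                        decider.get("explanation"),
                    )
                )
    return blockers


# Hypothetical, abbreviated response for illustration only.
sample_explain = {
    "index": "my-index",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "allocate_explanation": "(abbreviated sample text)",
    "node_allocation_decisions": [
        {
            "node_name": "data-1",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "disk_threshold",
                    "decision": "NO",
                    "explanation": "(sample) node is above the high watermark",
                }
            ],
        },
        {"node_name": "data-2", "node_decision": "yes", "deciders": []},
    ],
}

for node_name, decider, explanation in blocking_deciders(sample_explain):
    print(f"{node_name}: {decider} -> {explanation}")
```

Grouping the result by decider name is usually enough to tell which of the fixes above applies: a disk decider points at the watermarks, a filter decider at the allocation settings.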
Resolve NoShardAvailableActionException Automatically with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch. When NoShardAvailableActionException: No shard available for [<index>][<shard>] fires, Pulse:
- Calls _cluster/allocation/explain for every affected shard, parses allocate_explanation and node_allocation_decisions, and correlates with _cat/allocation?v disk usage, _cat/nodes?v membership, _cat/shards?v&h=index,shard,prirep,state,unassigned.reason, and the master node's leaving/joining history
- Identifies which of the six causes applies: missing node from a partition, high watermark breach at 90% disk, replica count exceeding data nodes, an index.routing.allocation.* filter matching no node, cluster.max_shards_per_node: 1000 saturation, or corruption flagged as corrupted_index_uuid
- Generates the exact remediation payload: the POST /_cluster/reroute?retry_failed=true call, the PUT /_cluster/settings watermark adjustment, the PUT /<index>/_settings filter clear, or - as a last resort with explicit operator confirmation - the allocate_empty_primary reroute with accept_data_loss: true
- Applies allocation setting changes and ?retry_failed=true reroutes automatically with operator approval; never force-allocates an empty primary without explicit confirmation because that path discards shard data
Pulse tracks unassigned shard count and pending_tasks continuously, alerting before a replica that cannot place becomes a primary that cannot serve.
Start a free trial to connect your cluster.
Frequently Asked Questions
Q: Why are my replicas unassigned even though the cluster has multiple nodes?
A: The most common reasons are routing.allocation.* filters that prevent the replica from being placed on any non-primary node, all eligible nodes being above the high disk watermark, or cluster.routing.allocation.same_shard.host: true blocking allocation onto the same physical host. _cluster/allocation/explain will tell you which.
Q: Is it safe to force shard allocation?
A: Forcing an empty primary with allocate_empty_primary discards all data for that shard - use only when the data is already lost. ?retry_failed=true is safe because it re-runs normal allocation logic.
Q: What is the difference between NoShardAvailableActionException and UnavailableShardsException?
A: NoShardAvailableActionException means no copy is in STARTED state for the targeted shard. UnavailableShardsException is thrown on writes when the primary is not active or the number of active shard copies required by wait_for_active_shards is not reached before the request times out. They often appear together but indicate different cluster states.
Q: How long does Elasticsearch wait before retrying shard allocation?
A: Failed allocations back off exponentially. The cluster will retry up to index.allocation.max_retries times (default 5). After that, you must call POST /_cluster/reroute?retry_failed=true to retry.
Q: Can NoShardAvailableActionException occur on a green cluster?
A: Yes, briefly - during a shard relocation or initialization window, the target copy is not yet STARTED. Most clients retry transparently. Persistent occurrence on a green cluster usually points at a race between client routing decisions and master state updates.
Q: Will restarting Elasticsearch fix unassigned shards?
A: Sometimes, by triggering a fresh allocation pass. But restarting without diagnosis can mask the cause (disk pressure, filter misconfiguration) and risks data loss if it removes the only remaining copy. Always run _cluster/allocation/explain first.
Q: What's the fastest way to diagnose NoShardAvailableActionException in production?
A: Pulse, the AI DBA for Elasticsearch and OpenSearch, calls _cluster/allocation/explain for every affected shard, parses the per-node allocation decisions, and names the blocker (disk watermark, missing node, filter mismatch, corruption) in one view. It applies the safe remediation - watermark adjustment, retry reroute, filter clear - with approval and refuses to force-allocate an empty primary without explicit operator confirmation.
Related Reading
- Elasticsearch allocation explain API: the primary diagnostic tool for this error.
- Elasticsearch cluster block read-only low disk watermark: for disk-related allocation blocks.
- Elasticsearch cluster max shards per node: shard count cap.
- Elasticsearch IndexShardClosedException: for closed-shard errors.
- Elasticsearch monitoring: tracking unassigned shards proactively.