Elasticsearch Allocation Explain API: Diagnose Shard Allocation

The Elasticsearch Cluster Allocation Explain API (GET /_cluster/allocation/explain) tells you exactly why a shard is in its current allocation state - assigned, unassigned, or relocating - and why it cannot move to a given node. The API runs every allocation decider against the chosen shard and returns each decider's verdict with a human-readable reason. It is the standard tool for diagnosing unassigned shards, red cluster states, and allocation imbalance.

When to Use Allocation Explain

| Symptom | Question the API answers |
| --- | --- |
| Cluster state is red or yellow with unassigned shards | Why is shard X not being assigned? |
| A shard is stuck INITIALIZING for hours | Which decider is blocking it? |
| Shards refuse to relocate after a node restart | What is preventing the move? |
| allocation.disk.watermark.high warnings | Which nodes are above the watermark? |
| Cold/warm tier shards aren't moving | Is the role filter or attribute matching correctly? |

Calling the API

Auto-explain the first unassigned shard the API finds:

GET /_cluster/allocation/explain

Target a specific shard:

GET /_cluster/allocation/explain
{
  "index": "logs-2026-05",
  "shard": 0,
  "primary": true
}

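Explain a copy that is already assigned by naming the node that currently holds it in current_node. For an assigned copy, the response reports whether it can remain on that node and whether it can move to another one: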

GET /_cluster/allocation/explain
{
  "index": "logs-2026-05",
  "shard": 0,
  "primary": false,
  "current_node": "node-1"
}
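The same three request shapes can be issued from a script. The following is a minimal Python sketch using only the standard library; the helper names build_explain_body and explain_allocation are hypothetical, and it assumes an unsecured cluster reachable at a base URL such as http://localhost:9200 (no TLS or authentication handling).

```python
import json
import urllib.request


def build_explain_body(index, shard, primary, current_node=None):
    """Return the request body for one shard copy (hypothetical helper)."""
    body = {"index": index, "shard": shard, "primary": primary}
    if current_node is not None:
        # Only meaningful for a copy that is already assigned to that node.
        body["current_node"] = current_node
    return body


def explain_allocation(base_url, body=None, timeout=10):
    """Call the explain API; body=None lets Elasticsearch pick an unassigned shard.

    Raises urllib.error.HTTPError on a non-2xx response.
    """
    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    if body is None:
        request = urllib.request.Request(url, method="GET")
    else:
        # The endpoint also accepts POST, which avoids sending a body with GET.
        request = urllib.request.Request(
            url,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as explain_allocation("http://localhost:9200", build_explain_body("logs-2026-05", 0, True)) would return the parsed response as a dictionary.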

Reading the Response

Important fields in the response:

| Field | Meaning |
| --- | --- |
| current_state | started, unassigned, initializing, relocating |
| unassigned_info.reason | One of INDEX_CREATED, NODE_LEFT, ALLOCATION_FAILED, CLUSTER_RECOVERED, etc. |
| unassigned_info.last_allocation_status | Result of the last allocation attempt, e.g. no, no_attempt, no_valid_shard_copy, throttled, fetching_shard_data, delayed_allocation |
| can_allocate | Overall verdict for an unassigned shard, e.g. yes, no, throttled, awaiting_info, allocation_delayed, no_valid_shard_copy, no_attempt |
| node_allocation_decisions[] | Per-node breakdown, with each decider's verdict |
| node_allocation_decisions[].deciders[] | List of allocation deciders and reasons |

The deciders array is the diagnostic gold. Common decider names:

| Decider | Common reason for NO |
| --- | --- |
| disk_threshold | Node above cluster.routing.allocation.disk.watermark.low or high |
| same_shard | A copy of this shard is already on the node |
| awareness | Allocation awareness attribute mismatch |
| filter | Index filter (require/include/exclude) excludes the node |
| max_retry | Shard has failed index.allocation.max_retries (default 5) times |
| data_tier | Node doesn't have the data tier role the index requires |
| node_version | Target node is older than source - downgrade not allowed |
| shards_limit | Per-index or per-node shard cap reached |
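To make the per-node breakdown easier to scan, the response can be reduced to just the deciders that did not return YES. The sketch below assumes the response has already been parsed into a Python dictionary with the fields listed above; blocking_deciders is a hypothetical helper name, and the sample response is abbreviated and invented for illustration.

```python
def blocking_deciders(explain_response):
    """Map each node name to its non-YES (decider, explanation) pairs."""
    blockers = {}
    for node in explain_response.get("node_allocation_decisions", []):
        name = node.get("node_name", node.get("node_id", "<unknown>"))
        hits = [
            (d["decider"], d.get("explanation", ""))
            for d in node.get("deciders", [])
            if d.get("decision", "").upper() != "YES"
        ]
        if hits:
            blockers[name] = hits
    return blockers


# Abbreviated, made-up response for illustration only.
sample_response = {
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-1",
            "node_decision": "no",
            "deciders": [
                {"decider": "disk_threshold", "decision": "NO",
                 "explanation": "node is above the low watermark"},
            ],
        },
        {
            "node_name": "node-2",
            "node_decision": "no",
            "deciders": [
                {"decider": "same_shard", "decision": "NO",
                 "explanation": "a copy of this shard is already allocated here"},
            ],
        },
    ],
}

for node_name, hits in blocking_deciders(sample_response).items():
    for decider, explanation in hits:
        print(f"{node_name}: {decider} - {explanation}")
```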

A Concrete Diagnostic Flow

  1. Find unassigned shards:

    GET /_cluster/health
    GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason
    
  2. Call the explain API on the first unassigned shard:

    GET /_cluster/allocation/explain
    
  3. Inspect unassigned_info.reason and the failed decider in deciders[].

  4. Take action based on the decider:

    • disk_threshold: free disk on the offending node, raise the watermark temporarily, or relocate data
    • max_retry: retry allocation with POST /_cluster/reroute?retry_failed=true
    • filter: update index allocation filters
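    • no_valid_shard_copy (a can_allocate verdict rather than a decider): no node in the cluster holds a usable copy - bring back the node that held it if possible, otherwise restore from snapshot

  5. Re-run the explain call to confirm the verdict changed.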
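The flow above can be partly scripted. The sketch below is one possible shape, not a complete tool: it assumes the _cat/shards call was made with format=json (so each row is a dictionary keyed by the requested column names) and that the explain response is already parsed. The helper names and the ACTIONS table are hypothetical and simply restate the actions listed in step 4.

```python
# Suggested actions per decider, restating step 4 of the flow.
ACTIONS = {
    "disk_threshold": "free disk on the node, raise the watermark temporarily, or relocate data",
    "max_retry": "POST /_cluster/reroute?retry_failed=true",
    "filter": "update index allocation filters",
}


def unassigned_shards(cat_rows):
    """Turn _cat/shards JSON rows into explain request bodies for unassigned shards."""
    return [
        {"index": row["index"], "shard": int(row["shard"]), "primary": row["prirep"] == "p"}
        for row in cat_rows
        if row.get("state") == "UNASSIGNED"
    ]


def suggest_actions(explain_response):
    """Return de-duplicated suggestions for every decider that answered NO."""
    if explain_response.get("can_allocate") == "no_valid_shard_copy":
        return ["recover the node that held the copy, or restore from snapshot"]
    suggestions = []
    for node in explain_response.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") != "NO":
                continue
            name = decider["decider"]
            action = ACTIONS.get(name, f"read the explanation for decider '{name}'")
            if action not in suggestions:
                suggestions.append(action)
    return suggestions
```

After acting on a suggestion, the explain call should be repeated for the same shard body to confirm that the verdict changed.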

Allocation Retry After Max Retries

When a shard has failed allocation index.allocation.max_retries times (default 5), Elasticsearch stops trying. Reset the counter:

POST /_cluster/reroute?retry_failed=true

This forces a re-evaluation. If the underlying cause is still present (e.g. disk still full), the shard will fail again.
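Because a retry only helps when max_retry is what is blocking the shard, it can be worth checking the explain response first. A minimal sketch, assuming a parsed explain response and an unsecured cluster URL; blocked_by_max_retry and build_retry_request are hypothetical helper names, and the request object is only built here, not sent.

```python
import urllib.request


def blocked_by_max_retry(explain_response):
    """True if any node reports a NO verdict from the max_retry decider."""
    return any(
        decider.get("decider") == "max_retry" and decider.get("decision") == "NO"
        for node in explain_response.get("node_allocation_decisions", [])
        for decider in node.get("deciders", [])
    )


def build_retry_request(base_url):
    """Build (but do not send) the POST that retries failed allocations."""
    url = base_url.rstrip("/") + "/_cluster/reroute?retry_failed=true"
    return urllib.request.Request(url, method="POST")
```

Passing the returned object to urllib.request.urlopen would send the retry; doing so only when blocked_by_max_retry is true, and after the underlying cause has been fixed, avoids burning through the retry budget again.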

Common Pitfalls

  1. Treating the API output as authoritative for cluster-wide imbalance. Allocation Explain is per-shard - balance issues need _cat/allocation.
  2. Ignoring the awaiting_info state. This means the cluster is still gathering metadata - wait and retry, don't force-allocate.
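  3. Forcing allocation with POST /_cluster/reroute and the allocate_empty_primary command (which requires accept_data_loss: true) before confirming that no valid shard copy can be recovered. This creates an empty primary and permanently discards the shard's previous data.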
  4. Only checking can_allocate, not the per-node decider breakdown. A blanket "no" hides whether one decider on one node is the culprit.
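Pitfalls 2 and 4 lend themselves to a small guard. The sketch below, assuming a parsed explain response, counts how many nodes each decider blocks (so a single culprit stands out from a cluster-wide one) and flags the awaiting_info verdict; both function names are hypothetical.

```python
from collections import Counter


def decider_node_counts(explain_response):
    """Count, per decider, how many nodes it blocks with a NO verdict."""
    counts = Counter()
    for node in explain_response.get("node_allocation_decisions", []):
        blocking = {
            d["decider"]
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        }
        counts.update(blocking)  # each decider counted once per node
    return counts


def should_wait(explain_response):
    """True while the cluster is still gathering shard information."""
    return explain_response.get("can_allocate") == "awaiting_info"
```

A decider that appears on every node points at an index-level or cluster-level setting, while one that appears on a single node points at that node.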

Skip the Manual Allocation-Explain Loop with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch that runs _cluster/allocation/explain continuously against every unassigned shard. When a cluster turns yellow or red, Pulse:

  • Polls _cat/shards, _cluster/health, and _cluster/allocation/explain on a schedule and on alert
  • Interprets each verdict - naming the offending decider (disk_threshold, max_retry, filter, data_tier) or a no_valid_shard_copy verdict, along with the specific node and index
  • Correlates with disk watermark events, recent node joins/leaves, and ILM phase transitions
  • Recommends the precise corrective action: free disk, retry failed allocations, fix filters, or restore from snapshot

This turns the per-shard explain walk described above into a continuous diagnostic loop. Pulse can also apply low-risk fixes such as POST /_cluster/reroute?retry_failed=true once an operator approves the recommendation.

Start a free trial.

Frequently Asked Questions

Q: How do I find out why a shard is unassigned in Elasticsearch?
A: Call GET /_cluster/allocation/explain with the index, shard number, and primary flag in the request body, or with no body to explain the first unassigned shard it finds. The response includes unassigned_info.reason and a per-node breakdown of which allocation deciders blocked assignment, with human-readable explanations.

Q: What does max_retry mean in allocation explain?
A: max_retry indicates the shard failed to allocate index.allocation.max_retries times (default 5). Elasticsearch stops trying until you call POST /_cluster/reroute?retry_failed=true, which resets the counter and re-evaluates.

Q: How do I fix a disk_threshold decider blocking allocation?
A: Free disk on the affected node, delete or move stale indices, or temporarily raise cluster.routing.allocation.disk.watermark.high. Re-run the allocation explain API to confirm the decider clears.

Q: Can I use allocation explain on a primary shard?
A: Yes - set "primary": true in the request body. The API returns the same decider breakdown for primary and replica shards.

Q: What if allocation explain returns no_valid_shard_copy?
A: No node currently in the cluster holds a valid copy of the shard - typically because every node that held a copy has failed or left. First try to bring those nodes back; if that is not possible, restore from a snapshot. As a last resort, use POST /_cluster/reroute with the allocate_empty_primary command and accept_data_loss: true, and accept that the shard becomes empty.

Q: Does the allocation explain API impact cluster performance?
A: Cost is low. It reads cluster state and runs the decider chain against one shard. Safe to call repeatedly during troubleshooting, even on large clusters.

Q: What's the best tool to diagnose unassigned shards automatically?
A: Pulse is purpose-built for this. It is an AI DBA for Elasticsearch and OpenSearch that runs _cluster/allocation/explain continuously, classifies the failing decider per shard, correlates with disk and node events, and recommends the targeted fix - turning a manual cluster-by-cluster debugging session into an alert with the root cause already attached.
