Elasticsearch PrimaryMissingActionException: Primary missing action - Common Causes & Fixes

Brief Explanation

The "PrimaryMissingActionException: Primary missing action" error in Elasticsearch occurs when an operation is attempted on a shard that doesn't have an active primary copy. This error indicates that the primary shard is unavailable or not allocated, preventing the cluster from performing the requested action.

Impact

This error can significantly impact the functionality and performance of your Elasticsearch cluster:

  • Data ingestion and indexing operations may fail
  • Search queries targeting the affected index may return incomplete results or fail entirely
  • Cluster health turns red for as long as any primary shard is unassigned

Common Causes

  1. Node failure or network issues causing the primary shard to become unavailable
  2. Insufficient disk space on nodes, preventing shard allocation
  3. Misconfigured cluster settings, particularly those related to shard allocation
  4. Recent cluster changes, such as removing or restarting nodes before their shards had been relocated or fully recovered

Troubleshooting and Resolution Steps

  1. Check cluster health:

    GET _cluster/health
    

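    Check the status field. Yellow means at least one replica shard is unassigned; red means at least one primary shard is unassigned, which is the condition behind this error. Add ?level=indices to the request to see the status of each index.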

  2. Identify the affected index and shard:

    GET _cat/indices?v
    GET _cat/shards?v
    

    Focus on shards whose state is UNASSIGNED, especially those marked p (primary) in the prirep column.

  3. Investigate shard allocation issues:

    GET _cluster/allocation/explain
    

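    Called without a request body, this explains the allocation of an arbitrary unassigned shard. To target a specific shard, send a request body that specifies index, shard, and primary.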

  4. Ensure all nodes are running and connected:

    GET _cat/nodes?v
    
  5. Check for disk space issues:

    GET _cat/allocation?v
    
  6. If the issue persists, ask Elasticsearch to retry shard allocations that previously failed too many times:

    POST _cluster/reroute?retry_failed=true
    
  7. As a last resort, if data loss is acceptable, you can force the allocation of an empty primary shard. This discards whatever data the shard previously held:

    POST _cluster/reroute
    {
      "commands": [
        {
          "allocate_empty_primary": {
            "index": "your_index_name",
            "shard": 0,
            "node": "target_node_name",
            "accept_data_loss": true
          }
        }
      ]
    }
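
Steps 2 and 3 can be partly automated. The following is a minimal Python sketch, not an official tool: it assumes you have already fetched the output of GET _cat/shards?format=json and parsed it into a list of dictionaries. The function name unassigned_primaries and the sample index and node names are hypothetical.

```python
import json


def unassigned_primaries(shards):
    """Return sorted (index, shard) pairs for primary shards that are unassigned."""
    return sorted(
        (row["index"], int(row["shard"]))
        for row in shards
        if row.get("prirep") == "p" and row.get("state") == "UNASSIGNED"
    )


# Hypothetical sample shaped like the JSON output of GET _cat/shards?format=json.
sample = json.loads("""
[
  {"index": "logs-1", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
  {"index": "logs-1", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "node": null},
  {"index": "logs-2", "shard": "1", "prirep": "p", "state": "UNASSIGNED", "node": null}
]
""")

print(unassigned_primaries(sample))  # [('logs-2', 1)]
```

Each pair returned is a candidate to pass to the allocation explain API as index and shard, with primary set to true.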
    

Best Practices

  • Regularly monitor cluster health and shard allocation
  • Implement proper disk space monitoring and alerting
  • Use shard allocation awareness so that copies of the same shard are spread across different racks or availability zones
  • Maintain an adequate number of replica shards for fault tolerance
  • Implement a robust backup strategy to mitigate data loss risks
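
As an illustration of disk space alerting, the sketch below classifies nodes against Elasticsearch's default disk watermarks (85% low, 90% high, 95% flood stage). It assumes the input is the parsed output of GET _cat/allocation?format=json. The function name disk_alerts and the sample node names are hypothetical, and treating "at or above" a threshold as a breach is a simplification. If your cluster overrides the watermark settings, pass your own thresholds.

```python
LOW, HIGH, FLOOD = 85, 90, 95  # default disk watermarks, in percent of disk used


def disk_alerts(rows, low=LOW, high=HIGH, flood=FLOOD):
    """Map node name to the highest watermark its disk usage has reached."""
    alerts = {}
    for row in rows:
        pct = row.get("disk.percent")
        if pct is None:
            continue  # the UNASSIGNED summary row carries no disk figures
        used = int(pct)
        if used >= flood:
            alerts[row["node"]] = "flood_stage"
        elif used >= high:
            alerts[row["node"]] = "high"
        elif used >= low:
            alerts[row["node"]] = "low"
    return alerts


# Hypothetical sample shaped like GET _cat/allocation?format=json.
rows = [
    {"node": "node-a", "disk.percent": "72"},
    {"node": "node-b", "disk.percent": "87"},
    {"node": "node-c", "disk.percent": "96"},
    {"node": "UNASSIGNED", "disk.percent": None},
]

print(disk_alerts(rows))  # {'node-b': 'low', 'node-c': 'flood_stage'}
```

A node at the low watermark stops receiving new shard allocations, so alerting before that point leaves room to add capacity or delete data.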

Frequently Asked Questions

Q: Can I prevent PrimaryMissingActionException from occurring?
A: While you can't completely prevent it, you can minimize the risk by following best practices such as proper cluster sizing, regular monitoring, and maintaining adequate replica shards.

Q: Will I lose data if I force allocate an empty primary shard?
A: Yes, forcing an empty primary shard allocation will result in data loss for that shard. Only use this as a last resort when you have no other recovery options.
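
If you script this step, it helps to make the data loss acknowledgement explicit. The sketch below only builds the request body for the cluster reroute API and sends nothing. The function name empty_primary_command is hypothetical, and the index and node names are the placeholders used earlier in this article.

```python
import json


def empty_primary_command(index, shard, node, accept_data_loss=False):
    """Build an allocate_empty_primary reroute body, refusing unless data loss is acknowledged."""
    if not accept_data_loss:
        raise ValueError("accept_data_loss must be True to build this command")
    return {
        "commands": [
            {
                "allocate_empty_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "accept_data_loss": True,
                }
            }
        ]
    }


body = empty_primary_command(
    "your_index_name", 0, "target_node_name", accept_data_loss=True
)
print(json.dumps(body, indent=2))
```

Requiring the flag at the call site means the destructive command cannot be produced by accident, for example from a loop over every unassigned shard.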

Q: How long does it take for Elasticsearch to recover from this error?
A: Recovery time varies depending on the cause, data size, and cluster resources. It can range from a few seconds for minor issues to hours for significant data recovery operations.

Q: Can this error occur in a single-node Elasticsearch cluster?
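A: Yes, it can occur in a single-node cluster, especially if there are disk space issues or if the node becomes unresponsive. With only one node, replica shards cannot be assigned, so there is no copy to promote when the primary is lost.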

Q: How does increasing the number of replica shards help prevent this error?
A: More replica shards increase fault tolerance. If a primary shard becomes unavailable, Elasticsearch can promote a replica to primary, reducing the likelihood of this error occurring.
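
A rough way to reason about this is sketched below. It rests on simplifying assumptions: every assigned copy is in sync, and Elasticsearch never places two copies of the same shard on one node, so the number of assigned copies is capped by the number of data nodes. The function name tolerated_node_losses is hypothetical.

```python
def tolerated_node_losses(number_of_replicas, data_nodes):
    """Estimate how many nodes can fail before a shard has no copy left to promote."""
    # One primary plus its replicas, but never more copies than there are nodes.
    assigned_copies = min(number_of_replicas + 1, data_nodes)
    return max(assigned_copies - 1, 0)


print(tolerated_node_losses(1, 3))  # 1
print(tolerated_node_losses(2, 3))  # 2
print(tolerated_node_losses(1, 1))  # 0
```

The last case mirrors the single-node cluster: configuring a replica does not help when there is no second node to hold it.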
