Elasticsearch MasterNotDiscoveredException - Common Causes & Fixes

Brief Explanation

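The MasterNotDiscoveredException in Elasticsearch is returned when a node handling a request cannot find an elected master node before the request's master timeout expires. It means the node has not joined a cluster, or the cluster currently has no elected master, and it points to a problem with cluster formation or communication between nodes.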

Impact

This error has a significant impact on cluster operations:

  • New nodes cannot join the cluster
  • Existing nodes may become isolated
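  • Cluster state updates (creating indices, changing mappings, allocating shards) cannot be applied
  • Indexing and other write operations are rejected; searches may still be served from the last known cluster state, depending on cluster.no_master_block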
  • Overall cluster stability is compromised

Common Causes

  1. Network connectivity issues between nodes
  2. Misconfigured discovery settings
  3. Firewall or security group restrictions
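  4. Too few master-eligible nodes available to form a majority
  5. Network partitions that cut a node off from the majority of master-eligible nodes (on versions before 7.0, a partition combined with a wrong minimum_master_nodes value can also cause split-brain)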

Troubleshooting and Resolution Steps

  1. Check network connectivity between all nodes

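    • Use ping for basic reachability, then test the transport port itself (9300 by default) with telnet or nc, because a successful ping does not prove that the TCP port is reachable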
  2. Verify discovery settings in elasticsearch.yml

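    • Ensure discovery.seed_hosts lists the addresses of the master-eligible nodes and that every address is resolvable from this node
    • When bootstrapping a brand-new cluster (7.0 and later), check that cluster.initial_master_nodes names the master-eligible nodes exactly as they appear in node.name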
  3. Review firewall and security group rules

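    • Ensure the transport port (9300 by default) is open between all nodes; port 9200 is the HTTP port used by clients and is not used for node-to-node communication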
  4. Inspect Elasticsearch logs for specific error messages

    • Look for entries related to discovery and master election
  5. Ensure sufficient master-eligible nodes are available

    • Configure at least three master-eligible nodes for production
  6. Check cluster health and state

    • Use GET /_cluster/health and GET /_cluster/state API calls
  7. Restart nodes if necessary

    • Start master-eligible nodes first, followed by data nodes
  8. Consider increasing discovery timeouts

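    • On versions before 7.0, adjust discovery.zen.ping_timeout and discovery.zen.join_timeout; these Zen settings do not apply to 7.0 and later, and longer timeouts tend to mask a network problem rather than fix it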
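
The connectivity and firewall checks above can be scripted. The following is a minimal Python sketch, assuming the nodes use the default transport port 9300; the function names can_reach and report are illustrative and not part of any Elasticsearch tooling.

```python
import socket


def can_reach(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def report(hosts, port: int = 9300, timeout: float = 3.0) -> dict:
    """Map each host to whether its transport port accepted a connection."""
    return {host: can_reach(host, port, timeout) for host in hosts}
```

Running report() from each node against the addresses listed in discovery.seed_hosts shows which node pairs are blocked. A False result only says that the TCP connection failed; it does not distinguish a firewall rule from a stopped node or a wrong address.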

Best Practices

  • Always have an odd number of master-eligible nodes (3 or more) in production
  • Use dedicated master nodes in large clusters
  • Implement proper network segmentation and security measures
  • Regularly monitor cluster health and perform maintenance
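
Routine health monitoring can be sketched in a few lines of Python. This example assumes an unsecured HTTP endpoint on localhost:9200 and assumes that a node without a master answers with a JSON error body whose error.type is master_not_discovered_exception; the function names are hypothetical, and authentication and TLS are left out.

```python
import json
import urllib.error
import urllib.request


def is_master_not_discovered(body: dict) -> bool:
    """True if a decoded error body reports master_not_discovered_exception."""
    error = body.get("error")
    return isinstance(error, dict) and error.get("type") == "master_not_discovered_exception"


def summarize(status_code: int, body: dict) -> str:
    """Turn an HTTP status and decoded JSON body into a one-line summary."""
    if is_master_not_discovered(body):
        return "NO MASTER: node cannot find an elected master"
    if status_code != 200:
        return f"ERROR: HTTP {status_code}"
    return f"{body.get('status', 'unknown')}: {body.get('number_of_nodes', '?')} nodes"


def check_health(base_url: str = "http://localhost:9200", timeout: float = 10.0) -> str:
    """Call GET /_cluster/health on one node and summarize the answer."""
    url = f"{base_url}/_cluster/health?master_timeout=5s"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return summarize(resp.status, json.load(resp))
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as HTTPError; try to decode the error body.
        try:
            body = json.loads(err.read().decode("utf-8"))
        except ValueError:
            body = {}
        if not isinstance(body, dict):
            body = {}
        return summarize(err.code, body)
    except OSError as err:
        # Connection refused, DNS failure, or timeout.
        return f"UNREACHABLE: {err}"
```

Calling check_health() against each node in turn, rather than through a load balancer, makes it easier to see which specific nodes have lost the master.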

Frequently Asked Questions

Q: Can I have only one master node in my Elasticsearch cluster?
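A: While it's technically possible, it's not recommended for production environments. With a single master-eligible node, losing that node leaves the cluster without a master until it comes back. With at least three master-eligible nodes, the cluster can lose one of them and still elect a master, and a majority requirement protects against split-brain scenarios.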

Q: How does Elasticsearch elect a master node?
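A: It depends on the version. Before 7.0, Elasticsearch used a process called "Zen Discovery": master-eligible nodes pinged each other and, broadly, the eligible candidate with the lowest node ID became the master once enough candidates (discovery.zen.minimum_master_nodes) agreed. From 7.0 onward, Zen Discovery was replaced by a quorum-based cluster coordination subsystem in which master-eligible nodes vote, and a candidate becomes master only after winning votes from a majority of the cluster's voting configuration.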

Q: What's the difference between master-eligible nodes and dedicated master nodes?
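A: A master-eligible node is any node that has the master role and can therefore be elected master; it may also hold other roles, such as data or ingest. A dedicated master node is a master-eligible node configured with the master role only, so it handles cluster management tasks and does not store or process data.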
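
The distinction can be checked against the roles that each node reports. Here is a small Python sketch, assuming a decoded GET /_nodes response in which every entry under "nodes" carries "name" and "roles" fields; the category labels and function names are illustrative only.

```python
def classify(roles) -> str:
    """Label a node by its role list (labels are this sketch's own wording)."""
    role_set = set(roles)
    if "master" not in role_set:
        return "not master-eligible"
    if "voting_only" in role_set:
        # Takes part in elections but is never elected master itself.
        return "voting-only master-eligible"
    if role_set == {"master"}:
        return "dedicated master"
    return "master-eligible with other roles"


def master_eligible_names(nodes_info: dict) -> list:
    """Return the sorted names of all nodes that hold the master role."""
    nodes = nodes_info.get("nodes", {})
    return sorted(
        node["name"] for node in nodes.values() if "master" in node.get("roles", [])
    )
```

For example, classify(["master", "data"]) returns "master-eligible with other roles", while classify(["master"]) returns "dedicated master".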

Q: How can I prevent split-brain scenarios in Elasticsearch?
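A: Keep an odd number of master-eligible nodes (three or more) and a reliable, low-latency network between them. On versions before 7.0, also set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes. From 7.0 onward that setting is ignored: the cluster manages its voting configuration itself and will not elect a master without a majority, so the main rule is never to stop half or more of the master-eligible nodes at the same time.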
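
The majority rule is simple arithmetic. The sketch below, with illustrative function names, computes the majority for a given number of master-eligible nodes (the value recommended for discovery.zen.minimum_master_nodes before 7.0) and how many of those nodes can fail while a majority remains. It shows the arithmetic only, not how Elasticsearch implements voting.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of master-eligible nodes that forms a majority."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


for n in (1, 2, 3, 4, 5):
    print(f"{n} master-eligible: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Three nodes tolerate one failure and so do four, which is why an even count adds cost without adding fault tolerance.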

Q: What should I do if I can't resolve the MasterNotDiscoveredException?
A: If you've tried all troubleshooting steps and still can't resolve the issue, you may need to perform a cluster restart. Start with master-eligible nodes, ensure they form a cluster, then gradually add data nodes. If problems persist, consult Elasticsearch support or community forums for advanced assistance.
