The gateway.recover_after_nodes setting in Elasticsearch controls the minimum number of nodes that must be available before the cluster recovery process can start. This setting is crucial for ensuring cluster stability during startup and recovery scenarios.
Description
- Default value: 0 (disabled)
- Possible values: Any non-negative integer
- Recommendation: Set to a value greater than half of your total expected nodes
The gateway.recover_after_nodes setting works in conjunction with other gateway recovery settings to manage the cluster recovery process. When set to a value greater than 0, Elasticsearch will wait for at least this many nodes to be present in the cluster before initiating the recovery of the cluster state and data.
This setting is particularly useful in preventing a "split-brain" scenario where parts of the cluster recover independently, potentially leading to data inconsistencies.
This setting was deprecated in Elasticsearch 7.7.0 and removed in Elasticsearch 8.0.0.
Example
To set gateway.recover_after_nodes to 3, add the following line to elasticsearch.yml on each node. This is a static node setting, so it cannot be applied through the cluster settings API:
gateway.recover_after_nodes: 3
In this example, the cluster will wait for at least 3 nodes to be present before starting the recovery process. This can be beneficial in a 5-node cluster to ensure that a majority of nodes are available before proceeding with recovery.
Common Issues and Misuses
- Setting the value too high can prevent the cluster from recovering if the specified number of nodes cannot be reached.
- Setting the value too low (or leaving it at 0) may lead to premature recovery and potential split-brain scenarios in case of network partitions.
Do's and Don'ts
- Do set this value to at least (n/2) + 1, with n/2 rounded down, where n is the total number of nodes in your cluster.
- Don't set this value higher than the total number of nodes in your cluster.
- Do consider using this setting in conjunction with gateway.expected_nodes and gateway.recover_after_time for more granular control over the recovery process.
- Don't rely solely on this setting for cluster stability; ensure proper network configurations and other relevant settings are in place.
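The first two rules above can be checked mechanically. The following is a minimal Python sketch, not part of Elasticsearch: the helper names majority and check_recover_after_nodes are hypothetical, and it assumes n counts every node you expect to join the cluster.

```python
def majority(total_nodes):
    """Smallest node count that is more than half of total_nodes."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    return total_nodes // 2 + 1


def check_recover_after_nodes(value, total_nodes):
    """Return a list of warnings for a proposed gateway.recover_after_nodes value."""
    warnings = []
    if value > total_nodes:
        # The threshold can never be reached, so recovery would never start.
        warnings.append("value exceeds the total number of nodes")
    elif value < majority(total_nodes):
        # Recovery could begin with only a minority of nodes present.
        warnings.append("value is below a majority of nodes")
    return warnings


if __name__ == "__main__":
    # A 5-node cluster with the value used in the example above.
    print(majority(5))
    print(check_recover_after_nodes(3, 5))
```

For a 5-node cluster the helper suggests 3, which matches the value used in the example earlier in this guide.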
Frequently Asked Questions
Q: How does gateway.recover_after_nodes differ from discovery.zen.minimum_master_nodes?
A: While both settings relate to cluster stability, gateway.recover_after_nodes controls the recovery process during cluster startup, whereas discovery.zen.minimum_master_nodes (deprecated in newer versions) was used to prevent split-brain scenarios during normal operation.
Q: Can I change gateway.recover_after_nodes dynamically?
A: No. gateway.recover_after_nodes is a static setting: configure it in elasticsearch.yml on each node and restart the node for the change to apply. The value only takes effect during a full cluster restart.
Q: What happens if the number of available nodes never reaches gateway.recover_after_nodes?
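A: The cluster does not start the recovery process and stays in a waiting state until the required number of nodes join. gateway.recover_after_nodes is a hard minimum: gateway.recover_after_time (if set) only limits how long the cluster waits once that minimum is met, and does not override it.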
Q: Should I use gateway.recover_after_nodes in a single-node cluster?
A: In a single-node cluster, it's generally not necessary to set this value as it's designed for multi-node clusters. Leaving it at the default (0) is appropriate for single-node setups.
Q: How does gateway.recover_after_nodes interact with other recovery settings?
A: This setting works in conjunction with gateway.expected_nodes and gateway.recover_after_time. Recovery starts immediately once the gateway.expected_nodes count is met. If that count is not reached, recovery starts after gateway.recover_after_time has elapsed, but only if at least gateway.recover_after_nodes nodes are present.
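The interaction between the three settings can be sketched as a small decision function. This is a simplified Python model of the behaviour described in this answer, not Elasticsearch source code: the names Decision and recovery_decision are hypothetical, a value of 0 is treated as "not set", and the sketch assumes a recover_after_time timeout is in effect whenever either node count is configured.

```python
from enum import Enum


class Decision(Enum):
    WAIT_FOR_NODES = "wait_for_nodes"      # below the recover_after_nodes minimum
    WAIT_FOR_TIMEOUT = "wait_for_timeout"  # minimum met, waiting for expected nodes
    RECOVER = "recover"                    # recovery may start


def recovery_decision(nodes_present, recover_after_nodes=0,
                      expected_nodes=0, timeout_elapsed=False):
    """Decide whether recovery may start (illustrative model only)."""
    # Nothing configured: there is no gate, so recovery starts right away.
    if recover_after_nodes <= 0 and expected_nodes <= 0:
        return Decision.RECOVER
    # Hard minimum: the timeout never overrides this check.
    if nodes_present < recover_after_nodes:
        return Decision.WAIT_FOR_NODES
    # All expected nodes have joined: recover immediately.
    if expected_nodes > 0 and nodes_present >= expected_nodes:
        return Decision.RECOVER
    # Minimum met but not all expected nodes: wait out recover_after_time.
    return Decision.RECOVER if timeout_elapsed else Decision.WAIT_FOR_TIMEOUT


if __name__ == "__main__":
    # 5-node cluster, recover_after_nodes=3, expected_nodes=5.
    for present, elapsed in [(2, True), (3, False), (3, True), (5, False)]:
        print(present, elapsed, recovery_decision(present, 3, 5, elapsed).value)
```

The ordering of the checks is the point of the sketch: the minimum is tested first, so an elapsed timeout with too few nodes still results in waiting.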