The cluster.persistent_tasks.allocation.recheck_interval setting controls how often the master node rechecks whether persistent tasks need to be assigned to nodes. The master node already performs this check whenever the cluster state changes significantly. The periodic recheck exists to catch factors, such as memory usage on a node, that affect whether a task can be assigned but do not themselves change the cluster state.
Description
- Default value: 30 seconds (30s)
- Possible values: Time value of at least 10 seconds (e.g., 30s, 1m, 5m)
- Recommendation: The default value is suitable for most clusters. However, you may want to increase this interval in large clusters with many persistent tasks to reduce the overhead of frequent rechecks on the master node.
This setting matters most for persistent tasks such as machine learning jobs, data frame analytics jobs, and transforms. By adjusting this interval, you can balance how quickly the master node reacts to conditions that do not change the cluster state against the computational overhead of frequent rechecks.
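The time values mentioned above (30s, 1m, 5m) can be compared more easily once converted to seconds. The following is a minimal sketch, not part of Elasticsearch or its clients: the helper name time_value_to_seconds is hypothetical, and it assumes whole-number values with the standard Elasticsearch time units (d, h, m, s, ms, micros, nanos). It ignores special forms such as unit-less 0 or -1.

```python
import re

# Elasticsearch time units mapped to seconds.
_UNIT_SECONDS = {
    "d": 86400.0,
    "h": 3600.0,
    "m": 60.0,
    "s": 1.0,
    "ms": 1e-3,
    "micros": 1e-6,
    "nanos": 1e-9,
}

# Whole number followed by a unit; longer units are listed before "m" and "s".
_TIME_VALUE = re.compile(r"^(\d+)(d|h|ms|micros|nanos|m|s)$")


def time_value_to_seconds(value: str) -> float:
    """Convert a time value such as '30s' or '1m' to seconds (hypothetical helper)."""
    match = _TIME_VALUE.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid time value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For example, time_value_to_seconds("1m") and time_value_to_seconds("60s") both evaluate to 60.0, which makes it straightforward to compare a proposed interval against the 30-second default.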
Example
To change the recheck interval to 1 minute using the cluster settings API:
PUT _cluster/settings
{
  "persistent": {
    "cluster.persistent_tasks.allocation.recheck_interval": "1m"
  }
}
You might want to increase this interval in a large cluster with many persistent tasks to reduce the load on the master node. However, be aware that a longer interval can delay the assignment of tasks that are waiting on conditions, such as available memory, that do not trigger a cluster state change.
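If the request body is generated by a script rather than typed by hand, a small guard can catch values that Elasticsearch would reject. This is a sketch under stated assumptions: build_recheck_interval_body is a hypothetical helper, it only builds the JSON body (sending it with your HTTP client of choice is left out), and the 10-second floor reflects the minimum documented in the Elasticsearch reference for this setting.

```python
import json

SETTING = "cluster.persistent_tasks.allocation.recheck_interval"
MINIMUM_SECONDS = 10  # assumed floor, per the Elasticsearch reference for this setting


def build_recheck_interval_body(seconds: int) -> str:
    """Return a JSON body for PUT _cluster/settings (hypothetical helper)."""
    if seconds < MINIMUM_SECONDS:
        raise ValueError(
            f"interval must be at least {MINIMUM_SECONDS}s, got {seconds}s"
        )
    # The interval is written in seconds, e.g. 60 becomes "60s" (same as "1m").
    return json.dumps({"persistent": {SETTING: f"{seconds}s"}}, indent=2)


print(build_recheck_interval_body(60))
```

Writing the value in seconds keeps the helper simple; "60s" and "1m" describe the same interval.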
Common Issues and Misuses
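- Setting the interval too low can increase the load on the master node, especially in large clusters with many persistent tasks.
- Setting the interval too high can delay the assignment of tasks that are waiting on conditions not reflected in the cluster state, such as available memory on a node, potentially impacting the execution of important operations. Node joins and departures change the cluster state, so they trigger an assignment check regardless of this interval.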
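The trade-off between these two issues can be made concrete with some simple arithmetic. The sketch below uses a deliberately simplified model, and the function name recheck_tradeoff is hypothetical: it assumes checks fire exactly every interval, that a relevant change can occur at any moment with equal likelihood, and it ignores the additional checks triggered by cluster state changes.

```python
def recheck_tradeoff(interval_seconds: float) -> dict:
    """Estimate check frequency and detection delay for a periodic recheck.

    Simplified model: checks fire exactly every `interval_seconds`, and a
    change may happen at any time within that window.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        "checks_per_hour": 3600 / interval_seconds,
        # A change right after a check waits a full interval to be noticed.
        "worst_case_delay_seconds": interval_seconds,
        # Under the uniform-arrival assumption, the mean wait is half the interval.
        "average_delay_seconds": interval_seconds / 2,
    }


for seconds in (10, 30, 300):
    print(seconds, recheck_tradeoff(seconds))
```

Under this model, moving from 30 seconds to 5 minutes cuts the periodic checks from 120 to 12 per hour while raising the worst-case wait from 30 to 300 seconds.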
Do's and Don'ts
Do's:
- Monitor cluster performance and adjust this setting if you notice high load on the master node due to frequent task allocation checks.
- Consider increasing the interval in very large clusters or environments with a high number of persistent tasks.
- Align this setting with your cluster's stability and the frequency of node changes.
Don'ts:
- Don't try to set this value below 10 seconds. That is the minimum Elasticsearch permits, and values close to it may unnecessarily increase the load on your master node.
- Avoid setting extremely high values (e.g., hours), as they may significantly delay the assignment of tasks that are waiting on conditions outside the cluster state.
- Don't change this setting without monitoring its impact on cluster performance and task allocation efficiency.
Frequently Asked Questions
Q: How does this setting affect the responsiveness of task allocation in my cluster? 
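A: A lower value lets the master node react sooner to conditions that do not change the cluster state, such as memory being freed on a node, but it also increases the computational overhead. A higher value reduces this overhead but may delay the assignment of tasks waiting on such conditions. Node joins and departures change the cluster state and trigger a check regardless of this interval.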
Q: Can changing this setting impact ongoing persistent tasks? 
A: Changing this setting doesn't directly impact tasks that are already assigned and running. It only affects how quickly unassigned tasks are picked up once a blocking condition, such as insufficient memory on eligible nodes, clears without a cluster state change.
Q: Is there a recommended value for very large clusters? 
A: For very large clusters (100+ nodes) with many persistent tasks, you might consider increasing this to 1-5 minutes (1m-5m) to reduce overhead, but always monitor the impact of such changes.
Q: How does this setting interact with other task allocation settings? 
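A: This setting works in conjunction with other allocation settings but specifically controls the frequency of rechecks. The related cluster.persistent_tasks.allocation.enable setting controls whether persistent tasks can be assigned at all. Note that cluster.routing.allocation.enable governs shard allocation, not persistent task assignment.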
Q: Can this setting be changed dynamically? 
A: Yes, this setting can be changed dynamically using the cluster settings API without requiring a cluster restart.
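Because the setting is dynamic, it can also be reset to its default through the same API by assigning it null. The sketch below only builds the request body; settings_body is a hypothetical helper, and sending the body to PUT _cluster/settings is left to whichever HTTP client you use.

```python
import json
from typing import Optional

SETTING = "cluster.persistent_tasks.allocation.recheck_interval"


def settings_body(value: Optional[str]) -> str:
    """Build a PUT _cluster/settings body (hypothetical helper).

    Passing None serializes to JSON null, which the cluster settings API
    treats as "remove this override and fall back to the default".
    """
    return json.dumps({"persistent": {SETTING: value}})


print(settings_body("2m"))   # override the interval
print(settings_body(None))   # reset to the default
```

Resetting with null is usually preferable to writing the default value explicitly, since it leaves no stale override behind in the persistent settings.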
