The cluster.routing.rebalance.enable setting controls whether Elasticsearch is allowed to rebalance already-assigned shards across data nodes. It defaults to all and accepts four values: all, primaries, replicas, none. Rebalancing differs from allocation: allocation places unassigned shards onto nodes, while rebalance moves shards that are already assigned to even out load. Toggling rebalance to none is the right move during heavy bulk indexing, large reindex operations, or planned maintenance, where shard movement would compete with the workload for I/O.
Definition
cluster.routing.rebalance.enable is a dynamic cluster-level setting that gates the rebalancer's decisions. It does not affect `cluster.routing.allocation.enable`, which controls the placement of unassigned shards. The rebalancer is what tries to keep the shard count and disk usage roughly even across nodes - turning it off freezes the cluster's current distribution.
Default and Allowed Values
| Value | Behaviour |
|---|---|
| all (default) | Rebalance both primary and replica shards |
| primaries | Rebalance only primary shards |
| replicas | Rebalance only replica shards |
| none | No rebalancing |
The setting is dynamic and takes effect at the next allocator pass (typically within seconds). Rebalancing operations already in flight will complete before the new value applies fully.
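The four values can be read as a simple eligibility rule per shard type. The sketch below is only an illustration of that rule; `may_rebalance` and `ALLOWED_VALUES` are hypothetical names, not part of any Elasticsearch client.

```python
# Illustrative model of cluster.routing.rebalance.enable semantics.
# may_rebalance is a hypothetical helper, not an Elasticsearch API.

ALLOWED_VALUES = ("all", "primaries", "replicas", "none")


def may_rebalance(setting: str, is_primary: bool) -> bool:
    """Return True if a shard of the given type is eligible for rebalancing."""
    if setting not in ALLOWED_VALUES:
        raise ValueError(f"unsupported value: {setting!r}")
    if setting == "all":
        return True
    if setting == "primaries":
        return is_primary
    if setting == "replicas":
        return not is_primary
    return False  # "none": no shard is eligible
```

For example, `may_rebalance("replicas", is_primary=True)` is false: under `replicas`, primaries stay where they are.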
How to Change It
Through the cluster settings API:
# Pause all rebalancing during a large reindex job
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "none"
  }
}

# Resume normal rebalancing
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": null
  }
}
Setting the value to null reverts to the default. Inspect with:
GET /_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.rebalance.enable
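If you script this check, the response has to be resolved across its `transient`, `persistent`, and `defaults` sections. The following is a minimal sketch, assuming the nested (non-flat) response shape and that transient overrides take precedence over persistent ones; `effective_rebalance_setting` is a hypothetical helper name.

```python
from typing import Any, Optional, Tuple

SETTING_PATH = ("cluster", "routing", "rebalance", "enable")


def _lookup(section: Any, path: Tuple[str, ...]) -> Optional[Any]:
    """Walk a nested dict along path; return None if any key is missing."""
    node = section
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def effective_rebalance_setting(response: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (value, source) from a GET /_cluster/settings?include_defaults=true body.

    Sections are checked in precedence order: transient, persistent, defaults.
    """
    for source in ("transient", "persistent", "defaults"):
        value = _lookup(response.get(source, {}), SETTING_PATH)
        if value is not None:
            return value, source
    return None, None
```

A result of `("none", "persistent")` means an explicit override is in place; `("all", "defaults")` means no override exists.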
You can pair this with related throttles, such as cluster.routing.allocation.cluster_concurrent_rebalance (default 2), which caps how many concurrent rebalance operations run cluster-wide.
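As an alternative to switching rebalancing off entirely, the throttle can be lowered while rebalancing stays enabled. The sketch below only builds the request body for PUT /_cluster/settings; `throttled_rebalance_body` is a hypothetical helper, and restricting the throttle to positive integers is a conservative choice of this example, not a statement about what Elasticsearch accepts.

```python
import json

REBALANCE_ENABLE = "cluster.routing.rebalance.enable"
CONCURRENT_REBALANCE = "cluster.routing.allocation.cluster_concurrent_rebalance"


def throttled_rebalance_body(concurrent: int = 1) -> dict:
    """Build a persistent-settings body that keeps rebalancing on but throttled."""
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(concurrent, bool) or not isinstance(concurrent, int) or concurrent < 1:
        raise ValueError("concurrent must be a positive integer in this sketch")
    return {
        "persistent": {
            REBALANCE_ENABLE: "all",
            CONCURRENT_REBALANCE: concurrent,
        }
    }


# Print the JSON body that would be sent with PUT /_cluster/settings.
print(json.dumps(throttled_rebalance_body(1), indent=2))
```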
When to Disable Rebalancing
Three common scenarios:
- Heavy bulk ingest or reindex. Rebalancing during a big write workload competes for disk and network. Disabling it for the duration of the job and re-enabling afterwards lets the cluster catch up on rebalancing in the quieter window.
- Planned cluster expansion in stages. When adding multiple nodes one at a time, disabling rebalance until all are joined avoids the cluster moving shards onto the first new node only to move them again when the second arrives.
- Maintenance windows. During a node-by-node restart, the rebalancer may try to redistribute shards as nodes come and go. Combine with cluster.routing.allocation.enable: none to freeze the cluster.
Re-enable as soon as the window closes. Leaving rebalancing off long term accumulates imbalance: as indices are created and deleted, nodes drift apart in shard count and disk usage.
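One way to make sure the setting is re-enabled even when the job fails is to wrap the pause in a context manager. This is a sketch under stated assumptions: `rebalance_paused` and `http_put_settings` are hypothetical helpers, and the HTTP helper assumes an unsecured cluster at http://localhost:9200, which is unlikely to match a production setup.

```python
import json
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator

SETTING = "cluster.routing.rebalance.enable"


@contextmanager
def rebalance_paused(put_settings: Callable[[dict], None]) -> Iterator[None]:
    """Set rebalance to none for the duration of the block, then reset it."""
    put_settings({"persistent": {SETTING: "none"}})
    try:
        yield
    finally:
        # None serializes to JSON null, which removes the override
        # and reverts the setting to its default.
        put_settings({"persistent": {SETTING: None}})


def http_put_settings(body: dict, base_url: str = "http://localhost:9200") -> None:
    """Send body to PUT /_cluster/settings (assumed: no auth, no TLS)."""
    request = urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()
```

Usage would look like `with rebalance_paused(http_put_settings): run_reindex()`, where `run_reindex` stands in for your own job. Note that the reset goes back to the default rather than restoring any non-default value that was set before the pause.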
Operational Impact
The rebalancer's job is to maintain even shard distribution. When disabled:
- New indices and rollovers still place shards (allocation, not rebalance).
- Existing shards stay on their current node, even if a hot spot develops.
- Removing a node still triggers replica rebuilds (those are allocation, not rebalance).
- The _cat/allocation view can show wide divergence in disk usage between nodes.
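The divergence mentioned in the last point can be quantified from the JSON form of that view (GET /_cat/allocation?format=json). The sketch below assumes rows carry string-valued `shards`, `disk.percent`, and `node` fields and that unassigned shards appear as a row without disk figures; `allocation_spread` is a hypothetical helper and the sample rows in the usage note are made up.

```python
from typing import Dict, List, Optional


def allocation_spread(rows: List[Dict[str, Optional[str]]]) -> dict:
    """Summarize how far nodes have drifted apart in shard count and disk usage."""
    # Skip the UNASSIGNED row and any row without disk figures.
    nodes = [
        row for row in rows
        if row.get("node") != "UNASSIGNED" and row.get("disk.percent") is not None
    ]
    if not nodes:
        raise ValueError("no data-node rows found")
    shard_counts = [int(row["shards"]) for row in nodes]
    disk_percents = [int(row["disk.percent"]) for row in nodes]
    fullest = max(nodes, key=lambda row: int(row["disk.percent"]))
    return {
        "shard_spread": max(shard_counts) - min(shard_counts),
        "disk_percent_spread": max(disk_percents) - min(disk_percents),
        "fullest_node": fullest["node"],
    }
```

With rows for three nodes at 71, 42, and 55 percent disk usage, `disk_percent_spread` would be 29. What spread counts as a problem depends on your cluster; the function only reports it.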
The rebalancer is conservative by design: it only moves a shard when the move improves balance more than it costs. Disabling it is rarely needed for short maintenance, but the cost of moving 100 GB of shard data during a 6-hour bulk reindex can be significant.
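To put the 100 GB figure in perspective, a rough transfer-time estimate helps. This is back-of-the-envelope arithmetic, assuming a single relocation stream capped at 40 MB/s (an assumed throttle for illustration; check the recovery rate limit on your own cluster) and ignoring compression, retries, and contention with the indexing workload.

```python
def relocation_seconds(shard_gb: float, throttle_mb_per_sec: float) -> float:
    """Estimate seconds needed to copy shard_gb of data at a fixed throttle."""
    if throttle_mb_per_sec <= 0:
        raise ValueError("throttle must be positive")
    return shard_gb * 1024 / throttle_mb_per_sec


seconds = relocation_seconds(100, 40)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes")  # 2560 s, about 43 minutes
```

Under those assumptions the move occupies disk and network on two nodes for roughly 43 minutes of the 6-hour job, which is the contention that pausing rebalancing avoids.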
Common Mistakes
- Leaving rebalance off for weeks. Cluster drift accumulates and nodes diverge in disk usage and shard count.
- Disabling rebalance to mask shard hotspots. If the rebalancer keeps moving shards off the same node, investigate why. The cause is typically a shard sizing or routing problem.
- Confusing with allocation. Disabling rebalance does not prevent rebuilds when a node fails; that is allocation.
- Forgetting that replicas does not include primaries. Primary shards are rebalanced only under all or primaries; setting replicas explicitly excludes primaries from rebalancing.
Prevent Rebalance Drift with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch that tracks cluster.routing.rebalance.enable and per-node shard distribution together, flagging:
- Drift between the operating default all and any other value left in place after a bulk reindex, node addition, or maintenance window
- Settings that are unsafe for your workload (e.g. none set weeks ago and forgotten, while nodes diverge in disk usage; replicas or primaries set without a recovery scenario justifying it)
- The downstream operational impact: shard count and disk usage divergence across data nodes shown in _cat/allocation, hot-shard patterns hidden by disabled rebalance
When a node's disk usage starts pulling away from its peers because rebalance is off, Pulse names the cluster and the override before the imbalance becomes a capacity incident.
Frequently Asked Questions
Q: What is the default value of cluster.routing.rebalance.enable?
A: The default is all. The other accepted values are primaries, replicas, and none. The setting is dynamic and applies cluster-wide.
Q: How is cluster.routing.rebalance.enable different from cluster.routing.allocation.enable?
A: Allocation governs the placement of unassigned shards onto nodes (new indices, recoveries, node joins). Rebalance governs the relocation of already-assigned shards to even out load. Disabling rebalance does not prevent the cluster from rebuilding replicas after a node failure.
Q: When should I disable shard rebalancing?
A: During heavy bulk ingest or reindex jobs, while adding multiple nodes in stages, or during planned maintenance windows. The goal is to avoid shard movement competing with the workload. Always re-enable afterwards.
Q: Does disabling rebalance affect new shard allocation?
A: No. New shards (from PUT /index, rollovers, recoveries) are governed by cluster.routing.allocation.enable, not rebalance. Disabling rebalance only stops the movement of already-assigned shards.
Q: How quickly does a change to cluster.routing.rebalance.enable take effect?
A: The new value applies at the next allocator pass, typically within a few seconds. Rebalance operations already underway will complete before the new setting fully takes effect.
Q: Can I rebalance only primary or only replica shards?
A: Yes. Set the value to primaries to rebalance only primary shards, or replicas to rebalance only replica shards. This is rarely the right choice in production, but useful in narrow recovery scenarios.
Q: What's the best tool to catch disabled rebalance and node-level shard drift?
A: Pulse is built for this. It is an AI DBA for Elasticsearch and OpenSearch that tracks cluster.routing.rebalance.enable alongside per-node shard and disk distribution, and flags clusters where the setting was left at none after a maintenance window while nodes are diverging in capacity.