Elasticsearch Reindex Performance Tuning and Throttling

The _reindex API copies documents from a source index to a destination index. Internally it works as a scroll query on the source combined with bulk indexing into the destination. Each batch is read via scroll, then written via a _bulk request. This means reindex performance is bounded by both the read speed of the source and the write throughput of the destination. Understanding the internal mechanics helps identify where bottlenecks occur and which tuning parameters to adjust.

By default, _reindex uses batches of 1000 documents. Each scroll request fetches a batch, which is then sent as a single bulk request to the destination. The operation runs synchronously, meaning the API call blocks until all documents are copied. For large indices, this can take hours or days, making async execution and progress monitoring a necessity.

Throttling with requests_per_second and _rethrottle

The requests_per_second parameter controls how fast reindex proceeds by injecting wait time between batches. Setting it to a value like 500 means Elasticsearch targets 500 document writes per second. The throttle works by padding each batch: the wait time is the batch size divided by requests_per_second, minus the time the batch actually took to write. For example, with a batch size of 1000 and requests_per_second=500, each batch is padded out to 2 seconds. Set it to -1 (the default) to disable throttling entirely.

POST _reindex?requests_per_second=500
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" }
}

Throttling can be adjusted on a running reindex operation using the _rethrottle API. This is useful when you start conservatively and want to speed up after confirming the cluster handles the load. Rethrottling that increases the rate takes effect immediately. Rethrottling that decreases the rate takes effect after the current batch completes.

POST _reindex/TASK_ID/_rethrottle?requests_per_second=1000

When using sliced reindex (covered below), rethrottling the parent task distributes the new rate proportionally across all sub-tasks. You can also rethrottle individual sub-tasks by their own task IDs.

Sliced Scrolling for Parallelism

By default, reindex uses a single scroll to read the source index sequentially. The slices parameter splits the operation into multiple parallel scroll queries, each processing a different subset of the data. Each slice runs independently, reading and writing its own batch of documents.

POST _reindex?slices=5
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" }
}

Setting slices=auto lets Elasticsearch choose the number of slices, typically one per shard in the source index up to a limit. This is the simplest way to parallelize and works well for most cases. Manually setting slices higher than the shard count rarely pays off: each shard's data must be partitioned further, and the distribution of documents across slices becomes uneven.
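The auto setting is passed as a query parameter, the same way as an explicit count (index names here are illustrative):

POST _reindex?slices=auto
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" }
}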

Each slice is a separate task visible in the tasks API. The slices don't share state, so they can run on different nodes. For very large reindex operations across many shards, slices: auto can cut total time dramatically compared to a single-threaded scroll.

One caveat: setting slices much higher than the number of shards creates overhead without proportional benefit. There is a known behavior where very large scroll searches with slices exceeding the shard count can actually slow down rather than speed up the operation.

Batch Size and Pipeline Transforms

The size parameter within the source object controls the scroll batch size - the number of documents fetched per scroll request. The default is 1000. Increasing it reduces the number of round trips but creates larger bulk requests that consume more memory on the coordinating node.

POST _reindex
{
  "source": {
    "index": "old-index",
    "size": 5000
  },
  "dest": {
    "index": "new-index",
    "pipeline": "my-transform-pipeline"
  }
}

The pipeline parameter in the dest object specifies an ingest pipeline to apply to each document during reindexing. This lets you transform data as it moves - renaming fields, converting types, enriching with lookups, or dropping documents that match certain conditions. The pipeline runs on the ingest node handling each bulk request. Heavy pipeline processing (such as grok parsing or enrich lookups) becomes a bottleneck, so factor this into your performance expectations.
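As a sketch, a pipeline like the my-transform-pipeline referenced above might rename a field and drop documents flagged for deletion. The processor details below are illustrative, not from the original:

PUT _ingest/pipeline/my-transform-pipeline
{
  "processors": [
    { "rename": { "field": "old_field", "target_field": "new_field", "ignore_missing": true } },
    { "drop": { "if": "ctx.status == 'deleted'" } }
  ]
}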

The source can also include a query to reindex only a subset of documents. Overly broad queries that match most of the index gain nothing over reindexing without a query, but selective queries can drastically reduce the work.
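For example, to copy only recent documents (the timestamp field and range here are illustrative):

POST _reindex
{
  "source": {
    "index": "old-index",
    "query": {
      "range": { "@timestamp": { "gte": "now-30d" } }
    }
  },
  "dest": { "index": "new-index" }
}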

Remote Reindex

Reindexing from a different cluster requires configuring reindex.remote.whitelist in elasticsearch.yml on the destination cluster. This setting accepts a comma-separated list of host:port entries. A restart is required after changing it.

reindex.remote.whitelist: "source-cluster:9200"

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://source-cluster:9200",
      "username": "user",
      "password": "pass"
    },
    "index": "remote-index"
  },
  "dest": { "index": "local-index" }
}

Remote reindex reads from the source cluster via HTTP, so network bandwidth and latency between clusters are limiting factors. The source cluster only needs to support scroll queries; the destination cluster handles all indexing load. Note that reindexing from a remote cluster does not support sliced scrolling, whether manual or automatic. Responses from the remote cluster are also buffered on the destination's heap (100mb by default), so reduce the batch size if the source documents are large.
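If the remote connection is slow, the per-request timeouts can be raised inside the remote block; socket_timeout and connect_timeout both default to 30 seconds. The values below, and the reduced batch size for large documents, are illustrative:

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://source-cluster:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "remote-index",
    "size": 200
  },
  "dest": { "index": "local-index" }
}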

Monitoring, Cancellation, and Common Bottlenecks

For long-running reindex operations, set wait_for_completion=false to run the operation asynchronously. The API returns a task ID immediately instead of blocking.

POST _reindex?wait_for_completion=false
{
  "source": { "index": "old-index" },
  "dest": { "index": "new-index" }
}

Monitor progress with the tasks API using the returned task ID:

GET _tasks/NODE_ID:TASK_NUMBER

The response includes the total number of documents, how many have been created, updated, and deleted, plus timing information. For sliced reindex, the parent task shows aggregate progress while each sub-task reports its own slice.
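If the task ID was lost, running reindex tasks can be listed by filtering the tasks API on the action name:

GET _tasks?detailed=true&actions=*reindex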

Cancel a running reindex with:

POST _tasks/NODE_ID:TASK_NUMBER/_cancel

Cancellation is not instantaneous. The current batch completes before the task stops. With sliced reindex, cancelling the parent task cancels all sub-tasks.

Common bottlenecks to watch for: the source query being too broad when you only need a subset of documents, the destination mapping triggering heavy analysis (complex analyzers on text fields slow down indexing significantly), and the default refresh interval on the destination index causing frequent segment creation. Setting "refresh_interval": "-1" on the destination index during reindex and restoring it afterward avoids this overhead. Dropping "number_of_replicas" to 0 during reindex and restoring it after completion also helps, since replicas duplicate all indexing work.
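A typical sequence, assuming the destination index is new-index (the restore values shown are the common defaults; substitute your own original settings):

PUT new-index/_settings
{
  "index": { "refresh_interval": "-1", "number_of_replicas": 0 }
}

# ... run the reindex ...

PUT new-index/_settings
{
  "index": { "refresh_interval": "1s", "number_of_replicas": 1 }
}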
