Elasticsearch delete_by_query: How to Delete Documents by Query

The Elasticsearch _delete_by_query API removes every document in an index that matches a query, without touching the index itself or its mapping. It runs as a scrolled scan plus a series of bulk deletes under the hood, supports parallelism via slices=auto, and can be run asynchronously with the Task API. Use it to clean up subsets of documents while keeping the index online.

When to Use delete_by_query (vs Alternatives)

| Goal | Better choice |
| --- | --- |
| Remove all documents in an index | `DELETE /index` - orders of magnitude faster, reclaims disk immediately |
| Drop time-series data older than N days | ILM delete phase on rolled-over indices |
| Remove a known list of document IDs | _bulk with { "delete": { ... } } lines - cheaper than running a query |
| Remove documents matching a filter / aggregate condition | _delete_by_query (this API) |
| Remove some documents and reshape mapping | Reindex with a filter, then delete the old index |

Picking the wrong tool here is the single biggest mistake teams make: running _delete_by_query for "delete everything" wastes hours and produces tombstones that have to be merged away. If the whole index is going, delete the index.
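
For the "known list of document IDs" row, the _bulk body is newline-delimited JSON with one delete action per line. The sketch below shows one way to build that body in Python; the function name bulk_delete_body and the sample IDs are illustrative assumptions, not part of any client library.

```python
import json


def bulk_delete_body(index, doc_ids):
    """Build an NDJSON body for POST /_bulk that deletes the given IDs.

    Hypothetical helper for illustration; it only formats the request body.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_id": doc_id}})
        for doc_id in doc_ids
    ]
    if not lines:
        return ""
    # The bulk API expects every line, including the last one, to end with "\n".
    return "\n".join(lines) + "\n"


body = bulk_delete_body("my-index", ["1", "2"])
print(body, end="")
```

Send the result to POST /_bulk with the Content-Type header set to application/x-ndjson.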

Prerequisites

  • Elasticsearch 6.x or later (the API has been stable for a long time).
  • The user or API key needs the read and delete index privileges on the target index.
  • Enough heap to run the scroll and the resulting bulk deletes concurrently. On a stressed cluster, throttle with requests_per_second.

Step-by-Step: Delete Documents by Query

  1. Confirm the query first with a count. Always run the same query against _count or _search?size=0 before deleting. There is no undo.

    GET /my-index/_count
    {
      "query": { "match": { "status": "obsolete" } }
    }
    
  2. Send the delete request.

    POST /my-index/_delete_by_query
    {
      "query": { "match": { "status": "obsolete" } }
    }
    

    The response includes deleted, version_conflicts, batches, failures, and took. A non-zero version_conflicts value indicates documents were updated between the scroll snapshot and the delete.

  3. Add conflicts=proceed if concurrent writes are expected. Without it, the operation aborts on the first version conflict. Deletes already performed before the abort are not rolled back.

    POST /my-index/_delete_by_query?conflicts=proceed
    { "query": { "term": { "status": "obsolete" } } }
    
  4. Parallelize with slices=auto for large operations. Slicing lets Elasticsearch run the scroll across multiple shards in parallel.

    POST /my-index/_delete_by_query?slices=auto&conflicts=proceed
    { "query": { "range": { "@timestamp": { "lt": "now-30d" } } } }
    

    slices=auto lets Elasticsearch choose the number of slices, using one slice per shard up to a limit, which is a reasonable default for most indices. You can set an explicit integer if you have a specific reason.

  5. Run asynchronously for long jobs. Add wait_for_completion=false and Elasticsearch returns a task ID immediately.

    POST /my-index/_delete_by_query?wait_for_completion=false&slices=auto&conflicts=proceed
    { "query": { "match": { "status": "obsolete" } } }
    

    Response: { "task": "oTUltX4IQMOUUVeiohTt8A:124" }.

  6. Track progress with the Task API.

    GET /_tasks/oTUltX4IQMOUUVeiohTt8A:124
    

    To cancel: POST /_tasks/<task-id>/_cancel.

  7. Throttle if the cluster is under pressure. Use requests_per_second to cap the rate. Set it to -1 to disable throttling (the default).

    POST /my-index/_delete_by_query?requests_per_second=500
    { "query": { ... } }
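
The steps above can be combined in a small script. The following is a minimal sketch, assuming a Python client that sends the requests itself: it only builds the request path from the parameters used in steps 3 to 7 and summarizes a Task API document. The helper names (delete_by_query_path, task_progress) are hypothetical, and the task document shape (completed, task.status.total, deleted, version_conflicts) is assumed from the Task API response for this operation.

```python
from urllib.parse import urlencode


def delete_by_query_path(index, *, conflicts="proceed", slices="auto",
                         wait_for_completion=False, requests_per_second=None):
    """Return the path and query string for a _delete_by_query request."""
    params = {
        "conflicts": conflicts,
        "slices": slices,
        # Elasticsearch expects lowercase true/false in the query string.
        "wait_for_completion": str(wait_for_completion).lower(),
    }
    if requests_per_second is not None:
        params["requests_per_second"] = requests_per_second
    return f"/{index}/_delete_by_query?{urlencode(params)}"


def task_progress(task_doc):
    """Summarize a GET /_tasks/<task-id> document (assumed shape, see above)."""
    status = task_doc["task"]["status"]
    # Documents that were deleted or skipped due to a conflict count as handled.
    handled = status["deleted"] + status["version_conflicts"]
    total = status["total"]
    return {
        "completed": task_doc.get("completed", False),
        "handled": handled,
        "total": total,
        "fraction": handled / total if total else 1.0,
    }


path = delete_by_query_path("my-index", requests_per_second=500)
print(path)

# Sample task document with made-up numbers, for illustration only.
sample = {
    "completed": False,
    "task": {"status": {"total": 7000, "deleted": 3500, "version_conflicts": 0}},
}
print(task_progress(sample))
```

Keeping the path builder separate from the HTTP call makes it easy to log the exact request before anything irreversible is sent.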
    

delete_by_query in Production: What to Watch For

The hidden cost of _delete_by_query is not the deletes themselves but the segment churn they create. Every deleted document becomes a tombstone marker, and disk is only reclaimed when the segments containing those tombstones are merged. On large indices this can mean a sustained period of elevated merge IO after the API call returns successfully. Watch indices.merges in _nodes/stats and the deleted column in _cat/indices during and after the run.
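
To put a number on the tombstone backlog, the deleted column can be compared with the live document count. The sketch below assumes rows fetched from GET /_cat/indices?format=json&h=index,docs.count,docs.deleted, where counts arrive as strings; the helper name deleted_ratio and the sample row are illustrative.

```python
def deleted_ratio(cat_row):
    """Fraction of documents in an index that are deleted but not yet merged away.

    Expects one row of _cat/indices JSON output with "docs.count" and
    "docs.deleted" (assumed to be numeric strings).
    """
    live = int(cat_row["docs.count"])
    deleted = int(cat_row["docs.deleted"])
    total = live + deleted
    return deleted / total if total else 0.0


# Made-up row for illustration.
row = {"index": "my-index", "docs.count": "750", "docs.deleted": "250"}
print(f"{row['index']}: {deleted_ratio(row):.0%} of documents are tombstones")
```

A ratio that stays high long after the job finishes suggests that background merges are not keeping up.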

Long-running _delete_by_query operations also hold a scroll context open for the duration. On clusters with a tight search.max_open_scroll_context budget, this can cause unrelated scroll requests to be rejected. Prefer running deletes during low-traffic windows, with slices=auto for parallelism and requests_per_second set to a value your cluster can comfortably absorb.
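
Choosing a requests_per_second value is easier with some arithmetic. Throttling works by padding each batch with wait time, so a rough model is: target time per batch = batch size / requests_per_second, and the wait is whatever remains after the bulk write. The sketch below encodes that model; the function names are hypothetical and the batch size of 1000 is an assumption based on the default scroll size.

```python
def batch_wait_seconds(batch_size, requests_per_second, write_seconds):
    """Estimated pause after one batch under the throttling model above."""
    if requests_per_second <= 0:
        # -1 disables throttling, so no padding is added.
        return 0.0
    target = batch_size / requests_per_second
    return max(0.0, target - write_seconds)


def min_duration_seconds(total_docs, requests_per_second):
    """Lower bound on total run time when throttling is enabled."""
    if requests_per_second <= 0:
        raise ValueError("no lower bound when throttling is disabled")
    return total_docs / requests_per_second


# A 1000-document batch at 500 docs/s targets 2 s; a 0.5 s write leaves 1.5 s of wait.
print(batch_wait_seconds(1000, 500, 0.5))
# One million matching documents at 500 docs/s need at least 2000 s.
print(min_duration_seconds(1_000_000, 500))
```

Use the lower bound to check that a throttled job still fits inside the low-traffic window you have in mind.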

Run delete_by_query Safely with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch. Before and during _delete_by_query, Pulse:

  • Verifies cluster capacity for the operation: heap headroom for the scroll plus bulk deletes, disk for tombstone-driven merges, write thread pool budget
  • Surfaces concurrent operations that could collide - active reindex, ILM force-merge, snapshot in progress, or another long-running _delete_by_query
  • Tracks the operation's progress and impact on production traffic in real time: version_conflicts rate, deleted count growth, scroll context count, search latency p95
  • Recommends throttling with requests_per_second or pausing the job if production search latency starts climbing

Start a free trial before your next bulk delete.

Common Mistakes

  1. Running _delete_by_query to empty an index. Use DELETE /index instead. It is faster, reclaims disk instantly, and avoids segment churn.
  2. Skipping conflicts=proceed on an index that receives concurrent writes. The job aborts partway through, the deletes already applied are not rolled back, and the index is left partially cleaned.
  3. Forgetting to verify the query. Run _count with the same query body first. The API does not preview matches.
  4. Setting slices too high. More than one slice per shard rarely helps and adds overhead. slices=auto is the right default.
  5. Expecting the disk to free immediately. Deleted documents become tombstones; merges reclaim space asynchronously. Force-merging with only_expunge_deletes=true is the manual lever, but it is expensive.
  6. No backup snapshot. Take a snapshot before any irreversible bulk delete.
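
Mistakes 1 and 3 can both be caught by a pre-flight check that compares the _count result for the query with the total document count of the index. The sketch below is one hypothetical way to express that decision; the function name and return strings are illustrative only.

```python
def recommend_delete_strategy(matched, total_docs):
    """Suggest a tool based on how much of the index the query matches.

    matched:    result of _count with the delete query
    total_docs: result of _count with no query (whole index)
    """
    if matched == 0:
        return "nothing to delete"
    if matched >= total_docs:
        # The query matches everything, so dropping the index is cheaper.
        return "DELETE /index"
    return "_delete_by_query"


print(recommend_delete_strategy(matched=1200, total_docs=50000))
print(recommend_delete_strategy(matched=50000, total_docs=50000))
```

The check costs two cheap _count calls and avoids hours of unnecessary tombstone creation when the query turns out to match the whole index.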

Frequently Asked Questions

Q: Can I undo a delete_by_query operation in Elasticsearch?
A: No. delete_by_query is irreversible once acknowledged. The only recovery path is restoring the index from a snapshot taken before the operation, so always snapshot before any bulk delete.

Q: How do I monitor a long-running delete_by_query?
A: Run the request with wait_for_completion=false, capture the returned task ID, and poll GET /_tasks/<task-id>. The response shows documents processed, batches, version conflicts, and elapsed time. POST /_tasks/<task-id>/_cancel aborts the job.

Q: Why is my delete_by_query slow and how do I speed it up?
A: Add slices=auto to parallelize across shards, increase the source index's refresh interval temporarily, and avoid running it against the active write index of a heavily indexed data stream. Force-merging with only_expunge_deletes=true after the run reclaims disk sooner, but it does not speed up the delete itself.

Q: What does conflicts=proceed do in delete_by_query?
A: By default, delete_by_query aborts on the first version conflict (a document updated between the scroll snapshot and the delete). Setting conflicts=proceed tells Elasticsearch to skip those documents and continue. The response still counts them under version_conflicts.

Q: Does delete_by_query free up disk space immediately?
A: No. Deleted documents become tombstone markers in the underlying Lucene segments. Disk is reclaimed when those segments are merged, which happens in the background. _forcemerge?only_expunge_deletes=true accelerates reclamation but is IO-heavy.

Q: Can delete_by_query run across multiple indices?
A: Yes. Use a comma list (POST /index-a,index-b/_delete_by_query) or an index pattern (POST /logs-2025-*/_delete_by_query). Each match is deleted from whichever index it lives in.

Q: When should I use delete_by_query vs delete index?
A: Use delete index if you want to remove every document and the index itself - it is dramatically faster and reclaims disk instantly. Use _delete_by_query only when you need to remove a subset of documents while keeping the index online.

Q: What's the best tool to run delete_by_query safely on a production cluster?
A: Pulse is purpose-built for this. It is an AI DBA for Elasticsearch and OpenSearch that pre-checks cluster capacity, surfaces conflicting operations, tracks version_conflicts, tombstone-driven merge IO, and search latency in real time, and recommends throttling via requests_per_second when delete_by_query starts impacting production traffic.
