Elasticsearch thread_pool.write.size Setting

thread_pool.write.size sets the number of threads in each node's write thread pool, which handles every indexing, update, delete, and bulk operation. When all threads are busy, new write requests queue; when the queue fills, the node rejects further writes with EsRejectedExecutionException (HTTP 429 to REST clients) and a message of the form rejected execution of ... on EsThreadPoolExecutor[...]. The write pool is a fixed pool, so the executor named in the message is EsThreadPoolExecutor rather than the queue-resizing variant. The pool size and queue depth together determine how much indexing concurrency a node tolerates.

  • Default size: Number of allocated processors (Runtime.getRuntime().availableProcessors() clamped by node.processors)
  • Default queue_size (thread_pool.write.queue_size): 10,000
  • Type: Fixed thread pool
  • Scope: Per-node, static - requires a node restart

How the Write Thread Pool Works

Each bulk or indexing request arriving at a primary shard consumes one write thread for the duration of the in-memory write, translog append, and replication coordination. Each replica copy also consumes one write thread on the node that hosts it. A 1,000-document bulk distributed across 5 primary shards on 5 nodes therefore occupies one write thread on each primary node plus one on every node hosting a replica copy of those shards.
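
The thread accounting above can be sketched in a few lines of Python. This is a toy count under stated assumptions: the node names and shard layout are hypothetical, and the function counts thread uses per node without modeling whether primary and replica work overlap in time.

```python
from collections import Counter


def write_threads_per_node(primary_nodes, replica_nodes):
    """Count the write threads one bulk uses on each node.

    primary_nodes: node name for each primary shard the bulk touches.
    replica_nodes: node name for each replica copy of those shards.
    """
    # One thread per shard-level bulk on the primary's node,
    # plus one per replica copy on the replica's node.
    return Counter(primary_nodes) + Counter(replica_nodes)


# Hypothetical layout: 5 primaries on n1..n5, one replica of each on the next node.
primaries = ["n1", "n2", "n3", "n4", "n5"]
replicas = ["n2", "n3", "n4", "n5", "n1"]
occupancy = write_threads_per_node(primaries, replicas)
total_threads = sum(occupancy.values())
```

With this layout every node serves one primary and one replica, so a single client bulk touches two write threads on each of the five nodes.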

The default pool size matches the processor count because indexing is CPU- and IO-bound and over-subscription degrades throughput. The 10,000-deep queue absorbs short bursts without rejecting clients.
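
To make the interaction between size and queue_size concrete, here is a minimal sketch of fixed-pool admission. It is a simplified model, not Elasticsearch's executor: the class name WritePoolModel and its methods are invented for illustration, and tasks are counted rather than executed.

```python
class WritePoolModel:
    """Toy model of a fixed pool: `size` worker threads plus a bounded queue."""

    def __init__(self, size, queue_size):
        self.size = size
        self.queue_size = queue_size
        self.active = 0
        self.queued = 0
        self.rejected = 0

    def submit(self):
        """Admit one write task; return 'active', 'queued', or 'rejected'."""
        if self.active < self.size:
            self.active += 1
            return "active"
        if self.queued < self.queue_size:
            self.queued += 1
            return "queued"
        self.rejected += 1  # queue is full: the caller sees a rejection
        return "rejected"

    def complete(self):
        """Finish one active task; a queued task, if any, takes the freed thread."""
        if self.active == 0:
            return
        if self.queued > 0:
            self.queued -= 1  # queued task moves onto the freed thread
        else:
            self.active -= 1


# Small numbers so the outcome is easy to follow: 4 threads, queue of 10.
pool = WritePoolModel(size=4, queue_size=10)
outcomes = [pool.submit() for _ in range(16)]
```

Sixteen simultaneous submissions fill the 4 threads, then the 10 queue slots, and the last 2 are rejected. Raising queue_size in the model turns those rejections into waiting tasks without adding any write capacity.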

Configuring thread_pool.write.size

In elasticsearch.yml:

thread_pool.write.size: 16
thread_pool.write.queue_size: 10000

Inspect current state:

GET /_cat/thread_pool/write?v&h=node_name,active,queue,rejected,size
| Column | What it means |
|---|---|
| active | Threads currently writing |
| queue | Bulk/index operations waiting |
| rejected | Cumulative write rejections since node start |
| size | Effective pool size |
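
The same columns can be read programmatically by requesting the cat API with format=json. The sketch below parses a hypothetical response body (the node names and numbers are made up) and flags saturated nodes; it assumes numeric values may arrive as strings and converts them with int().

```python
import json

# Hypothetical body from:
# GET /_cat/thread_pool/write?format=json&h=node_name,active,queue,rejected,size
sample = """[
  {"node_name": "node-1", "active": "8", "queue": "120", "rejected": "0", "size": "8"},
  {"node_name": "node-2", "active": "8", "queue": "10000", "rejected": "57", "size": "8"}
]"""


def summarize(rows):
    """Convert cat rows to numbers and mark nodes whose threads are all busy."""
    report = []
    for row in rows:
        active, size = int(row["active"]), int(row["size"])
        report.append({
            "node": row["node_name"],
            "saturated": active >= size,  # every write thread is in use
            "queue": int(row["queue"]),
            "rejected": int(row["rejected"]),
        })
    return report


report = summarize(json.loads(sample))
rejecting_nodes = [r["node"] for r in report if r["rejected"] > 0]
```

Because rejected is cumulative since node start, a non-zero value only shows that rejections happened at some point; comparing two samples taken some time apart shows whether they are still occurring.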

When to Adjust thread_pool.write.size

| Symptom | Action |
|---|---|
| Write rejections under steady load with active at max and CPU < 70% | Increase modestly (e.g. +25%) |
| Write rejections during bursts but normal steady-state load | Increase queue_size, not size |
| Rejections plus CPU at 90%+ | Don't increase the pool; scale out, reduce bulk size, or throttle clients |
| Rejections during merging or refresh storms | Tune refresh interval and merge policy first |
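
The symptom-to-action mapping can be expressed as a small decision function. This is one possible encoding, not an official rule set: the function name, parameter names, and the order in which the checks run are assumptions, and inputs outside the listed symptoms fall through to a generic answer.

```python
def recommend(rejections, steady_load, active_at_max, cpu_percent,
              merge_or_refresh_storm=False):
    """Map observed symptoms to the action suggested in the table above."""
    if not rejections:
        return "no change"
    if merge_or_refresh_storm:
        return "tune refresh interval and merge policy first"
    if cpu_percent >= 90:
        return "scale out, reduce bulk size, or throttle clients"
    if not steady_load:
        # Rejections only during bursts: absorb them with a deeper queue.
        return "increase queue_size, not size"
    if active_at_max and cpu_percent < 70:
        return "increase size modestly"
    return "investigate further"


steady = recommend(True, steady_load=True, active_at_max=True, cpu_percent=55)
bursty = recommend(True, steady_load=False, active_at_max=True, cpu_percent=50)
cpu_bound = recommend(True, steady_load=True, active_at_max=True, cpu_percent=95)
```

Checking the merge and CPU conditions first reflects the table's advice that a larger pool does not help when the node is already busy elsewhere.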

Indexing throughput in modern Elasticsearch is rarely thread-pool-limited on default settings. Most "write rejection" issues trace back to slow translog flush, merge backpressure, or undersized clusters - not pool sizing.

Common Pitfalls

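  1. Raising size far above processor count. The Lucene IndexWriter serializes much of the per-shard write path; extra threads contend for the same locks. The Elasticsearch documentation also states a maximum for this pool of 1 + the number of allocated processors, so check that limit before choosing a value.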
  2. Treating queue_size as a free parameter. A 100,000-deep queue masks the real bottleneck and lets stale documents pile up under load.
  3. Ignoring bulk request shape. Many small bulks consume more pool capacity than a few large ones. Tune client batching first.
  4. Forgetting that replica writes share the same pool. A heavy primary node also handles replicas from other nodes, so its pool effectively serves all incoming write traffic for shards it hosts.
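
For the bulk-shape pitfall, client-side batching is usually the first thing to adjust. The generator below is a minimal sketch that groups serialized documents into bulks bounded by document count and payload size; the function name and the default thresholds are illustrative assumptions, not recommended values.

```python
def batch_actions(actions, max_docs=1000, max_bytes=5_000_000):
    """Yield lists of serialized documents, capped by count and total bytes."""
    batch, size = [], 0
    for action in actions:
        n = len(action.encode("utf-8"))
        # Close the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or size + n > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(action)
        size += n
    if batch:
        yield batch


# 2,500 tiny hypothetical documents become three bulks instead of 2,500 requests.
docs = ['{"id": %d}' % i for i in range(2500)]
batches = list(batch_actions(docs, max_docs=1000))
batch_sizes = [len(b) for b in batches]
```

A single oversized document still gets its own batch rather than being dropped, so the byte cap is a target, not a hard guarantee.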

Monitoring and Root-Cause Analysis

Track _cat/thread_pool/write, _nodes/stats/thread_pool/write, indexing latency, and the merge throttle indicator (indices.merges.total_throttled_time_in_millis).
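
Since the rejection and merge-throttle counters are cumulative, the useful signal is the change between two samples. The sketch below computes per-node deltas from two node-stats responses; the sample dictionaries are hypothetical and trimmed to the two fields used, and they assume thread pool and indices stats were fetched together.

```python
def counter_deltas(before, after):
    """Per-node change in write rejections and merge throttle time between samples."""
    deltas = {}
    for node_id, node in after["nodes"].items():
        prev = before["nodes"].get(node_id)
        if prev is None:
            continue  # node joined between samples, nothing to compare against
        deltas[node_id] = {
            "rejected": node["thread_pool"]["write"]["rejected"]
            - prev["thread_pool"]["write"]["rejected"],
            "merge_throttled_ms": node["indices"]["merges"]["total_throttled_time_in_millis"]
            - prev["indices"]["merges"]["total_throttled_time_in_millis"],
        }
    return deltas


def make_sample(rejected, throttled_ms):
    """Build a trimmed, hypothetical node-stats body for one node."""
    return {"nodes": {"abc123": {
        "thread_pool": {"write": {"rejected": rejected}},
        "indices": {"merges": {"total_throttled_time_in_millis": throttled_ms}},
    }}}


deltas = counter_deltas(make_sample(40, 1200), make_sample(57, 4200))
```

A negative delta would indicate that the node restarted between samples and its counters reset; a real collector should treat that case separately.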

Prevent Write Pool Rejections with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch that tracks thread_pool.write.size, thread_pool.write.queue_size, and EsRejectedExecutionException counts across every node and ingest client, flagging:

  • Drift between intended values and what is actually applied per node
  • Settings that are unsafe for your workload (e.g. size raised above the per-node processor count where Lucene's IndexWriter will serialize anyway, or queue_size raised to 100,000 to mask a slow translog flush)
  • The downstream operational impact: which client is producing pathological bulks, which shard is back-pressuring merges, and how rejections correlate with refresh storms

When write rejections appear, Pulse names the misconfiguration or upstream cause - client batching, hot shard, merge throttle - before it cascades into ingest pipeline backlog. Low-risk changes such as increasing queue_size temporarily can be proposed for operator approval.


Frequently Asked Questions

Q: What is the default size of the Elasticsearch write thread pool?
A: The default size equals the number of allocated processors on the node. The default queue size is 10,000. Both are static settings configured in elasticsearch.yml.

Q: Why am I getting EsRejectedExecutionException on writes?
A: The write thread pool's queue is full. Common root causes: bulk indexing concurrency exceeds the node's capacity, slow merges back-pressuring writes, undersized cluster, or pathological bulk shapes (many tiny bulks). Check _cat/thread_pool/write?v for active and queue counts.

Q: Should I increase thread_pool.write.size to fix write rejections?
A: Usually not. Raising size above the processor count rarely improves throughput because Lucene serializes much of the write path. Fix client batching, reduce shard count per node, or scale out first.

Q: What's the difference between thread_pool.write.size and queue_size?
A: size is the number of threads available to execute writes. queue_size is how many additional write operations can wait when all of those threads are busy. Both control rejection behavior; raising queue_size trades higher latency for resilience to bursts.

Q: Can I change thread_pool.write.size dynamically?
A: No. Thread pool sizes are static node settings configured in elasticsearch.yml and take effect only after a node restart. Use a rolling restart to avoid impacting cluster availability.

Q: How does indexing affect the write thread pool?
A: Every index, update, delete, bulk, and replica write occupies a write thread for the duration of the operation. Long-running operations - large documents, complex ingest pipelines, slow translog flushes - hold threads longer and expose the pool to saturation.

Q: What's the best tool to prevent EsRejectedExecutionException on writes?
A: Pulse is built for this. It is an AI DBA for Elasticsearch and OpenSearch that continuously tracks write pool size, queue depth, and rejection rate, correlates rejections with the originating client, bulk shape, and merge or refresh activity, and recommends the targeted fix - tuning, client throttling, or scaling - before rejections become a production incident.
