thread_pool.write.size sets the number of threads in each node's write thread pool, which handles every indexing, update, delete, and bulk operation. When all threads are busy, new write requests queue; when the queue fills, the node rejects further writes with EsRejectedExecutionException and a message of the form rejected execution of ... on EsThreadPoolExecutor (the write pool is a fixed pool, not a queue-resizing one). Together, the pool size and queue depth determine how much indexing concurrency a node tolerates.
- Default size: Number of allocated processors (Runtime.getRuntime().availableProcessors() clamped by node.processors)
- Default queue_size (thread_pool.write.queue_size): 10,000
- Type: Fixed thread pool
- Scope: Per-node, static - requires a node restart
How the Write Thread Pool Works
Each bulk or indexing request arriving at a primary shard consumes one write thread for the duration of the in-memory write, translog append, and replication coordination. Replica shards on remote nodes also consume one write thread each on their host node. A 1000-document bulk distributed across 5 primary shards on 5 nodes occupies one write thread on each primary node plus one on each replica node.
The default pool size matches the processor count because indexing is CPU- and IO-bound and over-subscription degrades throughput. The 10,000-deep queue absorbs short bursts without rejecting clients.
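The interaction between pool size and queue depth can be sketched with simple arithmetic. This is a deliberately simplified model, not Elasticsearch's implementation: it assumes all write tasks arrive at the same instant and none finishes in the meantime. The function name `admit_writes` is hypothetical.

```python
def admit_writes(incoming: int, size: int, queue_size: int) -> dict:
    """Split simultaneous write tasks into active, queued, and rejected.

    Simplified model (assumption): every task arrives at once and no task
    completes while the others are being admitted.
    """
    if incoming < 0 or size < 1 or queue_size < 0:
        raise ValueError("incoming and queue_size must be >= 0, size >= 1")
    active = min(incoming, size)                  # tasks that get a thread
    queued = min(incoming - active, queue_size)   # tasks that wait in the queue
    rejected = incoming - active - queued         # overflow that is rejected
    return {"active": active, "queued": queued, "rejected": rejected}


if __name__ == "__main__":
    # 16 threads and the default 10,000-deep queue, hit by 10,020 tasks at once
    print(admit_writes(10_020, size=16, queue_size=10_000))
```

In this model, rejections start only once both the threads and the queue are exhausted, which is why a short burst is absorbed while sustained overload is not.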
Configuring thread_pool.write.size
In elasticsearch.yml:
thread_pool.write.size: 16
thread_pool.write.queue_size: 10000
Inspect current state:
GET /_cat/thread_pool/write?v&h=node_name,active,queue,rejected,size
| Column | What it means |
|---|---|
| active | Threads currently writing |
| queue | Bulk/index operations waiting |
| rejected | Cumulative write rejections since node start |
| size | Effective pool size |
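For scripted checks, the plain-text response of the request above can be turned into records. The following is a minimal sketch that assumes whitespace-separated columns with a header row (the `v` flag) and node names without spaces. The helper name `parse_cat_thread_pool` and the sample values are hypothetical.

```python
from typing import Dict, List

# Hypothetical sample output; the numbers are made up for illustration.
SAMPLE = """node_name active queue rejected size
node-1 16 42 7 16
node-2 3 0 0 16
"""

NUMERIC_COLUMNS = ("active", "queue", "rejected", "size")


def parse_cat_thread_pool(text: str) -> List[Dict[str, object]]:
    """Parse `_cat/thread_pool/write?v` text into one dict per node."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row: Dict[str, object] = dict(zip(header, line.split()))
        for column in NUMERIC_COLUMNS:
            if column in row:
                row[column] = int(row[column])  # counters arrive as strings
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_cat_thread_pool(SAMPLE):
        saturated = row["active"] >= row["size"]
        print(row["node_name"], "saturated" if saturated else "ok", row["rejected"])
```

Requesting `format=json` from the cat API avoids text parsing altogether; the text form is shown here only because it matches the request above.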
When to Adjust thread_pool.write.size
| Symptom | Action |
|---|---|
| Write rejections under steady load with active at max and CPU < 70% | Increase modestly (e.g. +25%) |
| Write rejections during bursts but normal steady-state load | Increase queue_size, not size |
| Rejections plus CPU at 90%+ | Don't increase the pool; scale out, reduce bulk size, or throttle clients |
| Rejections during merging or refresh storms | Tune refresh interval and merge policy first |
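The table above can be written as a small triage function. This is only a sketch of the table's rules: the function and parameter names are hypothetical, the 70% and 90% thresholds come straight from the table, and the order in which overlapping symptoms are checked is an assumption. Cases the table does not cover fall through to "investigate further".

```python
def recommend_action(
    rejecting: bool,
    active: int,
    size: int,
    cpu_percent: float,
    bursty: bool = False,
    merge_or_refresh_storm: bool = False,
) -> str:
    """Map observed symptoms to the action suggested by the table."""
    if not rejecting:
        return "no change"
    if cpu_percent >= 90:
        # CPU-bound: more threads would only add contention
        return "scale out, reduce bulk size, or throttle clients"
    if merge_or_refresh_storm:
        return "tune refresh interval and merge policy first"
    if bursty:
        return "increase queue_size, not size"
    if active >= size and cpu_percent < 70:
        return "increase size modestly"
    return "investigate further"  # not covered by the table


if __name__ == "__main__":
    print(recommend_action(True, active=16, size=16, cpu_percent=55))
```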
Indexing throughput in modern Elasticsearch is rarely thread-pool-limited on default settings. Most "write rejection" issues trace back to slow translog flush, merge backpressure, or undersized clusters - not pool sizing.
Common Pitfalls
- Raising size far above processor count. The Lucene IndexWriter serializes much of the per-shard write path; extra threads contend for the same locks.
- Treating queue_size as a free parameter. A 100,000-deep queue masks the real bottleneck and lets stale documents pile up under load.
- Ignoring bulk request shape. Many small bulks consume more pool capacity than a few large ones. Tune client batching first.
- Forgetting that replica writes share the same pool. A heavy primary node also handles replicas from other nodes, so its pool effectively serves all incoming write traffic for shards it hosts.
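On the bulk-shape pitfall, client-side batching can be as simple as grouping documents before sending them. The generator below is a minimal sketch: `batch_docs` is a hypothetical helper, and the default of 1,000 documents per bulk is an illustrative value, not a recommendation from this article.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_docs(docs: Iterable[T], max_docs: int = 1000) -> Iterator[List[T]]:
    """Yield lists of at most `max_docs` documents, preserving order."""
    if max_docs < 1:
        raise ValueError("max_docs must be >= 1")
    batch: List[T] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= max_docs:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


if __name__ == "__main__":
    sizes = [len(b) for b in batch_docs(range(2500), max_docs=1000)]
    print(sizes)  # [1000, 1000, 500]
```

Each yielded batch would become one bulk request, so 2,500 documents turn into three write-pool tasks per target shard group instead of thousands. A production client would usually also cap batches by payload size in bytes.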
Monitoring and Root-Cause Analysis
Track _cat/thread_pool/write, _nodes/stats/thread_pool/write, indexing latency, and the merge throttle indicator (indices.merges.total_throttled_time_in_millis).
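Because rejected is cumulative since node start, alerting is more useful on the change between two samples than on the raw counter. The sketch below assumes two parsed JSON responses from _nodes/stats/thread_pool/write, each shaped as nodes -> node id -> thread_pool -> write -> rejected. The function name `write_rejection_deltas` is hypothetical.

```python
def write_rejection_deltas(before: dict, after: dict) -> dict:
    """Return new write rejections per node between two stats samples."""
    deltas = {}
    previous_nodes = before.get("nodes", {})
    for node_id, node in after.get("nodes", {}).items():
        if node_id not in previous_nodes:
            continue  # node joined between samples; no baseline yet
        now = node["thread_pool"]["write"]["rejected"]
        prev = previous_nodes[node_id]["thread_pool"]["write"]["rejected"]
        # The counter resets on node restart; a drop means a fresh count.
        delta = now - prev if now >= prev else now
        deltas[node.get("name", node_id)] = delta
    return deltas


if __name__ == "__main__":
    # Hypothetical samples with made-up values
    t0 = {"nodes": {"abc": {"name": "node-1", "thread_pool": {"write": {"rejected": 7}}}}}
    t1 = {"nodes": {"abc": {"name": "node-1", "thread_pool": {"write": {"rejected": 19}}}}}
    print(write_rejection_deltas(t0, t1))  # {'node-1': 12}
```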
Prevent Write Pool Rejections with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch that tracks thread_pool.write.size, thread_pool.write.queue_size, and EsRejectedExecutionException counts across every node and ingest client, flagging:
- Drift between intended values and what is actually applied per node
- Settings that are unsafe for your workload (e.g. size raised above the per-node processor count where Lucene's IndexWriter will serialize anyway, or queue_size raised to 100,000 to mask a slow translog flush)
- The downstream operational impact: which client is producing pathological bulks, which shard is back-pressuring merges, and how rejections correlate with refresh storms
When write rejections appear, Pulse names the misconfiguration or upstream cause - client batching, hot shard, merge throttle - before it cascades into ingest pipeline backlog. Low-risk changes such as increasing queue_size temporarily can be proposed for operator approval.
Frequently Asked Questions
Q: What is the default size of the Elasticsearch write thread pool?
A: The default size equals the number of allocated processors on the node. The default queue size is 10,000. Both are static settings configured in elasticsearch.yml.
Q: Why am I getting EsRejectedExecutionException on writes?
A: The write thread pool's queue is full. Common root causes: bulk indexing concurrency exceeds the node's capacity, slow merges back-pressuring writes, undersized cluster, or pathological bulk shapes (many tiny bulks). Check _cat/thread_pool/write?v for active and queue counts.
Q: Should I increase thread_pool.write.size to fix write rejections?
A: Usually not. Raising size above the processor count rarely improves throughput because Lucene serializes much of the write path, and Elasticsearch caps the write pool at 1 + the number of allocated processors in any case. Fix client batching, reduce shard count per node, or scale out first.
Q: What's the difference between thread_pool.write.size and queue_size?
A: size is the number of threads that can execute write operations concurrently. queue_size is how many additional write operations can wait for a free thread. Both control rejection behavior; raising queue_size trades latency for resilience to bursts.
Q: Can I change thread_pool.write.size dynamically?
A: No. thread_pool.write.size is a static node-level setting: set it in elasticsearch.yml and restart the node for it to take effect. Use a rolling restart to avoid impacting cluster availability.
Q: How does indexing affect the write thread pool?
A: Every index, update, delete, bulk, and replica write occupies a write thread for the duration of the operation. Long-running operations - large documents, complex ingest pipelines, slow translog flushes - hold threads longer and expose the pool to saturation.
Q: What's the best tool to prevent EsRejectedExecutionException on writes?
A: Pulse is built for this. It is an AI DBA for Elasticsearch and OpenSearch that continuously tracks write pool size, queue depth, and rejection rate, correlates rejections with the originating client, bulk shape, and merge or refresh activity, and recommends the targeted fix - tuning, client throttling, or scaling - before rejections become a production incident.