The Elasticsearch hot threads API (GET /_nodes/hot_threads) samples each node's busiest threads over a short interval and returns their stack traces. It's the fastest way to find which operations are burning CPU - typically expensive queries, segment merges, garbage collection, regex scans, or transport-layer work. Hot threads is the standard first-look tool when CPU spikes, latency rises, or rejected operations climb.
How the Hot Threads API Works
The API takes multiple thread stack snapshots over a sampling window (default 500 ms), counts how often each thread appears at the top of CPU consumption, and returns the busiest ones with the deepest common stack trace. Each result line shows the percentage of the sampling window the thread spent on CPU and the stack frames shared across snapshots.
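To make the "frames shared across snapshots" idea concrete, here is a simplified Python model. It is a sketch, not Elasticsearch's implementation: the function name `shared_frames` and the toy frame strings are made up for illustration, and it assumes traces are compared from the outermost caller inward.

```python
def shared_frames(snapshots):
    """Return the frames common to every snapshot of one thread.

    Each snapshot is a list of frame strings ordered innermost-first,
    the way a Java stack trace is printed. Comparison starts at the
    outermost caller (the end of each list); this is an assumption of
    the sketch.
    """
    if not snapshots:
        return []
    shared = 0
    # zip stops at the shortest trace, so short snapshots are handled.
    for frames in zip(*(reversed(s) for s in snapshots)):
        if len(set(frames)) != 1:
            break
        shared += 1
    first = snapshots[0]
    return first[len(first) - shared:]


# Hypothetical snapshots of a single search thread.
snaps = [
    ["Pattern.match", "RegExp.run", "Query.search", "Thread.run"],
    ["Automaton.step", "RegExp.run", "Query.search", "Thread.run"],
    ["RegExp.run", "Query.search", "Thread.run"],
]
print(shared_frames(snaps))
```

The innermost frames differ between snapshots, but the callers beneath them stay the same, and that stable part is what tells you which operation the thread keeps returning to.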
Three modes select what to measure:
| type | What it reports |
|---|---|
| cpu (default) | Threads consuming the most CPU |
| wait | Threads spending the most time in WAITING state |
| block | Threads spending the most time in BLOCKED state (lock contention) |
Making Hot Threads Requests
Basic usage:
GET /_nodes/hot_threads
With parameters:
GET /_nodes/hot_threads?threads=10&interval=500ms&type=cpu&snapshots=10&ignore_idle_threads=true
| Parameter | Default | Description |
|---|---|---|
| threads | 3 | Number of hot threads to return per node |
| interval | 500ms | Sampling window |
| type | cpu | cpu, wait, or block |
| snapshots | 10 | Stack snapshots taken across the interval |
| ignore_idle_threads | true | Skip threads in idle state (waiting for work) |
Target a single node:
GET /_nodes/node-1/hot_threads?threads=10
The response is plain text, not JSON - it's designed to be readable by an operator.
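If you script these requests, a small helper keeps the parameters consistent. The sketch below uses only the Python standard library. It assumes an unauthenticated cluster reachable over plain HTTP (for example `http://localhost:9200`); the helper names `hot_threads_url` and `fetch_hot_threads` are illustrative and not part of any Elasticsearch client.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, node=None, **params):
    """Build a hot threads URL, optionally scoped to one node."""
    path = f"/_nodes/{node}/hot_threads" if node else "/_nodes/hot_threads"
    # Render Python booleans as lowercase true/false in the query string.
    query = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    url = base_url.rstrip("/") + path
    return f"{url}?{urlencode(query)}" if query else url


def fetch_hot_threads(base_url, node=None, timeout=30, **params):
    """Return the plain-text report. Assumes no authentication or TLS setup."""
    with urlopen(hot_threads_url(base_url, node, **params), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(hot_threads_url("http://localhost:9200", node="node-1", threads=10))
```

Keeping URL construction separate from the network call makes the parameter handling easy to check without a running cluster.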
Reading the Output
A representative sample:
::: {node-1}{abc123}{10.0.0.5}{10.0.0.5:9300}
Hot threads at 2026-05-17T10:30:00Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
33.3% (166.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][search][T#1]'
5/10 snapshots sharing following 15 elements
java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)
org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:509)
...
Read it top to bottom:
- Header: node name, sampling parameters
- Percent CPU: portion of the sampling window the thread held the CPU
- Thread name: includes the pool (search, write, generic, management, transport_worker, refresh, flush, merge)
- Snapshot count: how many of the 10 snapshots captured this same stack
- Stack trace: the deepest common frames - what the thread is doing
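The fields listed above can be pulled out of the text with a couple of regular expressions. The following is a minimal sketch based on the sample output shown in this article; the exact line format differs between Elasticsearch versions (some include an extra bracketed breakdown after the percentage, which the pattern tolerates), so treat the patterns and the `parse_hot_threads` name as assumptions to adapt.

```python
import re

THREAD_RE = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% (?:\[[^\]]*\] )?"
    r"\((?P<busy>[^)]*?) out of (?P<window>[^)]*)\) "
    r"(?P<mode>\w+) usage by thread '(?P<name>[^']+)'"
)
SNAPSHOT_RE = re.compile(r"(?P<seen>\d+)/(?P<total>\d+) snapshots sharing following")
FRAME_RE = re.compile(r"^\S+\(.*\)$")  # e.g. pkg.Class.method(File.java:123)


def parse_hot_threads(text):
    """Parse a hot threads report into one dict per reported thread."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        thread = THREAD_RE.search(line)
        if thread:
            brackets = re.findall(r"\[([^\]]+)\]", thread["name"])
            entries.append({
                "percent": float(thread["pct"]),
                "mode": thread["mode"],
                "thread": thread["name"],
                # The pool is the bracket before the T#n suffix (assumed layout).
                "pool": brackets[-2] if len(brackets) >= 2 else None,
                "seen": None,
                "total": None,
                "frames": [],
            })
            continue
        if not entries:
            continue  # still in the node header
        snap = SNAPSHOT_RE.search(line)
        if snap:
            entries[-1]["seen"] = int(snap["seen"])
            entries[-1]["total"] = int(snap["total"])
        elif FRAME_RE.match(line):
            entries[-1]["frames"].append(line)
    return entries


SAMPLE = """::: {node-1}{abc123}{10.0.0.5}{10.0.0.5:9300}
Hot threads at 2026-05-17T10:30:00Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
33.3% (166.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][search][T#1]'
5/10 snapshots sharing following 15 elements
java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)
org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:509)
...
"""

for entry in parse_hot_threads(SAMPLE):
    print(entry["pool"], entry["percent"], f'{entry["seen"]}/{entry["total"]}')
```

Anything the patterns do not recognize is skipped rather than guessed at, so a format change shows up as missing fields instead of wrong ones.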
Common Patterns and What They Mean
| Stack indicator | Likely cause | First investigation |
|---|---|---|
| org.apache.lucene.search.* in [search] threads | Query execution | Slow log, _tasks?actions=*search* |
| java.util.regex.Pattern or org.apache.lucene.util.automaton.RegExp | Expensive regex or wildcard query | Find leading-wildcard or regex queries |
| org.apache.lucene.codecs.* in [generic] or [merge] threads | Segment merging | _cat/segments, _nodes/stats/indices/merges |
| GC Thread# or no Java stack | JVM garbage collection | _nodes/stats/jvm GC stats |
| org.elasticsearch.transport.*, io.netty.* | Network or transport pressure | Bulk size, inter-node latency |
| org.elasticsearch.search.aggregations.* | Aggregation computation | High-cardinality terms agg, low-cardinality doc_values missing |
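The table translates naturally into a small rule list. This Python sketch encodes the same indicators; the rule order (more specific markers first), the substring matching, and the `classify` name are assumptions of the example, and real stacks often mix several of these packages.

```python
# (frame markers, required pools or None, likely cause), checked in order.
RULES = [
    (("java.util.regex.Pattern", "org.apache.lucene.util.automaton.RegExp"),
     None, "Expensive regex or wildcard query"),
    (("org.elasticsearch.search.aggregations.",),
     None, "Aggregation computation"),
    (("org.apache.lucene.search.",),
     {"search"}, "Query execution"),
    (("org.apache.lucene.codecs.",),
     {"generic", "merge"}, "Segment merging"),
    (("org.elasticsearch.transport.", "io.netty."),
     None, "Network or transport pressure"),
]


def classify(thread_name, pool, frames):
    """Map one hot thread to the likely cause from the table."""
    if thread_name.startswith("GC Thread#") or not frames:
        return "JVM garbage collection"
    for markers, pools, cause in RULES:
        if pools is not None and pool not in pools:
            continue
        # Substring match, so module prefixes before the package still match.
        if any(marker in frame for frame in frames for marker in markers):
            return cause
    return "Unclassified"


print(classify(
    "elasticsearch[node-1][search][T#1]",
    "search",
    ["java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)"],
))
```

A classifier like this only narrows the search; the "First investigation" column is still where the actual diagnosis happens.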
A Repeatable Workflow
Capture hot threads multiple times during an incident:
# Capture 10 samples 30 seconds apart
for i in {1..10}; do
  curl -s "localhost:9200/_nodes/hot_threads?threads=10&snapshots=20" \
    > "hot_threads_$(date +%s).txt"
  sleep 30
done
Then correlate with:
GET /_nodes/stats/thread_pool
GET /_tasks?detailed=true
GET /_cat/nodes?v&h=name,cpu,load_1m,heap.percent
Recurring stack traces across multiple samples are the signal. A single appearance is usually noise.
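One way to separate recurring stacks from noise is to count how many captures each stack signature appears in. The sketch below assumes you have already reduced each capture to a list of signature strings (here a hypothetical "pool|top frame" label); the `min_share` threshold of 0.5 is an arbitrary starting point, not a recommendation from Elasticsearch.

```python
from collections import Counter


def recurring_signatures(captures, min_share=0.5):
    """Return signatures present in at least min_share of the captures.

    captures: list of captures, each a list of signature strings.
    """
    counts = Counter()
    for capture in captures:
        counts.update(set(capture))  # count each signature once per capture
    needed = min_share * len(captures)
    hits = [sig for sig, n in counts.items() if n >= needed]
    # Most frequent first, then alphabetical for stable output.
    return sorted(hits, key=lambda sig: (-counts[sig], sig))


# Hypothetical signatures from four captures taken 30 seconds apart.
captures = [
    ["search|RegExp", "merge|codecs"],
    ["search|RegExp"],
    ["search|RegExp", "write|bulk"],
    ["search|RegExp"],
]
print(recurring_signatures(captures))
```

Here only the regex stack survives the filter, because the merge and bulk stacks each show up in a single capture.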
Common Mistakes
- Looking at a single sample. Hot threads is a sampling tool - one snapshot can show transient noise.
- Ignoring the snapshot count (5/10). A thread at 100% CPU that only appears in 1/10 snapshots is doing short bursts; one at 60% across 10/10 is a sustained issue.
- Tuning thread pools before checking hot threads. The pool size rarely matters if the threads are stuck on a single slow query.
- Disregarding the GC pattern. Threads named GC Thread# with no Elasticsearch stack mean the JVM is the bottleneck - heap sizing, not query tuning.
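The snapshot-count point can be turned into a quick label when you review many threads at once. This is a rough sketch: the 0.8 and 0.2 cut-offs and the `persistence` name are assumptions chosen for illustration, not values defined by Elasticsearch.

```python
def persistence(seen, total, sustained_at=0.8, bursty_at=0.2):
    """Label a thread by the share of snapshots that captured the same stack."""
    if total <= 0:
        raise ValueError("total must be positive")
    share = seen / total
    if share >= sustained_at:
        return "sustained"
    if share <= bursty_at:
        return "bursty"
    return "intermittent"


# The two cases from the list above, plus the 5/10 sample.
print(persistence(1, 10))   # 100% CPU but 1/10 snapshots
print(persistence(10, 10))  # 60% CPU across 10/10 snapshots
print(persistence(5, 10))
```

The CPU percentage is deliberately left out of the label: it says how hot the thread was, while the snapshot share says how consistently it was doing the same thing.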
Hot Threads in Production: What to Watch
Capture hot threads during normal operation to establish a baseline. The same stack at 5% during normal hours and at 70% during an incident is a meaningful change.
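A baseline comparison can be as simple as a dictionary diff. The sketch below assumes each capture has been summarized as a mapping from a stack signature (a hypothetical "pool|top frame" label) to its CPU percentage; the 20-point `min_increase` threshold and the `cpu_regressions` name are illustrative assumptions.

```python
def cpu_regressions(baseline, incident, min_increase=20.0):
    """List signatures whose CPU share rose by at least min_increase points.

    Returns (signature, baseline_percent, incident_percent) tuples,
    largest increase first. Signatures absent from the baseline count as 0.
    """
    changes = []
    for signature, now in incident.items():
        before = baseline.get(signature, 0.0)
        if now - before >= min_increase:
            changes.append((signature, before, now))
    return sorted(changes, key=lambda change: change[1] - change[2])


# Hypothetical numbers mirroring the 5% versus 70% example above.
baseline = {"search|RegExp": 5.0, "merge|codecs": 10.0}
incident = {"search|RegExp": 70.0, "merge|codecs": 12.0, "write|bulk": 30.0}
for signature, before, now in cpu_regressions(baseline, incident):
    print(f"{signature}: {before}% -> {now}%")
```

Stacks that were already busy at baseline, like the merge thread here, drop out of the report, which keeps attention on what actually changed.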
Skip the Manual Hot-Threads Loop with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch that runs GET /_nodes/hot_threads analysis continuously across your fleet. When CPU spikes, search latency rises, or EsRejectedExecutionException starts climbing, Pulse:
- Polls /_nodes/hot_threads on a schedule and on alert, with configurable interval, snapshots, and type=cpu|wait|block
- Interprets the plain-text output - naming the offending thread, thread pool (search, write, merge, generic), and the dominant Lucene or Painless stack
- Correlates with GC pause durations, segment counts, write rate, and slow-log entries from the same window
- Recommends the precise corrective action: kill the runaway task, rewrite the regex query, raise heap, or back off bulk concurrency
This turns the manual capture-and-grep procedure described above into a continuous diagnostic loop that already has historical baselines when an incident starts.
Frequently Asked Questions
Q: What does the Elasticsearch hot threads API do?
A: The hot threads API samples the busiest threads on each node over a short interval and returns their stack traces. It's the standard way to find which operations - queries, merges, GC, or transport work - are consuming CPU.
Q: How do I run the Elasticsearch hot threads API?
A: Use GET /_nodes/hot_threads with optional parameters like threads=10, interval=500ms, and type=cpu. Output is plain text designed for an operator to read.
Q: What's the difference between cpu, wait, and block types in hot threads?
A: cpu shows threads consuming CPU (default and most common). wait shows threads waiting for events (I/O, futures). block shows threads blocked on locks - useful for finding contention.
Q: Why do hot threads show GC Thread without an Elasticsearch stack?
A: GC threads run in native JVM code, so they have no Java stack to display. Seeing them dominate CPU indicates garbage collection pressure - investigate heap usage and old-gen GC frequency.
Q: How often should I run hot threads in production?
A: On demand during incidents, and periodically (every few minutes) as a baseline. Continuous sampling with stored history makes patterns visible that a single capture misses.
Q: Can hot threads identify slow queries?
A: Yes - threads in the search pool with Lucene query stack frames point to active slow queries. Cross-reference with the slow log and GET /_tasks?actions=*search* to identify the specific query.
Q: What's the best tool to diagnose Elasticsearch hot threads and CPU spikes automatically?
A: Pulse is built for this. It is an AI DBA for Elasticsearch and OpenSearch that polls _nodes/hot_threads continuously, classifies the dominant thread pool and stack, correlates with GC, merges, and slow queries, and recommends the targeted fix - replacing the manual capture-every-30-seconds loop with a stored, queryable history.
Related Reading
- Elasticsearch Crashing High CPU: Crash investigation
- Elasticsearch Node CPU Spikes Investigation: Spike triage
- Elasticsearch Slow Queries Diagnose: Identify slow queries
- Elasticsearch Performance Issues Troubleshooting: Performance overview
- Elasticsearch JVM GC Freeze Analysis: GC freeze triage
- Elasticsearch Allocation Explain API: Cluster allocation troubleshooting