Elasticsearch Hot Threads API: CPU Analysis Guide

The Elasticsearch hot threads API (GET /_nodes/hot_threads) samples each node's busiest threads over a short interval and returns their stack traces. It's the fastest way to find which operations are burning CPU - typically expensive queries, segment merges, garbage collection, regex scans, or transport-layer work. Hot threads is the standard first-look tool when CPU spikes, latency rises, or rejected operations climb.

How the Hot Threads API Works

The API measures how much CPU time each thread uses over a sampling window (default 500 ms), ranks the threads by that usage, and then takes several stack snapshots of the busiest ones (default 10). Each result shows the percentage of the sampling window the thread spent on CPU, followed by the stack frames that the snapshots have in common.

Three modes select what to measure:

| type | What it reports |
| --- | --- |
| cpu (default) | Threads consuming the most CPU |
| wait | Threads spending the most time in WAITING state |
| block | Threads spending the most time in BLOCKED state (lock contention) |

Making Hot Threads Requests

Basic usage:

GET /_nodes/hot_threads

With parameters:

GET /_nodes/hot_threads?threads=10&interval=500ms&type=cpu&snapshots=10&ignore_idle_threads=true

| Parameter | Default | Description |
| --- | --- | --- |
| threads | 3 | Number of hot threads to return per node |
| interval | 500ms | Sampling window |
| type | cpu | cpu, wait, or block |
| snapshots | 10 | Number of stack snapshots taken of each hot thread |
| ignore_idle_threads | true | Skip threads in idle state (waiting for work) |

Target a single node:

GET /_nodes/node-1/hot_threads?threads=10

The response is plain text, not JSON - it's designed to be readable by an operator.
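As a minimal sketch of the requests above, the following Python builds the URL and fetches the report with the standard library only. The helper names `hot_threads_url` and `fetch_hot_threads`, the `http://localhost:9200` base address, and the 30-second timeout are assumptions for illustration; authentication and TLS are left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base="http://localhost:9200", node=None, **params):
    """Build a hot threads URL; keyword arguments become query parameters."""
    path = f"/_nodes/{node}/hot_threads" if node else "/_nodes/hot_threads"
    # Booleans are sent as lowercase "true"/"false" in the query string.
    query = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    return f"{base}{path}?{query}" if query else f"{base}{path}"


def fetch_hot_threads(url, timeout=30):
    """Return the report as text; the body is plain text, so it is not JSON-decoded."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `hot_threads_url(node="node-1", threads=10)` yields the single-node request shown above, and passing that URL to `fetch_hot_threads` returns the text report.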

Reading the Output

A representative sample:

::: {node-1}{abc123}{10.0.0.5}{10.0.0.5:9300}
   Hot threads at 2026-05-17T10:30:00Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   33.3% (166.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][search][T#1]'
     5/10 snapshots sharing following 15 elements
       java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3963)
       org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:509)
       ...

Read it top to bottom:

  1. Header: node name, sampling parameters
  2. Percent CPU: portion of the sampling window the thread held the CPU
  3. Thread name: includes the pool (search, write, generic, management, transport_worker, refresh, flush, merge)
  4. Snapshot count: how many of the 10 snapshots captured this same stack
  5. Stack trace: the deepest common frames - what the thread is doing
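The same reading order can be sketched as a small parser. This is an illustration based only on the sample output above, not a complete grammar for the report: the `HotThread` class and `parse_hot_threads` function are hypothetical names, the optional bracketed breakdown after the percentage is an assumption about other versions' output, and only the first snapshot group of each thread is kept.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# "33.3% (166.4ms out of 500ms) cpu usage by thread '...'"
# The optional "[...]" group after the percentage is an assumption.
THREAD_RE = re.compile(
    r"^\s*(?P<pct>[\d.]+)% (?:\[[^\]]*\] )?\(\S+ out of \S+\) "
    r"\w+ usage by thread '(?P<name>[^']+)'"
)
# "5/10 snapshots sharing following 15 elements"
SNAPSHOT_RE = re.compile(r"^\s*(?P<shared>\d+)/(?P<total>\d+) snapshots sharing following")
# Pool name is the bracketed part just before "[T#n]" in the thread name.
POOL_RE = re.compile(r"\[([^\[\]]+)\]\[T#\d+\]$")


@dataclass
class HotThread:
    percent: float
    name: str
    pool: Optional[str]
    shared: int = 0
    total: int = 0
    frames: List[str] = field(default_factory=list)


def parse_hot_threads(text):
    """Parse a hot threads report into HotThread records (first stack group only)."""
    threads, current, collecting = [], None, False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(":::"):  # node header starts a new section
            current, collecting = None, False
            continue
        m = THREAD_RE.match(line)
        if m:
            pool = POOL_RE.search(m["name"])
            current = HotThread(float(m["pct"]), m["name"], pool.group(1) if pool else None)
            threads.append(current)
            collecting = False
            continue
        if current is None:
            continue
        s = SNAPSHOT_RE.match(line)
        if s:
            collecting = current.total == 0  # ignore later groups for this thread
            if collecting:
                current.shared, current.total = int(s["shared"]), int(s["total"])
            continue
        if collecting and stripped.endswith(")"):  # looks like a Java stack frame
            current.frames.append(stripped)
    return threads
```

Run against the sample above, this would produce one record for the `search` pool thread with its percentage, snapshot count, and the two visible frames.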

Common Patterns and What They Mean

| Stack indicator | Likely cause | First investigation |
| --- | --- | --- |
| org.apache.lucene.search.* in [search] threads | Query execution | Slow log, _tasks?actions=*search* |
| java.util.regex.Pattern or org.apache.lucene.util.automaton.RegExp | Expensive regex or wildcard query | Find leading-wildcard or regex queries |
| org.apache.lucene.codecs.* in [generic] or [merge] threads | Segment merging | _cat/segments, _nodes/stats/indices/merge |
| GC Thread# or no Java stack | JVM garbage collection | _nodes/stats/jvm GC stats |
| org.elasticsearch.transport.*, io.netty.* | Network or transport pressure | Bulk size, inter-node latency |
| org.elasticsearch.search.aggregations.* | Aggregation computation | High-cardinality terms agg, low-cardinality doc_values missing |
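The table above can be encoded as a first-pass triage helper. This is a sketch under simplifying assumptions: `RULES` and `classify` are hypothetical names, matching is by plain substring, the first matching rule wins, and the thread pool qualifiers from the table (such as [search] or [merge]) are ignored.

```python
# Each rule: (substrings to look for in stack frames, likely cause).
# More specific patterns come first because the first match wins.
RULES = [
    (("java.util.regex.Pattern", "org.apache.lucene.util.automaton.RegExp"),
     "expensive regex or wildcard query"),
    (("org.elasticsearch.search.aggregations.",), "aggregation computation"),
    (("org.apache.lucene.search.",), "query execution"),
    (("org.apache.lucene.codecs.",), "segment merging"),
    (("org.elasticsearch.transport.", "io.netty."), "network or transport pressure"),
]


def classify(frames, thread_name=""):
    """Map a thread's stack frames to a likely cause from the table."""
    # Per the table, a GC thread name or an empty Java stack points at the JVM.
    if thread_name.startswith("GC Thread") or not frames:
        return "JVM garbage collection"
    for needles, cause in RULES:
        if any(needle in frame for frame in frames for needle in needles):
            return cause
    return "unclassified"
```

An "unclassified" result is a prompt to read the stack by hand, not evidence that the thread is harmless.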

A Repeatable Workflow

Capture hot threads multiple times during an incident:

# Capture 10 samples 30 seconds apart
for i in {1..10}; do
  curl -s "localhost:9200/_nodes/hot_threads?threads=10&snapshots=20" \
    > "hot_threads_$(date +%s).txt"
  sleep 30
done

Then correlate with:

GET /_nodes/stats/thread_pool
GET /_tasks?detailed=true
GET /_cat/nodes?v&h=name,cpu,load_1m,heap.percent

Recurring stack traces across multiple samples are the signal. A single appearance is usually noise.
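One way to find recurring stacks across the captured files is to count, per capture, the first frame printed under each "snapshots sharing" line. The sketch below assumes the output layout shown in the sample earlier; `top_frames`, `recurring_frames`, and the `min_captures` threshold of 3 are illustrative choices, not part of Elasticsearch.

```python
import re
from collections import Counter

SNAPSHOT_LINE = re.compile(r"^\s*\d+/\d+ snapshots sharing following")


def top_frames(capture_text):
    """Return the set of frames that directly follow a 'snapshots sharing' line."""
    frames = set()
    lines = capture_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if SNAPSHOT_LINE.match(line):
            following = lines[i + 1].strip()
            if following:
                frames.add(following)
    return frames


def recurring_frames(captures, min_captures=3):
    """List (frame, capture_count) pairs seen in at least min_captures captures."""
    counts = Counter()
    for text in captures:
        # A set per capture, so each frame counts at most once per file.
        counts.update(top_frames(text))
    return [(frame, n) for frame, n in counts.most_common() if n >= min_captures]
```

With the loop above, the input could be built as `[p.read_text() for p in sorted(pathlib.Path(".").glob("hot_threads_*.txt"))]`. Frames that survive the threshold are the recurring signal; the rest is likely noise.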

Common Mistakes

  1. Looking at a single sample. Hot threads is a sampling tool - one snapshot can show transient noise.
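  2. Ignoring the snapshot count (5/10). It shows how many snapshots share the printed stack. A thread at 100% CPU whose stack appears in only 1/10 snapshots is moving through many code paths; one at 60% with the same stack in 10/10 is stuck in one place and is the clearer lead.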
  3. Tuning thread pools before checking hot threads. The pool size rarely matters if the threads are stuck on a single slow query.
  4. Disregarding the GC pattern. Threads named GC Thread# with no Elasticsearch stack mean the JVM is the bottleneck - heap sizing, not query tuning.

Hot Threads in Production: What to Watch

Capture hot threads during normal operation to establish a baseline. The same stack at 5% during normal hours and at 70% during an incident is a meaningful change.
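A baseline comparison along these lines could be sketched as follows. It assumes you have already reduced each capture to a mapping from a stack identifier (for example, its top frame) to CPU percent; the `regressions` function and both thresholds are hypothetical and should be tuned to your cluster.

```python
def regressions(baseline, incident, min_ratio=3.0, min_percent=20.0):
    """Return (stack, baseline_pct, incident_pct) for stacks that grew markedly.

    baseline and incident map a stack identifier to CPU percent.
    A stack is flagged when it is at least min_percent now and either
    was absent from the baseline or grew by a factor of min_ratio.
    """
    flagged = []
    for stack, now in incident.items():
        before = baseline.get(stack, 0.0)
        grew = before == 0.0 or now / before >= min_ratio
        if now >= min_percent and grew:
            flagged.append((stack, before, now))
    # Largest current CPU share first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

With the figures from the paragraph above, a stack at 5% in the baseline and 70% during the incident is flagged, while one that moves from 30% to 35% is not.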

Skip the Manual Hot-Threads Loop with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch that runs GET /_nodes/hot_threads analysis continuously across your fleet. When CPU spikes, search latency rises, or EsRejectedExecutionException starts climbing, Pulse:

  • Polls /_nodes/hot_threads on a schedule and on alert, with configurable interval, snapshots, and type=cpu|wait|block
  • Interprets the plain-text output - naming the offending thread, thread pool (search, write, merge, generic), and the dominant Lucene or Painless stack
  • Correlates with GC pause durations, segment counts, write rate, and slow-log entries from the same window
  • Recommends the precise corrective action: kill the runaway task, rewrite the regex query, raise heap, or back off bulk concurrency

This turns the manual capture-and-grep procedure described above into a continuous diagnostic loop that already has historical baselines when an incident starts.

Frequently Asked Questions

Q: What does the Elasticsearch hot threads API do?
A: The hot threads API samples the busiest threads on each node over a short interval and returns their stack traces. It's the standard way to find which operations - queries, merges, GC, or transport work - are consuming CPU.

Q: How do I run the Elasticsearch hot threads API?
A: Use GET /_nodes/hot_threads with optional parameters like threads=10, interval=500ms, and type=cpu. Output is plain text designed for an operator to read.

Q: What's the difference between cpu, wait, and block types in hot threads?
A: cpu shows threads consuming CPU (default and most common). wait shows threads waiting for events (I/O, futures). block shows threads blocked on locks - useful for finding contention.

Q: Why do hot threads show GC Thread without an Elasticsearch stack?
A: GC threads run in native JVM code, so they have no Java stack to display. Seeing them dominate CPU indicates garbage collection pressure - investigate heap usage and old-gen GC frequency.

Q: How often should I run hot threads in production?
A: On demand during incidents, and periodically (every few minutes) as a baseline. Continuous sampling with stored history makes patterns visible that a single capture misses.

Q: Can hot threads identify slow queries?
A: Yes - threads in the search pool with Lucene query stack frames point to active slow queries. Cross-reference with the slow log and GET /_tasks?actions=*search* to identify the specific query.

Q: What's the best tool to diagnose Elasticsearch hot threads and CPU spikes automatically?
A: Pulse is built for this. It is an AI DBA for Elasticsearch and OpenSearch that polls _nodes/hot_threads continuously, classifies the dominant thread pool and stack, correlates with GC, merges, and slow queries, and recommends the targeted fix - replacing the manual capture-every-30-seconds loop with a stored, queryable history.
