Elasticsearch Slow Log Configuration

Elasticsearch's slow log captures search and indexing operations that exceed time thresholds you define. Unlike cluster-level monitoring metrics that show averages and percentiles, slow logs give you the actual query body and indexing source for individual slow operations. They are the primary tool for identifying specific problematic queries and bulk requests. Slow logs are disabled by default - all thresholds are set to -1.

Search Slow Log Thresholds

Search slow logs operate at the shard level and split into two phases: query and fetch. The query phase scores and collects matching document IDs across shards. The fetch phase retrieves the actual document content for the top hits. You set thresholds independently for each phase and for each severity level:

PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.query.debug": "2s",
  "index.search.slowlog.threshold.query.trace": "500ms",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.threshold.fetch.info": "800ms",
  "index.search.slowlog.threshold.fetch.debug": "500ms",
  "index.search.slowlog.threshold.fetch.trace": "200ms"
}

These thresholds measure wall-clock time per shard, not the total request time the client sees. A query that takes 12 seconds end-to-end might show up as a 6-second entry in the slow log if the work was split across two shards. Keep that in mind when setting thresholds - a 5s per-shard threshold captures queries that may take 10s or more from the client perspective on a multi-shard index.

The index.search.slowlog.level setting controls the minimum log level for slow log events. It defaults to TRACE, meaning all events that exceed any configured threshold get logged. Set it to INFO or WARN if you only want the most severe entries.
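For example, to suppress everything below INFO on a single index (a sketch, on versions that support the level setting):

```
PUT /my-index/_settings
{
  "index.search.slowlog.level": "info"
}
```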

Indexing Slow Log Thresholds

Indexing slow logs track individual document index operations. They fire per document, not per bulk request - a bulk request indexing 1000 documents can produce 1000 slow log entries if each document exceeds the threshold. The threshold structure mirrors search:

PUT /my-index/_settings
{
  "index.indexing.slowlog.threshold.index.warn": "10s",
  "index.indexing.slowlog.threshold.index.info": "5s",
  "index.indexing.slowlog.threshold.index.debug": "2s",
  "index.indexing.slowlog.threshold.index.trace": "500ms"
}

In practice, individual document indexing rarely takes more than a few hundred milliseconds unless you have complex ingest pipelines, expensive scripted field computations, or the indexing thread pool is saturated and requests are queuing. If you see indexing slow log entries at the warn level, the root cause is often not the document itself but resource contention - merges consuming I/O, heavy search load competing for CPU, or GC pressure from oversized heaps.

You can also control how much of the document source gets logged. By default, Elasticsearch logs the first 1000 characters. Adjust it with index.indexing.slowlog.source - set it to false or 0 to disable source logging entirely, or to a higher value if you need full document visibility. The index.indexing.slowlog.reformat setting (default true) collapses the source to a single line; set it to false if you need the original JSON structure preserved.
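Both settings are dynamic index settings and can be applied together. A sketch that raises the source limit and preserves the original JSON layout:

```
PUT /my-index/_settings
{
  "index.indexing.slowlog.source": "2000",
  "index.indexing.slowlog.reformat": false
}
```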

Output Format and File Location

Slow logs are written to dedicated files, separate from the main Elasticsearch log. Search slow logs go to <cluster-name>_index_search_slowlog.json and indexing slow logs go to <cluster-name>_index_indexing_slowlog.json, both under the configured logs directory (typically /var/log/elasticsearch/ or the path.logs setting).

Slow logs have defaulted to JSON output since Elasticsearch 7.x; in 8.x they use the ECS-compliant ECSJsonLayout, which produces the dotted field names shown below. A typical search slow log entry looks like:

{
  "@timestamp": "2025-06-15T14:23:01.123Z",
  "log.level": "WARN",
  "log.logger": "index.search.slowlog.query",
  "elasticsearch.slowlog.took": "8.2s",
  "elasticsearch.slowlog.took_millis": 8200,
  "elasticsearch.slowlog.total_shards": 5,
  "elasticsearch.slowlog.source": "{\"query\":{\"match_all\":{}},\"size\":10000}",
  "elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
  "elasticsearch.slowlog.total_hits": "1520432 hits"
}

The source field contains the actual query body, which is what you need for troubleshooting. If your slow log files are growing large, ship them to a separate monitoring cluster or log aggregation system rather than disabling them.
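Because each entry is a single line of JSON, post-processing is straightforward. A minimal sketch that surfaces the slowest query bodies from a batch of log lines (field names follow the ECS-style entry above; the sample entries are synthetic):

```python
import json

def worst_queries(lines, top_n=3):
    """Parse slow log JSON lines and return (took_millis, source) pairs, slowest first."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        entries.append((entry["elasticsearch.slowlog.took_millis"],
                        entry["elasticsearch.slowlog.source"]))
    # Sort by per-shard time, slowest first
    entries.sort(key=lambda pair: pair[0], reverse=True)
    return entries[:top_n]

# Two synthetic slow log lines for illustration:
sample = [
    '{"elasticsearch.slowlog.took_millis": 8200, "elasticsearch.slowlog.source": "{\\"query\\":{\\"match_all\\":{}}}"}',
    '{"elasticsearch.slowlog.took_millis": 1200, "elasticsearch.slowlog.source": "{\\"query\\":{\\"term\\":{\\"id\\":1}}}"}',
]
print(worst_queries(sample, top_n=1))
```

In practice you would feed this the lines of the slow log file itself, or run the same aggregation in your log platform's query language.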

Setting Thresholds Dynamically

All slow log settings are dynamic index settings - you can change them on a running cluster without restarts. This makes them practical for targeted debugging sessions. The typical workflow is:

# Enable aggressive thresholds on a specific index
PUT /problematic-index/_settings
{
  "index.search.slowlog.threshold.query.debug": "200ms",
  "index.search.slowlog.threshold.fetch.debug": "100ms"
}

# Investigate, then disable when done
PUT /problematic-index/_settings
{
  "index.search.slowlog.threshold.query.debug": "-1",
  "index.search.slowlog.threshold.fetch.debug": "-1"
}

You can also set thresholds on index templates so all new indices inherit them. For production clusters, a reasonable starting point is warn at 5-10s and info at 2-5s for the query phase, with fetch thresholds roughly 5x lower since fetch should be fast. Avoid setting trace-level thresholds to low values like 0ms on high-throughput indices - you will generate enormous log volumes and potentially impact cluster performance through I/O pressure on the log filesystem.
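A composable index template carrying those starting-point thresholds might look like this (the template name and index pattern are placeholders):

```
PUT /_index_template/slowlog-defaults
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.search.slowlog.threshold.query.warn": "10s",
      "index.search.slowlog.threshold.query.info": "5s",
      "index.search.slowlog.threshold.fetch.warn": "2s",
      "index.search.slowlog.threshold.fetch.info": "1s"
    }
  }
}
```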

To enable slow logs across all existing indices at once, use a wildcard:

PUT /*/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s"
}

Correlating Slow Logs with Performance Issues

Slow logs are most useful when combined with other signals. A spike in slow log entries at the warn level that correlates with increased search latency in your monitoring dashboards points to a systemic issue - possibly GC pauses, disk I/O saturation, or hot nodes with uneven shard distribution. Isolated slow log entries for specific query patterns point to query-level problems - unbounded aggregations, deeply nested queries, or large size parameters pulling excessive data.

Cross-reference the source field with your application code to find which queries are responsible. The took_millis value tells you per-shard time; multiply by shard count for a rough upper bound on total query cost. If the same query shape appears repeatedly, the fix is usually at the application or mapping level - adding filters, switching from match_all to more selective queries, or restructuring mappings to avoid wildcard queries on high-cardinality text fields.
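One way to find the query shapes worth fixing first is to group entries by their source and sum the estimated cost, using the rough upper bound above (per-shard took_millis times shard count). A sketch over hypothetical, pre-parsed entries:

```python
from collections import defaultdict

def cost_by_query(entries):
    """Aggregate estimated cost (per-shard ms x shard count) per query body."""
    totals = defaultdict(int)
    for e in entries:
        totals[e["source"]] += e["took_millis"] * e["total_shards"]
    # Most expensive query shapes first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic entries: the repeated wildcard query dominates total cost
entries = [
    {"took_millis": 6000, "total_shards": 5, "source": '{"query":{"wildcard":{"name":"*foo*"}}}'},
    {"took_millis": 5500, "total_shards": 5, "source": '{"query":{"wildcard":{"name":"*foo*"}}}'},
    {"took_millis": 2000, "total_shards": 1, "source": '{"query":{"term":{"id":"42"}}}'},
]
print(cost_by_query(entries))
```

Exact-string grouping is crude; a refinement would normalize parameter values out of the source so that the same query template with different terms groups together.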

For indexing slow logs, correlate with merge activity and refresh metrics. A sudden increase in indexing latency alongside high merge rates suggests I/O saturation from background merges. The slow log confirms the symptom but the fix is to reduce merge pressure through larger refresh intervals, fewer shards, or faster storage.
