Elasticsearch slow query logs help identify queries that take longer than expected. Understanding how to configure and interpret these logs is essential for query optimization and performance troubleshooting.
Configuring Slow Query Logs
Enable Slow Logging
Set thresholds per index:
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s",
"index.search.slowlog.threshold.query.debug": "2s",
"index.search.slowlog.threshold.query.trace": "500ms",
"index.search.slowlog.threshold.fetch.warn": "1s",
"index.search.slowlog.threshold.fetch.info": "800ms",
"index.search.slowlog.threshold.fetch.debug": "500ms",
"index.search.slowlog.threshold.fetch.trace": "200ms"
}
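The thresholds are dynamic index settings, so one policy is usually pushed to several indices in a loop. A minimal sketch; the index names and the localhost:9200 endpoint are placeholders, and the curl call is left commented out so nothing is changed by accident:

```shell
# One payload applied to every index that needs it; setting a threshold
# to "-1" disables it. Index names and endpoint are placeholders.
PAYLOAD='{"index.search.slowlog.threshold.query.warn": "10s",
          "index.search.slowlog.threshold.query.info": "5s"}'
for index in my-index logs-2024; do
  echo "would update $index"
  # curl -s -X PUT "localhost:9200/$index/_settings" \
  #   -H 'Content-Type: application/json' -d "$PAYLOAD"
done
```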
Configure Logging Level
In log4j2.properties:
logger.index_search_slowlog.name = index.search.slowlog
logger.index_search_slowlog.level = trace
logger.index_search_slowlog.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog.additivity = false
Log File Location
Default locations:
- Linux: /var/log/elasticsearch/{cluster}_index_search_slowlog.log
- Docker: /usr/share/elasticsearch/logs/
Understanding Slow Log Format
Sample Log Entry
[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] [node-1]
[my-index][0] took[15.2s], took_millis[15234], total_hits[1250],
types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5],
source[{"query":{"bool":{"must":[{"match":{"title":"elasticsearch"}}],
"filter":[{"range":{"date":{"gte":"2024-01-01"}}}]}},"size":10}],
id[]
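Individual fields can be pulled out of an entry with grep -oP, which the analysis recipes below rely on. A quick sketch against the sample entry, condensed to one line with a shortened source[] for readability:

```shell
# Field extraction with grep -oP: \K discards the matched prefix so only
# the value itself is printed. ENTRY is the sample entry above, condensed.
ENTRY='[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] [node-1] [my-index][0] took[15.2s], took_millis[15234], total_hits[1250], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"match_all":{}}}], id[]'
echo "$ENTRY" | grep -oP 'took_millis\[\K[0-9]+'   # 15234
echo "$ENTRY" | grep -oP '^\[\K[^\]]+'             # 2024-01-15T10:30:45,123
```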
Log Entry Components
| Component | Description |
|---|---|
| [2024-01-15T10:30:45,123] | Timestamp |
| [WARN] | Log level (matches the threshold crossed) |
| [index.search.slowlog.query] | Log type (query or fetch) |
| [node-1] | Node that executed the search |
| [my-index][0] | Index name and shard number |
| took[15.2s] | Human-readable duration |
| took_millis[15234] | Duration in milliseconds |
| total_hits[1250] | Number of matching documents |
| search_type[...] | Search type used (e.g. QUERY_THEN_FETCH) |
| total_shards[5] | Number of shards queried |
| source[...] | The actual query JSON |
Analyzing Slow Logs
Extract Common Patterns
# Find most common slow queries by pattern
# (source[...] can itself contain "]", so anchor on the ", id[" field that follows)
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP 'source\[\K.*(?=\], id\[)' | \
sort | uniq -c | sort -rn | head -20
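Grouping on the raw source treats the same query shape with different literal values as distinct entries. A normalization pass groups by shape instead; the sed masks below are a rough sketch (tune them for your queries), and the sample sources stand in for real extracted queries:

```shell
# Mask string and numeric literal values so that queries differing only in
# their values collapse into one pattern before counting.
printf '%s\n' \
  '{"query":{"match":{"title":"foo"}}}' \
  '{"query":{"match":{"title":"bar"}}}' \
  '{"query":{"term":{"user":"alice"}}}' |
  sed -E 's/:"[^"]*"/:"?"/g; s/:[0-9]+/:N/g' |
  sort | uniq -c | sort -rn
```

The two match queries collapse into a single pattern with count 2; the term query stays separate with count 1.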
Find Worst Offenders
# Find queries taking longest
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP 'took_millis\[\K[0-9]+' | \
sort -rn | head -10
# Get full log entries for queries of 10 seconds or more (5+ digit took_millis)
grep "took_millis\[[0-9]\{5,\}\]" elasticsearch_index_search_slowlog.log
Analyze by Index
# Count slow queries per index
# (the index appears as "] [index-name][shard]" after the node name)
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP '\] \[\K[^\]]+(?=\]\[[0-9]+\])' | \
sort | uniq -c | sort -rn
Time Distribution
# Slow queries by hour
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP '^\[\K[^,]+' | \
cut -d'T' -f2 | cut -d':' -f1 | \
sort | uniq -c
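The pipeline above can be checked end to end on a few fabricated entries before pointing it at a real log (timestamps here are made up):

```shell
# Three fabricated entries: two in the 10:00 hour, one in the 14:00 hour.
# Expect a count of 2 for hour 10 and 1 for hour 14.
printf '%s\n' \
  '[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] sample' \
  '[2024-01-15T10:55:02,001][WARN ][index.search.slowlog.query] sample' \
  '[2024-01-15T14:01:10,500][WARN ][index.search.slowlog.query] sample' |
  grep -oP '^\[\K[^,]+' | cut -d'T' -f2 | cut -d':' -f1 |
  sort | uniq -c
```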
Query vs Fetch Phase
Query Phase Slow Logs
index.search.slowlog.query
- Time spent finding matching documents
- Produces the scored and sorted document IDs
Common issues:
- Complex queries (regex, wildcards)
- Large number of matching documents
- Expensive scoring
Fetch Phase Slow Logs
index.search.slowlog.fetch
- Time spent retrieving document content
- Loads _source and stored fields
Common issues:
- Large documents
- Many fields requested
- Highlighting on large text fields
Identifying Problematic Queries
Signs of Bad Queries
- Consistent slow timing: Same query pattern always slow
- Wildcard/Regex patterns: Leading wildcards are expensive
- Large aggregations: High cardinality, many buckets
- Deep pagination: High from values
- Script queries: Per-document execution
Query Patterns to Look For
# Find wildcard queries
grep "slowlog" *.log | grep '"wildcard"'
# Find regex queries
grep "slowlog" *.log | grep '"regexp"'
# Find deep pagination
grep "slowlog" *.log | grep '"from":[0-9]\{4,\}'
# Find script queries
grep "slowlog" *.log | grep '"script"'
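The four greps can be rolled into a single tally. A sketch using a throwaway sample file; swap $LOG for your real slow log path:

```shell
# Build a throwaway sample log standing in for real slow log entries.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
source[{"query":{"wildcard":{"name":"*smith"}}}]
source[{"query":{"regexp":{"name":".*ith"}}}]
source[{"query":{"match":{"name":"x"}},"from":10000}]
EOF
# Count each anti-pattern in one pass over the file.
report=$(for pat in '"wildcard"' '"regexp"' '"script"' '"from":[0-9]\{4,\}'; do
  printf '%-22s %s\n' "$pat" "$(grep -c "$pat" "$LOG")"
done)
printf '%s\n' "$report"
rm -f "$LOG"
```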
Creating Actionable Reports
Daily Slow Query Summary
#!/bin/bash
LOG_FILE="/var/log/elasticsearch/*_index_search_slowlog.log"
DATE=$(date +%Y-%m-%d)
echo "Slow Query Report for $DATE"
echo "=============================="
echo -e "\nTotal slow queries:"
grep "$DATE" $LOG_FILE | wc -l
echo -e "\nBy severity:"
grep "$DATE" $LOG_FILE | grep -oP '\]\[\K(WARN|INFO|DEBUG|TRACE)' | sort | uniq -c
echo -e "\nTop 10 slowest (milliseconds):"
grep "$DATE" $LOG_FILE | grep -oP 'took_millis\[\K[0-9]+' | sort -rn | head -10
echo -e "\nAffected indices:"
grep "$DATE" $LOG_FILE | grep -oP '\] \[\K[^\]]+(?=\]\[[0-9]+\])' | sort | uniq -c | sort -rn | head -10
Integration with Monitoring
Ship slow logs to your monitoring stack:
# filebeat.yml
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/elasticsearch/*_index_search_slowlog.log
fields:
log_type: elasticsearch_slowlog
Threshold Recommendations
Development/Testing
{
"index.search.slowlog.threshold.query.warn": "5s",
"index.search.slowlog.threshold.query.info": "2s",
"index.search.slowlog.threshold.query.debug": "1s",
"index.search.slowlog.threshold.query.trace": "500ms"
}
Production
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s",
"index.search.slowlog.threshold.query.debug": "2s"
}
High-Performance Requirements
{
"index.search.slowlog.threshold.query.warn": "1s",
"index.search.slowlog.threshold.query.info": "500ms",
"index.search.slowlog.threshold.query.debug": "200ms"
}
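To keep environments from drifting, the payloads above can be generated from one place. A sketch; the environment names and the value subset are illustrative, as is the commented-out endpoint:

```shell
# Emit the slow log threshold payload for a given environment, so dev and
# prod settings come from a single definition.
slowlog_settings() {
  case "$1" in
    dev)  printf '{"index.search.slowlog.threshold.query.warn":"5s","index.search.slowlog.threshold.query.info":"2s"}' ;;
    prod) printf '{"index.search.slowlog.threshold.query.warn":"10s","index.search.slowlog.threshold.query.info":"5s"}' ;;
  esac
}
slowlog_settings prod
# Apply it (endpoint assumed):
# curl -X PUT "localhost:9200/my-index/_settings" \
#   -H 'Content-Type: application/json' -d "$(slowlog_settings prod)"
```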
Troubleshooting from Slow Logs
Workflow
1. Identify: Find the slow query in the logs
2. Extract: Get the query source
3. Profile: Run it with the profile API
4. Analyze: Find the expensive operations
5. Optimize: Apply fixes
6. Verify: Monitor the slow logs for improvement
Example: Extracting and Profiling
# Extract the query body from the log entry
# (source[...] can itself contain "]", so anchor on the ", id[" field that follows)
QUERY=$(grep "took_millis\[15234\]" slowlog.log | grep -oP 'source\[\K.*(?=\], id\[)')
# Profile it: $QUERY is already a complete request body, so strip its
# opening brace and splice "profile": true in front
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' \
  -d "{\"profile\": true, ${QUERY#\{}"
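Splicing "profile": true into an extracted body is easy to get wrong, because the body is already a complete JSON object. One jq-free approach is brace stripping with shell parameter expansion, shown here on a standalone body so it can be verified locally:

```shell
# QUERY mimics a body extracted from source[...]; ${QUERY#\{} drops its
# opening brace, and the literal prefix re-opens the object with profile set.
QUERY='{"query":{"match":{"title":"elasticsearch"}},"size":10}'
BODY="{\"profile\": true, ${QUERY#\{}"
echo "$BODY"   # {"profile": true, "query":{"match":{"title":"elasticsearch"}},"size":10}
```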