Elasticsearch slow query logs help identify queries that take longer than expected. Understanding how to configure and interpret these logs is essential for query optimization and performance troubleshooting.
Configuring Slow Query Logs
Enable Slow Logging
Set thresholds per index:
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s",
"index.search.slowlog.threshold.query.debug": "2s",
"index.search.slowlog.threshold.query.trace": "500ms",
"index.search.slowlog.threshold.fetch.warn": "1s",
"index.search.slowlog.threshold.fetch.info": "800ms",
"index.search.slowlog.threshold.fetch.debug": "500ms",
"index.search.slowlog.threshold.fetch.trace": "200ms"
}
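The thresholds are dynamic index settings, so one policy is usually pushed to several indices in a loop. A minimal sketch; the index names and the localhost:9200 endpoint are placeholders, and the curl call is left commented out so nothing is changed by accident:

```shell
# One payload applied to every index that needs it; setting a threshold
# to "-1" disables it. Index names and endpoint are placeholders.
PAYLOAD='{"index.search.slowlog.threshold.query.warn": "10s",
          "index.search.slowlog.threshold.query.info": "5s"}'
for index in my-index logs-2024; do
  echo "would update $index"
  # curl -s -X PUT "localhost:9200/$index/_settings" \
  #   -H 'Content-Type: application/json' -d "$PAYLOAD"
done
```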
Configure Logging Level
In log4j2.properties:
logger.index_search_slowlog.name = index.search.slowlog
logger.index_search_slowlog.level = trace
logger.index_search_slowlog.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog.additivity = false
Log File Location
Default locations:
- Linux: /var/log/elasticsearch/{cluster}_index_search_slowlog.log
- Docker: /usr/share/elasticsearch/logs/
Understanding Slow Log Format
Sample Log Entry
[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] [node-1]
[my-index][0] took[15.2s], took_millis[15234], total_hits[1250],
types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5],
source[{"query":{"bool":{"must":[{"match":{"title":"elasticsearch"}}],
"filter":[{"range":{"date":{"gte":"2024-01-01"}}}]}},"size":10}],
id[]
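Individual fields can be pulled out of an entry with grep -oP, which the analysis recipes below rely on. A quick sketch against the sample entry, condensed to one line with a shortened source[] for readability:

```shell
# Field extraction with grep -oP: \K discards the matched prefix so only
# the value itself is printed. ENTRY is the sample entry above, condensed.
ENTRY='[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] [node-1] [my-index][0] took[15.2s], took_millis[15234], total_hits[1250], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"match_all":{}}}], id[]'
echo "$ENTRY" | grep -oP 'took_millis\[\K[0-9]+'   # 15234
echo "$ENTRY" | grep -oP '^\[\K[^\]]+'             # 2024-01-15T10:30:45,123
```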
Log Entry Components
| Component | Description |
|---|---|
| [2024-01-15T10:30:45,123] | Timestamp |
| [WARN] | Log level (matches the threshold crossed) |
| [index.search.slowlog.query] | Log type (query or fetch) |
| [node-1] | Node that executed the search |
| [my-index][0] | Index name and shard number |
| took[15.2s] | Human-readable duration |
| took_millis[15234] | Duration in milliseconds |
| total_hits[1250] | Number of matching documents |
| search_type[...] | Search type used (e.g. QUERY_THEN_FETCH) |
| total_shards[5] | Number of shards queried |
| source[...] | The actual query JSON |
Analyzing Slow Logs
Extract Common Patterns
# Find most common slow queries by pattern
# (source[...] can itself contain "]", so anchor on the ", id[" field that follows)
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP 'source\[\K.*(?=\], id\[)' | \
sort | uniq -c | sort -rn | head -20
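Grouping on the raw source treats the same query shape with different literal values as distinct entries. A normalization pass groups by shape instead; the sed masks below are a rough sketch (tune them for your queries), and the sample sources stand in for real extracted queries:

```shell
# Mask string and numeric literal values so that queries differing only in
# their values collapse into one pattern before counting.
printf '%s\n' \
  '{"query":{"match":{"title":"foo"}}}' \
  '{"query":{"match":{"title":"bar"}}}' \
  '{"query":{"term":{"user":"alice"}}}' |
  sed -E 's/:"[^"]*"/:"?"/g; s/:[0-9]+/:N/g' |
  sort | uniq -c | sort -rn
```

The two match queries collapse into a single pattern with count 2; the term query stays separate with count 1.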
Find Worst Offenders
# Find queries taking longest
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP 'took_millis\[\K[0-9]+' | \
sort -rn | head -10
# Get full log entries for queries of 10 seconds or more (5+ digit took_millis)
grep "took_millis\[[0-9]\{5,\}\]" elasticsearch_index_search_slowlog.log
Analyze by Index
# Count slow queries per index
# (the index appears as "] [index-name][shard]" after the node name)
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP '\] \[\K[^\]]+(?=\]\[[0-9]+\])' | \
sort | uniq -c | sort -rn
Time Distribution
# Slow queries by hour
grep "slowlog.query" elasticsearch_index_search_slowlog.log | \
grep -oP '^\[\K[^,]+' | \
cut -d'T' -f2 | cut -d':' -f1 | \
sort | uniq -c
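The pipeline above can be checked end to end on a few fabricated entries before pointing it at a real log (timestamps here are made up):

```shell
# Three fabricated entries: two in the 10:00 hour, one in the 14:00 hour.
# Expect a count of 2 for hour 10 and 1 for hour 14.
printf '%s\n' \
  '[2024-01-15T10:30:45,123][WARN ][index.search.slowlog.query] sample' \
  '[2024-01-15T10:55:02,001][WARN ][index.search.slowlog.query] sample' \
  '[2024-01-15T14:01:10,500][WARN ][index.search.slowlog.query] sample' |
  grep -oP '^\[\K[^,]+' | cut -d'T' -f2 | cut -d':' -f1 |
  sort | uniq -c
```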
Query vs Fetch Phase
Query Phase Slow Logs
index.search.slowlog.query
- Time spent finding matching documents
- Produces the scored and sorted document IDs
Common issues:
- Complex queries (regex, wildcards)
- Large number of matching documents
- Expensive scoring
Fetch Phase Slow Logs
index.search.slowlog.fetch
- Time spent retrieving document content
- Loads _source and stored fields
Common issues:
- Large documents
- Many fields requested
- Highlighting on large text fields
Identifying Problematic Queries
Signs of Bad Queries
- Consistent slow timing: Same query pattern always slow
- Wildcard/Regex patterns: Leading wildcards are expensive
- Large aggregations: High cardinality, many buckets
- Deep pagination: High from values
- Script queries: Per-document execution
Query Patterns to Look For
# Find wildcard queries
grep "slowlog" *.log | grep '"wildcard"'
# Find regex queries
grep "slowlog" *.log | grep '"regexp"'
# Find deep pagination
grep "slowlog" *.log | grep '"from":[0-9]\{4,\}'
# Find script queries
grep "slowlog" *.log | grep '"script"'
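The four greps can be rolled into a single tally. A sketch using a throwaway sample file; swap $LOG for your real slow log path:

```shell
# Build a throwaway sample log standing in for real slow log entries.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
source[{"query":{"wildcard":{"name":"*smith"}}}]
source[{"query":{"regexp":{"name":".*ith"}}}]
source[{"query":{"match":{"name":"x"}},"from":10000}]
EOF
# Count each anti-pattern in one pass over the file.
report=$(for pat in '"wildcard"' '"regexp"' '"script"' '"from":[0-9]\{4,\}'; do
  printf '%-22s %s\n' "$pat" "$(grep -c "$pat" "$LOG")"
done)
printf '%s\n' "$report"
rm -f "$LOG"
```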
Creating Actionable Reports
Daily Slow Query Summary
#!/bin/bash
LOG_FILE="/var/log/elasticsearch/*_index_search_slowlog.log"
DATE=$(date +%Y-%m-%d)
echo "Slow Query Report for $DATE"
echo "=============================="
echo -e "\nTotal slow queries:"
grep "$DATE" $LOG_FILE | wc -l
echo -e "\nBy severity:"
grep "$DATE" $LOG_FILE | grep -oP '\]\[\K(WARN|INFO|DEBUG|TRACE)' | sort | uniq -c
echo -e "\nTop 10 slowest (milliseconds):"
grep "$DATE" $LOG_FILE | grep -oP 'took_millis\[\K[0-9]+' | sort -rn | head -10
echo -e "\nAffected indices:"
grep "$DATE" $LOG_FILE | grep -oP '\] \[\K[^\]]+(?=\]\[[0-9]+\])' | sort | uniq -c | sort -rn | head -10
Integration with Monitoring
Ship slow logs to your monitoring stack:
# filebeat.yml
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/elasticsearch/*_index_search_slowlog.log
fields:
log_type: elasticsearch_slowlog
Threshold Recommendations
Development/Testing
{
"index.search.slowlog.threshold.query.warn": "5s",
"index.search.slowlog.threshold.query.info": "2s",
"index.search.slowlog.threshold.query.debug": "1s",
"index.search.slowlog.threshold.query.trace": "500ms"
}
Production
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.query.info": "5s",
"index.search.slowlog.threshold.query.debug": "2s"
}
High-Performance Requirements
{
"index.search.slowlog.threshold.query.warn": "1s",
"index.search.slowlog.threshold.query.info": "500ms",
"index.search.slowlog.threshold.query.debug": "200ms"
}
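To keep environments from drifting, the payloads above can be generated from one place. A sketch; the environment names and the value subset are illustrative, as is the commented-out endpoint:

```shell
# Emit the slow log threshold payload for a given environment, so dev and
# prod settings come from a single definition.
slowlog_settings() {
  case "$1" in
    dev)  printf '{"index.search.slowlog.threshold.query.warn":"5s","index.search.slowlog.threshold.query.info":"2s"}' ;;
    prod) printf '{"index.search.slowlog.threshold.query.warn":"10s","index.search.slowlog.threshold.query.info":"5s"}' ;;
  esac
}
slowlog_settings prod
# Apply it (endpoint assumed):
# curl -X PUT "localhost:9200/my-index/_settings" \
#   -H 'Content-Type: application/json' -d "$(slowlog_settings prod)"
```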
Troubleshooting from Slow Logs
Workflow
1. Identify: Find the slow query in the logs
2. Extract: Get the query source
3. Profile: Run it with the profile API
4. Analyze: Find the expensive operations
5. Optimize: Apply fixes
6. Verify: Monitor the slow logs for improvement
Example: Extracting and Profiling
# Extract the query body from the log entry
# (source[...] can itself contain "]", so anchor on the ", id[" field that follows)
QUERY=$(grep "took_millis\[15234\]" slowlog.log | grep -oP 'source\[\K.*(?=\], id\[)')
# Profile it: $QUERY is already a complete request body, so strip its
# opening brace and splice "profile": true in front
curl -X GET "localhost:9200/my-index/_search?pretty" -H 'Content-Type: application/json' \
  -d "{\"profile\": true, ${QUERY#\{}"
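Splicing "profile": true into an extracted body is easy to get wrong, because the body is already a complete JSON object. One jq-free approach is brace stripping with shell parameter expansion, shown here on a standalone body so it can be verified locally:

```shell
# QUERY mimics a body extracted from source[...]; ${QUERY#\{} drops its
# opening brace, and the literal prefix re-opens the object with profile set.
QUERY='{"query":{"match":{"title":"elasticsearch"}},"size":10}'
BODY="{\"profile\": true, ${QUERY#\{}"
echo "$BODY"   # {"profile": true, "query":{"match":{"title":"elasticsearch"}},"size":10}
```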