Optimizing Elasticsearch search queries is essential for maintaining fast response times and efficient resource utilization. This guide covers proven techniques to improve query performance.
Query Optimization Fundamentals
Use Query vs Filter Context
- Query context: calculates relevance scores (slower)
- Filter context: yes/no match only, no scoring, cacheable (faster)
// Optimized: Use filter for exact matches
{
"query": {
"bool": {
"must": {
"match": {"title": "search term"}
},
"filter": [
{"term": {"status": "published"}},
{"range": {"date": {"gte": "2024-01-01"}}}
]
}
}
}
Avoid Common Anti-Patterns
| Anti-Pattern | Problem | Better Alternative |
|---|---|---|
| Leading wildcards (*term) | Full index scan | Reverse token filter, ngrams |
| Deep pagination (from: 10000) | Loads all preceding docs | search_after |
| Script scoring | Per-document execution | Pre-compute during indexing |
| Aggregations on text fields | Fielddata memory overhead | Use keyword type |
Specific Optimization Techniques
1. Optimize Wildcards
Problem:
// Slow - scans all terms
{"wildcard": {"name": "*smith"}}
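Solution: Index edge ngrams so that prefix matches become plain term lookups. For a suffix pattern such as *smith, reverse the tokens first (a reverse token filter ahead of an edge_ngram token filter) so the suffix becomes a prefix. The mapping below covers the edge ngram part: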
PUT /optimized_index
{
"settings": {
"analysis": {
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 10
}
},
"analyzer": {
"edge_ngram_analyzer": {
"tokenizer": "edge_ngram_tokenizer",
"filter": ["lowercase"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"fields": {
"ngram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "standard" // Analyze the query text normally, not as ngrams
}
}
}
}
}
}
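To see why this removes the term scan, the following Python sketch approximates what an edge ngram analyzer emits for a single-word value. It is an illustration under simplifying assumptions (one token, lowercase filter only); `edge_ngrams` is a hypothetical helper, not part of Elasticsearch.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the prefixes of `token` with length between min_gram and max_gram."""
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# Terms that would be written to the index for the value "Smithson".
indexed_terms = set(edge_ngrams("Smithson"))
print(sorted(indexed_terms, key=len))

# A prefix search is now an exact membership test instead of a scan over all terms.
print("smi" in indexed_terms)

# Reversing the token first turns a suffix such as "son" into a prefix.
reversed_terms = set(edge_ngrams("Smithson"[::-1]))
print("son"[::-1] in reversed_terms)
```

The trade-off is index size: every value is stored once per prefix length, so keep max_gram as small as the use case allows.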
2. Optimize Pagination
Problem:
// Slow for deep pages
{"from": 10000, "size": 10}
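Solution: Use search_after with a sort that ends in a unique tiebreaker field, so that no two documents share the same sort values: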
// First query
{
"size": 10,
"query": {"match_all": {}},
"sort": [
{"date": "desc"},
{"_id": "asc"}
]
}
// Subsequent queries
{
"size": 10,
"query": {"match_all": {}},
"sort": [
{"date": "desc"},
{"_id": "asc"}
],
"search_after": ["2024-01-15T10:30:00", "abc123"]
}
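The two requests above generalize to a loop: each page's last sort values become the next request's search_after. The Python sketch below shows that loop against an in-memory stand-in. Both `iter_hits` and `fake_search` are hypothetical names, and the `search` callable is an assumption standing in for whatever client you use.

```python
def iter_hits(search, body, page_size=10):
    """Yield every hit, feeding each page's last sort values into search_after.

    `search` is any callable that takes a request body (dict) and returns
    the parsed response (dict). It is an assumption, not a client API.
    """
    request = dict(body, size=page_size)
    while True:
        hits = search(request)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        request = dict(request, search_after=hits[-1]["sort"])


def fake_search(request):
    """In-memory stand-in for a cluster: 25 docs sorted by an integer key."""
    docs = [{"_id": str(i), "sort": [i]} for i in range(25)]
    after = request.get("search_after")
    if after is not None:
        docs = [d for d in docs if d["sort"] > after]
    return {"hits": {"hits": docs[: request["size"]]}}


body = {"query": {"match_all": {}}, "sort": [{"rank": "asc"}]}
ids = [hit["_id"] for hit in iter_hits(fake_search, body)]
print(len(ids))
```

Each request costs the same regardless of how far into the result set it is, which is the point of replacing from/size.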
3. Optimize Aggregations
Reduce bucket count:
{
"aggs": {
"categories": {
"terms": {
"field": "category",
"size": 20, // Only what you need
"shard_size": 100 // Balance accuracy vs performance
}
}
}
}
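As a rough worked example of the trade-off: when shard_size is not set, the terms aggregation defaults it to size * 1.5 + 10, and the coordinating node merges up to shard_size buckets from every shard. The Python sketch below does that arithmetic; the shard count of 5 is an assumption chosen for illustration.

```python
def default_shard_size(size):
    """Default shard_size of a terms aggregation: size * 1.5 + 10."""
    return int(size * 1.5 + 10)


def candidate_buckets(shard_size, shard_count):
    """Upper bound on the buckets the coordinating node has to merge."""
    return shard_size * shard_count


shards = 5  # assumed shard count, for illustration only
print(default_shard_size(20))
print(candidate_buckets(default_shard_size(20), shards))
print(candidate_buckets(100, shards))
```

Raising shard_size improves the accuracy of the top buckets at the cost of more per-shard work and a larger merge.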
Use composite for high-cardinality:
{
"aggs": {
"my_buckets": {
"composite": {
"size": 100,
"sources": [
{"category": {"terms": {"field": "category"}}}
]
}
}
}
}
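A composite aggregation is read page by page: each response carries an after_key, which goes into the next request as "after". The Python sketch below shows that loop against an in-memory stand-in; `iter_composite_buckets` and `fake_composite_search` are hypothetical names, and the category values are made up.

```python
import copy


def iter_composite_buckets(search, body, agg_name):
    """Yield all buckets of a composite aggregation, one page at a time.

    `search` is a hypothetical callable: request body (dict) in,
    parsed response (dict) out.
    """
    request = copy.deepcopy(body)  # leave the caller's body untouched
    while True:
        agg = search(request)["aggregations"][agg_name]
        if not agg["buckets"]:
            return
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        if after_key is None:
            return
        # after_key tells the next request where to resume.
        request["aggs"][agg_name]["composite"]["after"] = after_key


def fake_composite_search(request):
    """In-memory stand-in for a cluster holding five category values."""
    composite = request["aggs"]["my_buckets"]["composite"]
    values = ["books", "games", "music", "tools", "toys"]
    after = composite.get("after")
    if after is not None:
        values = [v for v in values if v > after["category"]]
    page = values[: composite["size"]]
    result = {"buckets": [{"key": {"category": v}, "doc_count": 1} for v in page]}
    if page:
        result["after_key"] = {"category": page[-1]}
    return {"aggregations": {"my_buckets": result}}


body = {"size": 0, "aggs": {"my_buckets": {"composite": {
    "size": 2,
    "sources": [{"category": {"terms": {"field": "category"}}}],
}}}}
keys = [b["key"]["category"]
        for b in iter_composite_buckets(fake_composite_search, body, "my_buckets")]
print(keys)
```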
4. Optimize Range Queries
Use date math:
{
"query": {
"range": {
"timestamp": {
"gte": "now-1d/d", // Rounded = more cacheable
"lt": "now/d"
}
}
}
}
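Rounding helps caching because the resolved bounds stop changing with every request. The Python sketch below resolves the two expressions above to concrete instants, assuming UTC (the default for date math) and treating the range as half-open; `yesterday_bounds` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def yesterday_bounds(now):
    """Resolve gte "now-1d/d" and lt "now/d" to concrete UTC instants."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_today - timedelta(days=1), start_of_today


morning = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
evening = datetime(2024, 1, 15, 22, 5, tzinfo=timezone.utc)

# Both calls resolve to the same range, so a cached result can be reused all day.
print(yesterday_bounds(morning) == yesterday_bounds(evening))
```

With an unrounded bound such as "now-1d", every request resolves to a different millisecond and is much less likely to hit a cache.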
5. Limit Return Fields
{
"_source": ["title", "date", "author"], // Only needed fields
"query": {"match": {"title": "elasticsearch"}}
}
Or exclude heavy fields:
{
"_source": {
"excludes": ["content", "attachments"]
},
"query": {"match": {"title": "elasticsearch"}}
}
6. Use Stored Fields for Specific Fields
PUT /my_index
{
"mappings": {
"properties": {
"title": {
"type": "text",
"store": true // Separately retrievable
}
}
}
}
// Retrieve only stored fields
{
"stored_fields": ["title"],
"query": {"match_all": {}}
}
Query Patterns to Avoid
Avoid match_all with Large Size
// Bad - loads everything
{
"size": 10000,
"query": {"match_all": {}}
}
// Better - use scroll or search_after
POST /my_index/_search?scroll=1m
{
"size": 1000,
"query": {"match_all": {}}
}
Avoid Regex When Possible
// Slow
{"regexp": {"path": ".*error.*"}}
// Simpler, but not reliably faster: the leading * still scans all terms
{"wildcard": {"path": "*error*"}}
// Best - restructure data for term queries
{"term": {"contains_error": true}}
Avoid Nested Queries on Large Documents
// Slow if many nested docs
{
"query": {
"nested": {
"path": "comments",
"query": {"match": {"comments.text": "great"}}
}
}
}
// Consider denormalizing for frequently queried data
Index-Level Optimizations
Configure Index for Search
PUT /search-optimized/_settings
{
"index.refresh_interval": "5s",
"index.number_of_replicas": 2, // More replicas = more search capacity
"index.search.idle.after": "30s"
// index.queries.cache.enabled is a static setting (default: true), so set it at index creation rather than here
}
Pre-warm Queries
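For critical queries, store the query as a search template so the same request can be replayed to warm the caches, for example after a restart (the `field` and `value` parameter names are placeholders):
PUT _scripts/my_search_template
{
"script": {
"lang": "mustache",
"source": {
"query": {
"match": {
"{{field}}": "{{value}}"
}
}
}
}
}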
Enable Request Cache
// request_cache is a URL parameter, not a field in the request body
GET /my-index/_search?request_cache=true
{
"query": {"match_all": {}}
}
Query Profiling
Use Profile API
{
"profile": true,
"query": {
"match": {"title": "elasticsearch"}
}
}
Analyze Results
Look for:
- High time_in_nanos values
- Unexpected query rewrites
- Expensive operations (regex, wildcards)
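Profile output is deeply nested, so a small script helps find the expensive parts. The Python sketch below walks the query tree of a profile response and lists the slowest components. `slowest_components` is a hypothetical helper, and the sample response is made up and trimmed to the fields the function reads.

```python
def slowest_components(profile_response, top=3):
    """Return (time_in_nanos, type, description) for the slowest query nodes."""
    found = []

    def walk(node):
        found.append((node["time_in_nanos"], node["type"], node["description"]))
        for child in node.get("children", []):
            walk(child)

    for shard in profile_response["profile"]["shards"]:
        for search in shard["searches"]:
            for node in search["query"]:
                walk(node)
    return sorted(found, reverse=True)[:top]


# Illustrative response, not real cluster output.
sample = {"profile": {"shards": [{"searches": [{"query": [{
    "type": "BooleanQuery",
    "description": "+title:elasticsearch #status:published",
    "time_in_nanos": 900,
    "children": [
        {"type": "TermQuery", "description": "title:elasticsearch", "time_in_nanos": 600},
        {"type": "TermQuery", "description": "status:published", "time_in_nanos": 200},
    ],
}]}]}]}}

for nanos, query_type, description in slowest_components(sample, top=2):
    print(nanos, query_type, description)
```

A parent's time includes its children, so compare siblings with each other rather than a child with its parent.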
Use Explain API
GET /my-index/_explain/doc_id
{
"query": {
"match": {"title": "elasticsearch"}
}
}
Performance Checklist
Before deploying queries:
- Using filter context for non-scoring clauses
- No leading wildcards
- No deep pagination (use search_after)
- Aggregation sizes limited
- Only required fields in _source
- Timeout configured
- Profile API run on complex queries
- Tested with production data volume
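Some of these checks can be automated before a query ships. The Python sketch below inspects a request body for a few of the items above. It is a starting point, not a complete linter: `lint_search_body` and `find_leading_wildcards` are hypothetical helpers, and the `deep_from` threshold of 1000 is an arbitrary assumption.

```python
def find_leading_wildcards(node):
    """Recursively collect wildcard patterns that start with * or ?."""
    patterns = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "wildcard" and isinstance(value, dict):
                for spec in value.values():
                    # Accept both {"field": "pat"} and {"field": {"value": "pat"}}.
                    pattern = spec.get("value", "") if isinstance(spec, dict) else spec
                    if isinstance(pattern, str) and pattern[:1] in ("*", "?"):
                        patterns.append(pattern)
            else:
                patterns.extend(find_leading_wildcards(value))
    elif isinstance(node, list):
        for item in node:
            patterns.extend(find_leading_wildcards(item))
    return patterns


def lint_search_body(body, deep_from=1000):
    """Return a list of warnings for a search request body."""
    warnings = []
    for pattern in find_leading_wildcards(body.get("query", {})):
        warnings.append(f"leading wildcard: {pattern}")
    if body.get("from", 0) >= deep_from:
        warnings.append("deep pagination: use search_after")
    if "_source" not in body:
        warnings.append("_source not restricted")
    if "timeout" not in body:
        warnings.append("no timeout set")
    return warnings


risky = {
    "from": 10000,
    "size": 10,
    "query": {"bool": {"filter": [{"wildcard": {"name": "*smith"}}]}},
}
for warning in lint_search_body(risky):
    print(warning)
```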
Monitoring Query Performance
Enable Slow Logs
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "5s",
"index.search.slowlog.threshold.query.info": "2s"
}
Track Metrics
- Search latency (p50, p95, p99)
- Search rate
- Cache hit ratio
- Query rejections
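If your monitoring stack does not already compute these, the Python sketch below shows one way to derive latency percentiles from raw samples and a cache hit ratio from hit and miss counters. The latency values are made up, and `latency_percentiles` and `cache_hit_ratio` are hypothetical helpers; the counters correspond to the hit_count and miss_count fields in the cache sections of the index stats.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 of a list of latency samples (needs >= 2 samples)."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def cache_hit_ratio(hit_count, miss_count):
    """Fraction of cache lookups that were hits; 0.0 when there were none."""
    total = hit_count + miss_count
    return hit_count / total if total else 0.0


# Made-up samples: mostly fast, with two slow outliers.
samples = [12, 15, 11, 14, 250, 13, 16, 18, 17, 900]
print(latency_percentiles(samples))
print(cache_hit_ratio(90, 10))
```

The outliers barely move p50 but dominate p95 and p99, which is why the tail percentiles are worth tracking separately from the median.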