Diagnosing Slow Queries in Elasticsearch

Slow queries in Elasticsearch impact user experience and can overwhelm cluster resources. This guide provides systematic approaches to identify, analyze, and fix slow-running queries.

Identifying Slow Queries

Enable Slow Query Logging

Slow logging is disabled by default (every threshold is -1). Configure thresholds for the query and fetch phases at the index level:

PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.query.debug": "2s",
  "index.search.slowlog.threshold.query.trace": "500ms",
  "index.search.slowlog.threshold.fetch.warn": "1s",
  "index.search.slowlog.threshold.fetch.info": "800ms",
  "index.search.slowlog.threshold.fetch.debug": "500ms",
  "index.search.slowlog.threshold.fetch.trace": "200ms"
}

Check Slow Logs

Slow logs are written to separate log files:

# Location varies by installation
cat /var/log/elasticsearch/*_index_search_slowlog.log

Use the Tasks API

List the search tasks that are running right now; detailed=true adds a description of each request:

GET /_tasks?actions=*search*&detailed=true
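
The response groups tasks by node, and each task reports running_time_in_nanos. As a minimal sketch (Python is used here only for illustration, and the sample values are made up), the long-running searches can be picked out of the parsed response like this:

```python
def slow_search_tasks(tasks_response, min_seconds=5.0):
    """Return (task_id, seconds, description) for tasks running at least min_seconds."""
    threshold_nanos = int(min_seconds * 1_000_000_000)
    slow = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            running = task.get("running_time_in_nanos", 0)
            if running >= threshold_nanos:
                slow.append((task_id, running / 1e9, task.get("description", "")))
    # Longest-running first
    slow.sort(key=lambda item: item[1], reverse=True)
    return slow


# Trimmed sample of a _tasks response; IDs, descriptions and timings are illustrative
sample = {
    "nodes": {
        "node-1": {
            "tasks": {
                "node-1:101": {
                    "action": "indices:data/read/search",
                    "description": "indices[my-index]",
                    "running_time_in_nanos": 12_000_000_000,
                },
                "node-1:102": {
                    "action": "indices:data/read/search",
                    "description": "indices[other-index]",
                    "running_time_in_nanos": 300_000_000,
                },
            }
        }
    }
}

for task_id, seconds, description in slow_search_tasks(sample):
    print(f"{task_id} {seconds:.1f}s {description}")
```

A task found this way can be cancelled with POST /_tasks/<task_id>/_cancel.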

Monitor with Cat APIs

GET /_cat/nodes?v&h=name,search.query_current,search.query_time,search.query_total

Analyzing Query Performance

Profile API

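Setting profile to true returns a detailed timing breakdown for each query component on each shard. Profiling adds overhead, so use it for debugging rather than on every request: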

GET /my-index/_search
{
  "profile": true,
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

Understanding Profile Output

{
  "profile": {
    "shards": [{
      "searches": [{
        "query": [{
          "type": "TermQuery",
          "description": "title:elasticsearch",
          "time_in_nanos": 1234567,
          "breakdown": {
            "score": 123456,
            "build_scorer_count": 1,
            "match_count": 100,
            "create_weight": 12345,
            "next_doc": 234567,
            "advance": 0
          }
        }]
      }]
    }]
  }
}

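Key metrics:

time_in_nanos: Total time spent in this query component, including its children
score: Time spent scoring matching documents
next_doc: Time spent moving to the next matching document
advance: Time spent skipping ahead to a target document, typically when clauses are intersected
Fields ending in _count: Number of times the operation ran, not a duration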
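
Profile output for real queries is a tree repeated per shard, which is tedious to read by eye. The sketch below (Python for illustration; the sample response is invented) flattens the tree found under the top-level profile key of a search response and ranks components by time_in_nanos. It assumes nested components appear under a children key.

```python
def slowest_components(response, top=5):
    """Flatten the profile tree and return the `top` slowest query components."""
    rows = []

    def walk(node, shard_id, depth):
        rows.append((node["time_in_nanos"], shard_id, depth, node["type"], node["description"]))
        for child in node.get("children", []):
            walk(child, shard_id, depth + 1)

    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for root in search["query"]:
                walk(root, shard.get("id", "?"), 0)

    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]


# Invented sample: a bool query with two term clauses on one shard
sample = {
    "profile": {
        "shards": [{
            "id": "[node-1][my-index][0]",
            "searches": [{
                "query": [{
                    "type": "BooleanQuery",
                    "description": "+title:elasticsearch +status:published",
                    "time_in_nanos": 900,
                    "children": [
                        {"type": "TermQuery", "description": "title:elasticsearch", "time_in_nanos": 600},
                        {"type": "TermQuery", "description": "status:published", "time_in_nanos": 200},
                    ],
                }]
            }]
        }]
    }
}

for nanos, shard_id, depth, kind, description in slowest_components(sample):
    print(f"{nanos:>8} ns  {'  ' * depth}{kind}  {description}  {shard_id}")
```

Because a parent's time includes its children, the parent always ranks at or above them; the depth column helps tell which child accounts for the cost.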

Explain API

Check whether a specific document matches a query and how its score was computed:

GET /my-index/_explain/doc_id
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

Common Causes of Slow Queries

1. Leading Wildcards

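Problem: A leading wildcard such as *term cannot use the sorted term dictionary to narrow the search, so Elasticsearch has to examine every term in the field.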

// Slow
{
  "query": {
    "wildcard": {
      "name": "*smith"
    }
  }
}

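Solution: Index a reversed copy of the field with the reverse token filter, so a suffix search becomes a prefix search on the reversed terms, or use n-grams: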

PUT /my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "reverse_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "reverse"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "reverse": {
            "type": "text",
            "analyzer": "reverse_analyzer"
          }
        }
      }
    }
  }
}
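
With that mapping in place, a search for names ending in "smith" becomes a prefix query for "htims" on name.reverse. The prefix query is not analyzed, so the client has to lowercase and reverse the input itself. A minimal sketch in Python (the helper name is hypothetical):

```python
def suffix_query(field, suffix):
    """Build a prefix query on the reversed sub-field that matches terms ending in `suffix`."""
    # Mirror the analyzer: lowercase, then reverse
    reversed_prefix = suffix.lower()[::-1]
    return {"query": {"prefix": {f"{field}.reverse": reversed_prefix}}}


print(suffix_query("name", "Smith"))
```

The resulting body can be sent to GET /my-index/_search in place of the wildcard query.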

2. Deep Pagination

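Problem: With a high from value, every shard must collect and sort from + size documents, and the coordinating node then discards almost all of them.

// Slow - each shard collects 10000 docs, 9990 are discarded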
{
  "from": 9990,
  "size": 10,
  "query": { ... }
}

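Solution: Use search_after, passing the sort values of the last hit from the previous page. The sort needs a unique tiebreaker so that no document is skipped or repeated: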

{
  "size": 10,
  "query": { ... },
  "search_after": [1463538857, "doc_id_123"],
  "sort": [
    {"date": "asc"},
    {"_id": "asc"}
  ]
}
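
Paging this way is a loop: run the query, read the sort array of the last hit, and send it back as search_after. The sketch below shows that loop in Python. The search argument is a placeholder for whatever client call you use, and fake_search is an in-memory stand-in so the example runs without a cluster.

```python
def iterate_hits(search, base_body, page_size=100):
    """Yield every hit, paging with search_after.

    `search` is any callable that takes a request body (dict) and returns
    the parsed search response (dict). `base_body` must contain a sort
    with a unique tiebreaker.
    """
    body = dict(base_body, size=page_size)
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The sort values of the last hit become the cursor for the next page
        body = dict(body, search_after=hits[-1]["sort"])


def fake_search(body):
    """Stand-in for a real client: five documents sorted by (date, _id)."""
    docs = [{"_id": f"doc_{i}", "sort": [1000 + i, f"doc_{i}"]} for i in range(5)]
    after = body.get("search_after")
    if after is not None:
        docs = [d for d in docs if d["sort"] > after]
    return {"hits": {"hits": docs[: body["size"]]}}


request = {"query": {"match_all": {}}, "sort": [{"date": "asc"}, {"_id": "asc"}]}
for hit in iterate_hits(fake_search, request, page_size=2):
    print(hit["_id"])
```

Each request stays cheap because every shard only collects size documents past the cursor, regardless of how deep the client has paged.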

3. Script Queries

Problem: A script runs once for every document the query matches, and with match_all that is every document in the index.

// Slow
{
  "query": {
    "script_score": {
      "query": {"match_all": {}},
      "script": {
        "source": "doc['price'].value * doc['quantity'].value"
      }
    }
  }
}

Solution: Pre-compute values during indexing:

// Add computed field during indexing
{
  "price": 10,
  "quantity": 5,
  "total_value": 50  // Pre-computed
}
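
One place to do this is in the indexing code, before the document is sent to Elasticsearch. A minimal sketch in Python, using the field names from the example above (the helper name is hypothetical):

```python
def with_total_value(doc):
    """Return a copy of the document with total_value computed from price and quantity."""
    enriched = dict(doc)
    enriched["total_value"] = doc["price"] * doc["quantity"]
    return enriched


print(with_total_value({"price": 10, "quantity": 5}))
```

Queries can then sort or filter on total_value directly, for example with "sort": [{"total_value": "desc"}], instead of running a script per document.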

4. Aggregations on High-Cardinality Fields

Problem: A terms aggregation on a high-cardinality field tracks a bucket for every unique value on each shard, which costs memory and CPU. A large size makes it worse.

// Slow if user_id has millions of unique values
{
  "aggs": {
    "users": {
      "terms": {
        "field": "user_id",
        "size": 10000
      }
    }
  }
}

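Solution:

Reduce the size parameter to the number of buckets you actually display
Page through all buckets with a composite aggregation
Use a sampler aggregation to aggregate over only the top-scoring documents on each shard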
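
A composite aggregation returns buckets in pages: each response carries an after_key that is passed back as after in the next request. The following Python sketch shows that loop. The search argument is a placeholder for your client call, the source name "value" is an arbitrary choice, and fake_search is an in-memory stand-in so the example runs on its own.

```python
def composite_pages(search, field, page_size=1000, agg_name="users"):
    """Yield every bucket of a composite terms aggregation, one page at a time."""
    after = None
    while True:
        composite = {"size": page_size, "sources": [{"value": {"terms": {"field": field}}}]}
        if after is not None:
            composite["after"] = after
        response = search({"size": 0, "aggs": {agg_name: {"composite": composite}}})
        agg = response["aggregations"][agg_name]
        if not agg["buckets"]:
            return
        yield from agg["buckets"]
        after = agg.get("after_key")
        if after is None:
            return


def fake_search(body):
    """Stand-in for a real client: five user IDs, one document each."""
    composite = body["aggs"]["users"]["composite"]
    values = ["u1", "u2", "u3", "u4", "u5"]
    after = composite.get("after")
    if after is not None:
        values = [v for v in values if v > after["value"]]
    page = values[: composite["size"]]
    agg = {"buckets": [{"key": {"value": v}, "doc_count": 1} for v in page]}
    if page:
        agg["after_key"] = {"value": page[-1]}
    return {"aggregations": {"users": agg}}


for bucket in composite_pages(fake_search, "user_id", page_size=2):
    print(bucket["key"]["value"], bucket["doc_count"])
```

Memory use per request is bounded by page_size rather than by the number of unique values in the field.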

5. Queries Without Filters

Problem: A full-text query with nothing to narrow it must score every document that matches the text, across the whole index.

// Slow - scores every matching document in the index
{
  "query": {
    "match": {
      "description": "search term"
    }
  }
}

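Solution: Put yes/no conditions in filter context. Filter clauses shrink the set of documents to score, skip scoring themselves, and can be cached: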

{
  "query": {
    "bool": {
      "must": {
        "match": {"description": "search term"}
      },
      "filter": [
        {"term": {"status": "published"}},
        {"range": {"date": {"gte": "2024-01-01"}}}
      ]
    }
  }
}

6. Too Many Clauses

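Problem: A bool query with hundreds of term clauses evaluates and scores each clause separately, and can run into the cluster's maximum clause count.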

// Slow
{
  "query": {
    "bool": {
      "should": [
        {"term": {"tag": "tag1"}},
        {"term": {"tag": "tag2"}},
        // ... 500 more terms
      ]
    }
  }
}

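Solution: Use a single terms query. It matches documents containing any of the values and gives them all the same score, so it only fits when per-term relevance does not matter: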

{
  "query": {
    "terms": {
      "tag": ["tag1", "tag2", ..., "tag500"]
    }
  }
}

Query Optimization Techniques

Use Filters for Exact Matches

// Filter context - no scoring, eligible for caching
{
  "query": {
    "bool": {
      "filter": {
        "term": {"status": "active"}
      }
    }
  }
}

Limit Result Size

{
  "size": 10,  // Only what you need
  "query": { ... }
}

Use Source Filtering

{
  "_source": ["title", "date"],  // Only fields you need
  "query": { ... }
}

Set Query Timeout

{
  "timeout": "10s",
  "query": { ... }
}

The timeout is best effort. When it expires, Elasticsearch returns the hits collected so far and sets timed_out to true in the response, so callers should check that flag before treating results as complete.

Use Query Caching

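Filter clauses are cached automatically by the node query cache. The setting below controls the separate shard request cache, which stores the results of size: 0 requests such as aggregations and counts. It is enabled by default; this makes sure it is on for the index: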

PUT /my-index/_settings
{
  "index.requests.cache.enable": true
}

Monitoring and Prevention

Track Query Performance

Set up monitoring for:

Average query latency
95th percentile query latency
Queries per second
Slow query count
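
Two of these numbers can be derived with very little code. The sketch below (Python for illustration; all values are invented) computes average latency from two samples of the query_total and query_time_in_millis counters that GET /_nodes/stats/indices/search reports, and a nearest-rank 95th percentile from took values collected by the client. The counters track the query phase on each shard, so treat the average as a trend indicator rather than end-to-end latency.

```python
import math


def average_latency_ms(earlier, later):
    """Average query-phase latency between two samples of the search stats counters."""
    queries = later["query_total"] - earlier["query_total"]
    if queries <= 0:
        return None  # no queries in the interval, or counters were reset
    return (later["query_time_in_millis"] - earlier["query_time_in_millis"]) / queries


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Invented samples taken one minute apart
before = {"query_total": 100, "query_time_in_millis": 5000}
after = {"query_total": 150, "query_time_in_millis": 6000}
print(average_latency_ms(before, after))

# Invented `took` values (milliseconds) from ten search responses
took_ms = [12, 15, 11, 240, 14, 13, 16, 18, 17, 19]
print(percentile(took_ms, 95))
```

The single 240 ms outlier barely moves the average but dominates the 95th percentile, which is why both are worth tracking.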

Implement Query Governance

Set default timeouts
Limit maximum result sizes
Validate queries before execution
Rate limit expensive operations
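
The first two rules can be enforced in a thin layer in front of the cluster. A minimal sketch in Python, where the function name and the limits are hypothetical and should be tuned to your own workload:

```python
def govern(body, default_timeout="10s", max_size=1000, max_window=10000):
    """Return a copy of a search body with a timeout applied, rejecting oversized requests."""
    governed = dict(body)
    # Keep an explicit timeout if the caller set one, otherwise apply the default
    governed.setdefault("timeout", default_timeout)

    size = governed.get("size", 10)
    offset = governed.get("from", 0)
    if size > max_size:
        raise ValueError(f"size {size} exceeds the limit of {max_size}")
    if offset + size > max_window:
        raise ValueError(f"from + size {offset + size} exceeds the limit of {max_window}")
    return governed


print(govern({"query": {"match_all": {}}}))
```

Rejecting the request in the application keeps an expensive query from ever reaching the cluster, and the error message tells the caller which rule it broke.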

Regular Review

Check slow logs weekly
Profile top 10 slowest queries
Update mappings and index settings based on query patterns