Elasticsearch Performance Issues Troubleshooting

Elasticsearch performance issues can manifest in various ways, from slow queries to high resource utilization. This guide provides a systematic approach to identifying and resolving common performance problems in Elasticsearch clusters.

Common Performance Issue Categories

1. Query Performance Issues

  • Slow search responses
  • High query latency
  • Timeout errors during searches

2. Indexing Performance Issues

  • Slow document indexing
  • Bulk request failures
  • High indexing latency

3. Resource Utilization Issues

  • High CPU usage
  • Memory pressure
  • Disk I/O bottlenecks
  • Network saturation

Diagnostic Steps

Step 1: Check Cluster Health

Start by verifying the overall cluster health:

GET /_cluster/health

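A yellow status means at least one replica shard is unassigned; a red status means at least one primary shard is unassigned, so some data is unavailable. Either status points to underlying issues that may contribute to performance problems.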
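
As a rough illustration of how the health response might be triaged in a script, the sketch below reads a few fields from the parsed JSON of GET /_cluster/health. The function name summarize_health is hypothetical, and fetching the response is left to whatever HTTP client you already use.

```python
def summarize_health(health):
    """Return human-readable findings from a parsed GET /_cluster/health response."""
    findings = []
    status = health.get("status", "unknown")
    if status == "red":
        findings.append("red: at least one primary shard is unassigned")
    elif status == "yellow":
        findings.append("yellow: all primaries assigned, some replicas are not")

    # unassigned_shards and number_of_pending_tasks are fields of the health response.
    unassigned = health.get("unassigned_shards", 0)
    if unassigned:
        findings.append(f"{unassigned} unassigned shard(s)")
    pending = health.get("number_of_pending_tasks", 0)
    if pending:
        findings.append(f"{pending} pending cluster task(s)")
    return findings
```

An empty list means the response showed nothing worth following up; anything else is a pointer to the later diagnostic steps.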

Step 2: Identify Hot Threads

Use the hot threads API to identify CPU-intensive operations:

GET /_nodes/hot_threads

This reveals which threads are consuming the most CPU time and what operations they're performing.
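
The hot threads API accepts query parameters such as threads, interval, and type. A minimal sketch of building such a request URL follows; the helper name hot_threads_url and the base URL are assumptions for illustration, and the set of accepted type values may differ between Elasticsearch versions.

```python
from urllib.parse import urlencode

# Thread types assumed here; newer Elasticsearch versions may accept more.
KNOWN_TYPES = ("cpu", "wait", "block")


def hot_threads_url(base_url, threads=3, interval="500ms", thread_type="cpu"):
    """Build a GET /_nodes/hot_threads URL with explicit sampling parameters."""
    if thread_type not in KNOWN_TYPES:
        raise ValueError(f"unexpected thread type: {thread_type}")
    params = urlencode({"threads": threads, "interval": interval, "type": thread_type})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{params}"
```

Note that the response is plain text rather than JSON, so it is usually read by a person instead of parsed.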

Step 3: Review Node Statistics

Check resource utilization across all nodes:

GET /_nodes/stats

Pay attention to:

  • JVM heap usage and garbage collection metrics
  • Thread pool queue sizes and rejections
  • Disk I/O statistics
  • Network metrics
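
The node stats response is large, so it can help to pull out only the values listed above. The sketch below scans a parsed GET /_nodes/stats response for high heap usage and thread pool rejections. The function name find_pressure and the 85% default are illustrative choices, not part of any Elasticsearch API.

```python
def find_pressure(stats, heap_threshold=85):
    """Return (node name, finding, value) tuples from a parsed GET /_nodes/stats response."""
    findings = []
    for node in stats.get("nodes", {}).values():
        name = node.get("name", "unknown")

        # JVM heap usage as a percentage of the configured maximum heap.
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if heap is not None and heap > heap_threshold:
            findings.append((name, "heap", heap))

        # Rejections are cumulative counters per thread pool since node start.
        for pool, pool_stats in node.get("thread_pool", {}).items():
            rejected = pool_stats.get("rejected", 0)
            if rejected > 0:
                findings.append((name, f"{pool} rejections", rejected))
    return findings
```

Because the rejection counters are cumulative, comparing two samples taken a few minutes apart shows whether rejections are still happening.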

Step 4: Analyze Slow Queries

Enable the search slow log on an index to identify problematic queries. The thresholds below are examples; tune them to your own latency targets:

PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "5s",
  "index.search.slowlog.threshold.query.debug": "2s",
  "index.search.slowlog.threshold.query.trace": "500ms"
}
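
To make the effect of these thresholds concrete, the following sketch maps a query duration to the slow log level it would be recorded at. It assumes the most severe level whose threshold is exceeded wins, and it only parses the ms, s, and m time units; the helper names are hypothetical.

```python
import re

_UNITS_MS = {"ms": 1, "s": 1000, "m": 60_000}

# Same thresholds as in the settings request above.
THRESHOLDS = {"warn": "10s", "info": "5s", "debug": "2s", "trace": "500ms"}


def to_millis(value):
    """Convert a time value such as '500ms' or '10s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", value)
    if not match:
        raise ValueError(f"unsupported time value: {value}")
    return int(match.group(1)) * _UNITS_MS[match.group(2)]


def slowlog_level(took_ms, thresholds=THRESHOLDS):
    """Return the slow log level for a query duration, or None if it is not logged."""
    for level in ("warn", "info", "debug", "trace"):
        if level in thresholds and took_ms > to_millis(thresholds[level]):
            return level
    return None
```

With the thresholds above, a 6-second query would land at info and a 100 ms query would not be logged at all.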

Step 5: Check Pending Tasks

Review pending cluster tasks that may indicate bottlenecks:

GET /_cluster/pending_tasks
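
A long-lived entry in this list is usually more telling than the list's length. As a sketch, the function below picks out tasks that have waited longer than a chosen limit from a parsed pending tasks response; the name stuck_tasks and the 30-second default are assumptions.

```python
def stuck_tasks(pending, max_wait_ms=30_000):
    """Return pending cluster tasks queued longer than max_wait_ms, slowest first."""
    tasks = pending.get("tasks", [])
    slow = [t for t in tasks if t.get("time_in_queue_millis", 0) > max_wait_ms]
    return sorted(slow, key=lambda t: t["time_in_queue_millis"], reverse=True)
```

The source field of each returned task describes what triggered it, which often hints at the cause, such as frequent mapping updates or index creation.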

Common Causes and Solutions

Too Many Shards

Symptoms: High memory usage, slow cluster state updates, degraded search performance

Solution: Reduce shard count by:

  • Using appropriate shard sizing (10-50 GB per shard)
  • Implementing index lifecycle management (ILM)
  • Consolidating small indices
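
The shard sizing guideline translates into simple arithmetic. The sketch below estimates a primary shard count from an expected index size; the function name and the 30 GB default target (a midpoint of the 10-50 GB range) are assumptions, and real sizing should also account for growth and query patterns.

```python
import math


def suggest_primary_shards(index_size_gb, target_shard_gb=30):
    """Estimate how many primary shards keep each shard near the target size."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    # Round up so no shard exceeds the target, and never go below one shard.
    return max(1, math.ceil(index_size_gb / target_shard_gb))
```

For example, a 600 GB index with a 30 GB target works out to 20 primaries, while a 5 GB index needs only one.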

Inefficient Queries

Symptoms: Slow query responses, high CPU usage during searches

Solution:

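  • Avoid leading wildcards (for example, *error), which force a scan of many terms
  • Put clauses that do not need relevance scoring in filter context (for example, the filter clause of a bool query); filters skip scoring and frequently used ones can be cached
  • Avoid deep pagination with large from + size values; use search_after instead (by default, from + size cannot exceed 10,000, per index.max_result_window)
  • Reduce the number of buckets that aggregations produce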
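
Two of these points, filter context and avoiding deep pagination, can be combined in one request body. The sketch below builds such a body as a Python dict. The field names status, @timestamp, and event_id are hypothetical, and event_id is assumed to be a unique field usable as a sort tiebreaker.

```python
def build_search_body(status, since, page_size=100, search_after=None):
    """Build a search body that filters without scoring and pages with search_after."""
    body = {
        "size": page_size,
        "query": {
            "bool": {
                # Filter clauses do not contribute to the relevance score.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        # search_after needs a deterministic sort; event_id breaks ties.
        "sort": [{"@timestamp": "asc"}, {"event_id": "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = list(search_after)
    return body
```

For the next page, pass the sort array of the last hit from the previous response as search_after instead of increasing from.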

Insufficient Resources

Symptoms: High resource utilization, frequent garbage collection

Solution:

  • Scale vertically (more memory, faster disks)
  • Scale horizontally (add more nodes)
  • Use SSDs instead of HDDs
  • Size the JVM heap appropriately: no more than 50% of RAM, and below the compressed object pointers threshold, which is somewhat under 32 GB
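
The heap guideline can be written as a one-line calculation. This is a sketch only: the 31 GB ceiling is an assumption chosen to stay just under the 32 GB limit, and the exact compressed object pointers threshold varies by JVM and platform.

```python
def recommended_heap_gb(ram_gb, ceiling_gb=31):
    """Return a heap size of half the RAM, capped below the ~32 GB threshold."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    # Half of RAM stays free for the operating system's filesystem cache.
    return min(ram_gb / 2, ceiling_gb)
```

A 16 GB machine would get an 8 GB heap, while a 128 GB machine would be capped at the ceiling rather than receiving 64 GB.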

Disk I/O Bottlenecks

Symptoms: High iowait, slow indexing and searches

Solution:

  • Use SSDs for data nodes
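  • Increase index.refresh_interval (default 1s) for write-heavy workloads, for example to 30s, so that fewer small segments are written
  • Leave adequate memory for the filesystem cache (about 50% of RAM should remain available to the OS)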
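
Changing the refresh interval is an index settings update. As a small sketch, the helper below builds the JSON body for PUT /my-index/_settings; the helper name is hypothetical, and passing None produces a null value, which resets the setting to its default.

```python
import json


def refresh_interval_settings(interval):
    """Build an index settings body that sets (or, with None, resets) the refresh interval."""
    return {"index": {"refresh_interval": interval}}


# Body to send during a heavy bulk load, and the body to restore the default afterwards.
during_load = json.dumps(refresh_interval_settings("30s"))
after_load = json.dumps(refresh_interval_settings(None))
```

A longer interval delays when newly indexed documents become searchable, so it suits bulk loads better than near-real-time search.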

Monitoring Best Practices

  1. Set up continuous monitoring using tools like Kibana Stack Monitoring, Prometheus, or Datadog
  2. Create alerts for key metrics:
    • JVM heap usage > 85%
    • Thread pool rejections
    • Cluster status changes
    • Disk usage > 80%
  3. Establish baselines to understand normal performance patterns
  4. Monitor thread pool queue depths; queues should normally be near empty, and sustained growth means a node cannot keep up with its workload
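
The alert conditions listed above can be expressed as a small rule check. The sketch below assumes the caller has already flattened the relevant values into a dict; the key names (heap_used_percent, thread_pool_rejections, disk_used_percent, cluster_status) are hypothetical and not taken from any monitoring tool.

```python
def evaluate_alerts(metrics, previous_status=None):
    """Return alert messages for a flat dict of metrics sampled from one node or cluster."""
    alerts = []
    if metrics.get("heap_used_percent", 0) > 85:
        alerts.append("JVM heap usage above 85%")
    if metrics.get("thread_pool_rejections", 0) > 0:
        alerts.append("thread pool rejections detected")
    status = metrics.get("cluster_status")
    # Only compare when a previous sample exists, so the first run stays quiet.
    if previous_status is not None and status != previous_status:
        alerts.append(f"cluster status changed from {previous_status} to {status}")
    if metrics.get("disk_used_percent", 0) > 80:
        alerts.append("disk usage above 80%")
    return alerts
```

In practice a monitoring system would evaluate rules like these continuously; the function only shows how the thresholds fit together.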

Performance Tuning Checklist

  • Heap size is no more than 50% of RAM and stays below 32 GB
  • Using SSDs for data storage
  • Shards sized between 10-50 GB
  • Slow query logging enabled
  • Monitoring and alerting configured
  • Index lifecycle management implemented
  • Query patterns optimized
  • Bulk indexing used for high-volume writes
