Elasticsearch search.max_buckets Error

The too_many_buckets_exception occurs when an aggregation attempts to create more buckets than the search.max_buckets limit allows. This guide explains how to diagnose and resolve this error.

Understanding the Error

Error Message

{
  "error": {
    "root_cause": [{
      "type": "too_many_buckets_exception",
      "reason": "Trying to create too many buckets. Must be less than or equal to: [65536] but was [100000]..."
    }]
  }
}

Why This Limit Exists

The bucket limit prevents:

  • Memory exhaustion from large aggregations
  • Extreme response times
  • Cluster instability

Default Limit

  • Elasticsearch 7.x and later: 65,536 buckets (raised from the 10,000 default used in early 7.x releases)
  • Per-search limit, not per-aggregation
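
To confirm what your cluster is actually using, you can read the setting back from the cluster settings API. A minimal sketch with the official Python client (the localhost URL and lack of auth are assumptions; adjust connection details for your cluster):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local, unsecured cluster

# include_defaults returns the built-in default even if it was never overridden;
# flat_settings keeps keys in "search.max_buckets" form
resp = es.cluster.get_settings(include_defaults=True, flat_settings=True)

for section in ("persistent", "transient", "defaults"):
    value = resp[section].get("search.max_buckets")
    if value is not None:
        print(f"{section}: search.max_buckets = {value}")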

Diagnosing the Issue

Identify Problematic Aggregation

Check which aggregation is creating too many buckets:

  1. High-cardinality terms: Large size on a terms aggregation over a high-cardinality field (see the cardinality check below)
  2. Date histogram: Fine granularity over long periods
  3. Multi-level nesting: Bucket explosion from nested aggregations
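
For the high-cardinality case, it helps to measure field cardinality before running the full aggregation: a cardinality aggregation returns an approximate distinct count, which is roughly the number of buckets a terms aggregation on that field would create. A minimal sketch (the events index and field names are illustrative, not from your cluster):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust connection details for your cluster

# Approximate distinct counts for the fields you plan to bucket on
resp = es.search(
    index="events",
    size=0,
    aggs={
        "categories": {"cardinality": {"field": "category.keyword"}},
        "cities": {"cardinality": {"field": "city"}},
    },
)

for name, agg in resp["aggregations"].items():
    print(f"{name}: ~{agg['value']} potential buckets")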

Calculate Bucket Count

Total buckets = agg1_buckets × agg2_buckets × ... × aggN_buckets

Example:
- 1000 categories × 365 days × 24 hours = 8,760,000 buckets (way over the limit!)

Solutions

Solution 1: Reduce Bucket Count (Recommended)

Reduce terms size:

{
  "aggs": {
    "categories": {
      "terms": {
        "field": "category.keyword",
        "size": 100  // Instead of 10000
      }
    }
  }
}
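// Returns at most 100 category buckets instead of up to 10,000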

Increase date histogram interval:

{
  "aggs": {
    "over_time": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day"  // Instead of "hour" or "minute"
      }
    }
  }
}
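// One bucket per day instead of 24 per day (1,440 per day at "minute" granularity)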

Solution 2: Use Composite Aggregation

A composite aggregation returns buckets one page at a time, so each request stays well under the limit no matter how many combinations exist:

{
  "size": 0,
  "aggs": {
    "my_buckets": {
      "composite": {
        "size": 1000,
        "sources": [
          {"category": {"terms": {"field": "category.keyword"}}},
          {"date": {"date_histogram": {"field": "timestamp", "calendar_interval": "day"}}}
        ]
      }
    }
  }
}

// Next page: set "after" to the after_key returned in the previous response
{
  "size": 0,
  "aggs": {
    "my_buckets": {
      "composite": {
        "size": 1000,
        "after": {"category": "last_value", "date": 1704067200000},
        "sources": [...]
      }
    }
  }
}
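
Rather than copying values by hand, a client typically loops, feeding each response's after_key into the next request. A minimal pagination sketch, assuming the 8.x Python client, a local cluster, and the events index and field names from the example above:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust connection details for your cluster

sources = [
    {"category": {"terms": {"field": "category.keyword"}}},
    {"date": {"date_histogram": {"field": "timestamp", "calendar_interval": "day"}}},
]

after_key = None
while True:
    composite = {"size": 1000, "sources": sources}
    if after_key:
        composite["after"] = after_key  # resume after the last bucket of the previous page

    resp = es.search(index="events", size=0,
                     aggs={"my_buckets": {"composite": composite}})

    buckets = resp["aggregations"]["my_buckets"]["buckets"]
    for bucket in buckets:
        print(bucket["key"], bucket["doc_count"])  # process each bucket

    # after_key is the key of the last returned bucket; stop once a page comes back empty
    after_key = resp["aggregations"]["my_buckets"].get("after_key")
    if not buckets or after_key is None:
        break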

Solution 3: Filter Data First

Reduce the time range or add filters:

{
  "query": {
    "bool": {
      "filter": [
        {"range": {"timestamp": {"gte": "now-7d"}}},
        {"term": {"status": "active"}}
      ]
    }
  },
  "aggs": {
    "daily": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "hour"
      }
    }
  }
}
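// 7 days × 24 hours = 168 buckets - comfortably under the limit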

Solution 4: Use Sampler Aggregation

Get representative results from a sample:

{
  "aggs": {
    "sample": {
      "sampler": {
        "shard_size": 10000
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "category.keyword",
            "size": 100
          }
        }
      }
    }
  }
}
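// Sub-aggregations see at most shard_size (10,000) top-scoring documents per shard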

Solution 5: Increase the Limit (Use Carefully)

search.max_buckets is a dynamic cluster-wide setting; it cannot be overridden on individual search requests. A request like this one exceeds the default limit:

{
  "size": 0,
  "aggs": {
    "my_agg": {
      "terms": {
        "field": "category.keyword",
        "size": 100000
      }
    }
  }
}

To allow requests like this, raise the cluster-wide setting:

PUT /_cluster/settings
{
  "persistent": {
    "search.max_buckets": 100000
  }
}

Warning: Increasing this limit can cause memory issues and slow responses.

Solution 6: Pre-Aggregate with Transforms

Create summary indices for repeated aggregations:

PUT _transform/category_daily_summary
{
  "source": {
    "index": "events"
  },
  "dest": {
    "index": "category_daily_summary"
  },
  "pivot": {
    "group_by": {
      "date": {
        "date_histogram": {
          "field": "timestamp",
          "calendar_interval": "day"
        }
      },
      "category": {
        "terms": {
          "field": "category.keyword"
        }
      }
    },
    "aggregations": {
      "count": {"value_count": {"field": "_id"}},
      "total": {"sum": {"field": "value"}}
    }
  }
}

POST _transform/category_daily_summary/_start
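
Once the transform has populated the summary index, dashboards can aggregate over one document per category per day instead of the raw events. A minimal sketch, assuming the Python client and the field names produced by the pivot above:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust connection details for your cluster

# The pivot stores one document per (category, day), so this aggregation
# creates a few hundred buckets instead of millions on the raw events index
resp = es.search(
    index="category_daily_summary",
    size=0,
    aggs={
        "by_category": {
            "terms": {"field": "category", "size": 100},
            "aggs": {"total_value": {"sum": {"field": "total"}}},
        }
    },
)

for bucket in resp["aggregations"]["by_category"]["buckets"]:
    print(bucket["key"], bucket["total_value"]["value"])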

Optimizing Nested Aggregations

Problem: Bucket Explosion

{
  "aggs": {
    "countries": {
      "terms": {"field": "country", "size": 200},
      "aggs": {
        "cities": {
          "terms": {"field": "city", "size": 500},
          "aggs": {
            "daily": {
              "date_histogram": {
                "field": "timestamp",
                "calendar_interval": "day"
              }
            }
          }
        }
      }
    }
  }
}
// Potential: 200 × 500 × 365 = 36,500,000 buckets!

Solution: Flatten or Limit

{
  "aggs": {
    "countries": {
      "terms": {"field": "country", "size": 10},  // Limit
      "aggs": {
        "daily": {
          "date_histogram": {
            "field": "timestamp",
            "calendar_interval": "month"  // Coarser granularity
          }
        }
      }
    }
  }
}
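// Now: 10 countries × 12 months = 120 buckets for a year of data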

Best Practices

Calculate Before Querying

Estimate bucket count before running aggregations:

def estimate_buckets(agg_sizes):
    """Estimate total buckets from list of aggregation sizes"""
    total = 1
    for size in agg_sizes:
        total *= size
    return total

# Example
sizes = [100, 365, 24]  # categories, days, hours
print(f"Estimated buckets: {estimate_buckets(sizes)}")  # 876,000

Progressive Drilling

Start broad, then narrow:

  1. First query: Top-level summary
  2. User clicks: More detailed view
  3. Load full detail only on demand, not all upfront (see the sketch below)
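
A sketch of that flow with the Python client (the events index and field names are illustrative; in a real UI the second query would be driven by the user's selection):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust connection details for your cluster

# Step 1: broad summary - at most 100 category buckets
overview = es.search(
    index="events",
    size=0,
    aggs={"categories": {"terms": {"field": "category.keyword", "size": 100}}},
)

# Step 2: detail only for the category the user picked, so the fine-grained
# date histogram is computed for a single category rather than all of them
chosen = overview["aggregations"]["categories"]["buckets"][0]["key"]
detail = es.search(
    index="events",
    size=0,
    query={"term": {"category.keyword": chosen}},
    aggs={"daily": {"date_histogram": {"field": "timestamp",
                                       "calendar_interval": "day"}}},
)
print(len(detail["aggregations"]["daily"]["buckets"]), "daily buckets for", chosen)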

Monitor Aggregation Sizes

Track aggregation memory usage:

GET /_nodes/stats/indices/fielddata
GET /_nodes/stats/breaker
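// Watch the "request" and "fielddata" breakers for tripped counts and memory estimates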