Elasticsearch Stats Bucket Aggregation - Syntax, Example, and Tips

The Stats Bucket Aggregation in Elasticsearch is a sibling pipeline aggregation that calculates statistics over the values of a specified metric across all buckets of a sibling multi-bucket aggregation. It returns the min, max, sum, count, and avg of that metric across the buckets.

Syntax

{
  "stats_bucket_agg_name": {
    "stats_bucket": {
      "buckets_path": "path_to_metric"
    }
  }
}

For more details, refer to the official Elasticsearch documentation.

Example Usage

{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "stats_monthly_sales": {
      "stats_bucket": {
        "buckets_path": "sales_per_month>sales"
      }
    }
  }
}

This example sums the price field for each calendar month, then calculates stats (min, max, sum, count, avg) over those monthly totals across all buckets.
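To make the result concrete, here is a minimal Python sketch of the arithmetic the aggregation performs over the per-bucket values. It does not call Elasticsearch. The function name stats_over_buckets and the monthly totals are hypothetical, chosen only for illustration, and the handling of an empty list is a choice made for this sketch rather than confirmed server behavior.

```python
def stats_over_buckets(values):
    """Compute count, min, max, avg, and sum over a list of per-bucket metric values."""
    if not values:
        # Placeholder result for an empty input (a choice of this sketch).
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


# Hypothetical monthly "sales" sums, one value per date_histogram bucket.
monthly_sales = [550.0, 60.0, 375.0]
result = stats_over_buckets(monthly_sales)
print(result)
```

Each input value stands for one bucket's sales sum, so count is the number of buckets that contributed, not the number of documents.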

Common Issues

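Incorrect buckets_path: Ensure the path names the multi-bucket aggregation and the metric inside it, separated by >, as in sales_per_month>sales.
Missing sibling aggregation: The stats bucket aggregation requires a sibling multi-bucket aggregation at the same level, referenced through buckets_path.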
Non-numeric metric: The specified metric must be numeric for stats calculation.

Best Practices

Use meaningful names for your aggregations to improve readability.
Combine with other aggregations for more complex analysis.
Consider using extended_stats_bucket for additional statistical measures like standard deviation.
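As a rough illustration of the last tip, the sketch below builds the request body from the example in Python and swaps the pipeline type. The helper name build_sales_request is hypothetical, and the field and aggregation names are taken from the example above. It only constructs and prints JSON; sending it to a cluster is left out.

```python
import json


def build_sales_request(pipeline_type="stats_bucket"):
    """Build the monthly-sales request body with the given sibling pipeline type."""
    return {
        "aggs": {
            "sales_per_month": {
                "date_histogram": {"field": "date", "calendar_interval": "month"},
                "aggs": {"sales": {"sum": {"field": "price"}}},
            },
            # The pipeline aggregation sits next to sales_per_month, not inside it.
            "stats_monthly_sales": {
                pipeline_type: {"buckets_path": "sales_per_month>sales"}
            },
        }
    }


# Same request as the example, but asking for the extended statistics.
body = build_sales_request("extended_stats_bucket")
print(json.dumps(body, indent=2))
```

Only the aggregation type changes; the buckets_path stays the same because both variants read the same per-bucket metric.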

Frequently Asked Questions

Q: Can I use stats bucket aggregation on non-numeric fields?
A: No, stats bucket aggregation only works on numeric metrics. For non-numeric fields, consider using other appropriate aggregations.

Q: How does stats bucket aggregation differ from regular stats aggregation?
A: Stats bucket aggregation operates on the per-bucket metric values produced by a sibling multi-bucket aggregation, while regular stats aggregation works directly on document fields.

Q: Can I exclude certain statistics from the result?
A: The basic stats bucket aggregation doesn't allow excluding specific stats. For more control, consider using separate metric aggregations.

Q: Is it possible to use stats bucket aggregation with nested aggregations?
A: Yes, you can use stats bucket aggregation with nested aggregations by properly specifying the buckets_path, separating each aggregation level with >, as in sales_per_month>sales.

Q: How does stats bucket aggregation handle null or missing values?
A: By default, buckets with null or missing values are ignored in the calculation (gap_policy set to skip). You can modify this behavior using the gap_policy parameter, for example insert_zeros to treat missing values as zero.
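The effect of gap_policy can be sketched in plain Python. This is a simplified model, not the server's implementation: None stands for a bucket whose metric is missing, only the skip and insert_zeros policies are modeled, and the function name and sample values are hypothetical.

```python
def apply_gap_policy(values, gap_policy="skip"):
    """Return the per-bucket values that would feed the statistics under a gap policy."""
    if gap_policy == "skip":
        # Missing buckets are dropped, as if they did not exist.
        return [v for v in values if v is not None]
    if gap_policy == "insert_zeros":
        # Missing buckets are kept and counted as zero.
        return [0.0 if v is None else v for v in values]
    raise ValueError(f"gap policy not modeled in this sketch: {gap_policy}")


# Hypothetical per-bucket metric values; the middle bucket has no value.
bucket_values = [100.0, None, 200.0]

skipped = apply_gap_policy(bucket_values, "skip")
zero_filled = apply_gap_policy(bucket_values, "insert_zeros")

print(len(skipped), sum(skipped) / len(skipped))
print(len(zero_filled), sum(zero_filled) / len(zero_filled))
```

With skip, the count and average reflect only the buckets that have a value; with insert_zeros, the gap adds to the count and pulls the average and minimum down.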