The Extended Stats Bucket Aggregation is a sibling pipeline aggregation that calculates extended statistics across all buckets of a specified metric in a sibling aggregation. The metric must be numeric, and the sibling aggregation must be a multi-bucket aggregation. It returns a comprehensive set of statistical measures, including count, min, max, avg, sum, sum_of_squares, variance, std_deviation, and std_deviation_bounds.
Syntax
{
  "extended_stats_bucket": {
    "buckets_path": "string"
  }
}
For detailed syntax and options, refer to the official Elasticsearch documentation.
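Besides the required buckets_path, the aggregation accepts a few optional parameters. The sketch below sets each one explicitly. gap_policy and sigma are shown with their documented defaults (worth confirming for your Elasticsearch version), while the format pattern and the sales_per_month>sales path are illustrative choices only.

```json
{
  "extended_stats_bucket": {
    "buckets_path": "sales_per_month>sales",
    "gap_policy": "skip",
    "sigma": 2,
    "format": "0.00"
  }
}
```

Here sigma controls how many standard deviations above and below the mean std_deviation_bounds reports, and format takes a decimal pattern that adds formatted string versions of the values to the response.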
Example Usage
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "sales_stats": {
      "extended_stats_bucket": {
        "buckets_path": "sales_per_month>sales"
      }
    }
  }
}
This example sums price for each calendar month, then calculates extended statistics across those monthly totals.
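To make the output concrete, suppose the date histogram produced only two monthly buckets whose sales sums were 100 and 300. These are hypothetical values, chosen so the arithmetic is easy to check. The sales_stats part of the response would then look roughly like the following. Newer Elasticsearch versions return additional fields (for example population and sampling variants of the variance), which are left out of this sketch.

```json
{
  "sales_stats": {
    "count": 2,
    "min": 100.0,
    "max": 300.0,
    "avg": 200.0,
    "sum": 400.0,
    "sum_of_squares": 100000.0,
    "variance": 10000.0,
    "std_deviation": 100.0,
    "std_deviation_bounds": {
      "upper": 400.0,
      "lower": 0.0
    }
  }
}
```

The variance is the population variance, ((100 - 200)^2 + (300 - 200)^2) / 2 = 10000, and the bounds are avg plus or minus 2 times std_deviation, assuming the default sigma of 2.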
Common Issues
- Incorrect buckets_path: Ensure the path correctly points to the metric inside the sibling multi-bucket aggregation.
- Non-numeric data: The aggregation works only on numeric values.
- Empty buckets: Decide how buckets with no data should be treated; the gap_policy parameter controls this.
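A frequent source of buckets_path errors is a metric that returns several values. In that case the path needs a dot-separated value name after the metric's aggregation name. A sketch, assuming a hypothetical stats sub-aggregation named price_stats on the same date and price fields used in the example above:

```json
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "price_stats": {
          "stats": {
            "field": "price"
          }
        }
      }
    },
    "avg_price_stats": {
      "extended_stats_bucket": {
        "buckets_path": "sales_per_month>price_stats.avg"
      }
    }
  }
}
```

In the path, > separates aggregation names and . selects one value of a multi-value metric, so this request computes statistics over each month's average price.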
 
Best Practices
- Use extended_stats_bucket when you need a comprehensive statistical overview.
- Combine with other aggregations for more complex analyses.
- Consider using gap_policy to handle missing data points.
- Be mindful of performance impact on large datasets.
 
Frequently Asked Questions
Q: How does Extended Stats Bucket Aggregation differ from regular Stats Aggregation? 
A: Extended Stats Bucket Aggregation is a pipeline aggregation that operates on the results of other aggregations, while regular Stats Aggregation is a metric aggregation that works directly on document fields. The extended variants also add sum_of_squares, variance, std_deviation, and std_deviation_bounds to the basic count, min, max, avg, and sum.
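For contrast, here is a sketch of the metric counterpart, extended_stats, which reads the price field of each matching document directly instead of reading bucket values. The aggregation name price_stats is hypothetical.

```json
{
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
```

Note that it takes a field rather than a buckets_path, and needs no sibling multi-bucket aggregation.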
Q: Can I use Extended Stats Bucket Aggregation with non-numeric data? 
A: No, Extended Stats Bucket Aggregation only works with numeric data. Attempting to use it with non-numeric data will result in an error.
Q: How can I handle missing values in Extended Stats Bucket Aggregation? 
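A: Use the gap_policy parameter to specify how gaps (buckets with no data or no value) are handled. The options are "skip" (the default), which ignores those buckets, and "insert_zeros", which treats them as 0. Recent Elasticsearch versions also accept "keep_values".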
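As a sketch, the request from the Example Usage section can be changed so that months without data count as zero instead of being skipped. The field and aggregation names are those of that example; only gap_policy is new.

```json
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "sales_stats": {
      "extended_stats_bucket": {
        "buckets_path": "sales_per_month>sales",
        "gap_policy": "insert_zeros"
      }
    }
  }
}
```

With insert_zeros, empty months are included in count and pull min and avg toward zero; with skip, they are left out of every statistic.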
Q: Is there a performance impact when using Extended Stats Bucket Aggregation? 
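A: It is generally efficient, because it works on the bucket values produced by the sibling aggregation rather than on documents, so its own cost grows with the number of buckets. The underlying bucket and metric aggregations usually dominate, especially on very large datasets or in complex nested aggregations. Monitor your cluster's performance and optimize as needed.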
Q: Can Extended Stats Bucket Aggregation be used in combination with other aggregations? 
A: Yes, it's often used in combination with other aggregations like date_histogram or terms aggregations to provide statistical insights across different dimensions of your data.
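A sketch of the same pattern over a terms aggregation, assuming a hypothetical keyword field named category alongside the price field; the aggregation names are also illustrative.

```json
{
  "aggs": {
    "sales_per_category": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "category_sales_stats": {
      "extended_stats_bucket": {
        "buckets_path": "sales_per_category>sales"
      }
    }
  }
}
```

The statistics cover only the buckets that the terms aggregation returns (the top 10 terms by default), so increase its size setting if more categories should be included.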