The Extended Stats Aggregation is a multi-value metrics aggregation that provides a comprehensive set of statistical measures for numeric fields. It extends the basic stats aggregation by including additional metrics such as variance, standard deviation, and sum of squares.
Syntax
{
  "aggs": {
    "extended_stats_agg": {
      "extended_stats": {
        "field": "field_name"
      }
    }
  }
}
For more details, refer to the official Elasticsearch documentation.
Example Usage
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}
Because "size" is 0, the response contains no search hits, only the aggregation result. It returns extended statistical information about the "price" field, including count, min, max, avg, sum, sum_of_squares, variance, std_deviation, and std_deviation_bounds.
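To make the individual metrics concrete, here is a minimal Python sketch that computes the same kinds of values locally for a small list of numbers. It is an illustration, not Elasticsearch's implementation: the function name `extended_stats` and the sample prices are made up for this example, and the variance is assumed to be the population variance (divide by count).

```python
import math


def extended_stats(values):
    """Compute count, min, max, avg, sum, sum_of_squares, variance, std_deviation.

    Assumption: variance is the population variance (divided by count).
    """
    if not values:
        raise ValueError("values must not be empty")
    count = len(values)
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    avg = total / count
    # Population variance: mean of the squares minus the square of the mean.
    # max() guards against tiny negative results from floating-point rounding.
    variance = max(0.0, sum_of_squares / count - avg * avg)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Hypothetical sample data standing in for the "price" field.
prices = [10.0, 20.0, 30.0, 40.0]
stats = extended_stats(prices)
```

For these four prices the sum is 100, the sum of squares is 3000, the average is 25, and the variance is 3000 / 4 - 25² = 125.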
Common Issues
- Null values: By default, null values are ignored. Use "missing": 0 to treat null values as zeros.
- Non-numeric fields: Ensure the field is mapped as a numeric type.
- Performance: On large datasets, this aggregation can be computationally expensive.
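The choice between ignoring null values and treating them as zeros changes the results noticeably. The sketch below imitates both behaviors locally in Python; the helper `average` and the sample data are hypothetical, and `None` stands in for a document that has no value for the field.

```python
def average(values, missing=None):
    """Average a list in which None marks a document without a value.

    missing=None skips those documents (like the default behavior);
    any other value is substituted for them (like "missing": 0).
    """
    if missing is None:
        kept = [v for v in values if v is not None]
    else:
        kept = [missing if v is None else v for v in values]
    return sum(kept) / len(kept)


# Hypothetical data: two of the four documents have no price.
prices = [10.0, None, 30.0, None]

ignored = average(prices)             # nulls skipped: (10 + 30) / 2
as_zero = average(prices, missing=0)  # nulls counted as 0: (10 + 0 + 30 + 0) / 4
```

Counting nulls as zeros halves the average here, so "missing": 0 is only appropriate when zero is a meaningful value for the field.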
Best Practices
- Use the "sigma" parameter to set how many standard deviations from the mean the reported standard deviation bounds lie (the default is 2).
- Combine with other aggregations for more complex analysis.
- Consider using the stats aggregation if you don't need the extended metrics.
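As a sketch of where "sigma" goes in the request, the Python helper below builds a search body as a plain dictionary and prints it as JSON. The helper name `extended_stats_body` and its arguments are assumptions made for this example; sending the body to a cluster is left out.

```python
import json


def extended_stats_body(field, agg_name, sigma=None):
    """Build a search body with one extended_stats aggregation.

    "sigma" is only added when given, so the server default applies otherwise.
    """
    agg = {"field": field}
    if sigma is not None:
        agg["sigma"] = sigma
    return {"size": 0, "aggs": {agg_name: {"extended_stats": agg}}}


# Ask for bounds three standard deviations away from the mean.
body = extended_stats_body("price", "price_stats", sigma=3)
print(json.dumps(body, indent=2))
```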
Frequently Asked Questions
Q: How does Extended Stats Aggregation differ from regular Stats Aggregation?
A: Extended Stats Aggregation provides additional metrics like variance, standard deviation, and sum of squares, which are not available in the regular Stats Aggregation.
Q: Can I use Extended Stats Aggregation on non-numeric fields?
A: No, Extended Stats Aggregation is designed for numeric fields only. Attempting to use it on non-numeric fields will result in an error.
Q: How can I handle outliers in my data when using Extended Stats Aggregation?
A: You can use the "sigma" parameter to calculate standard deviation bounds, which can help identify outliers. Alternatively, consider using a filter or range aggregation before applying the extended stats.
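One way to picture the bounds-based approach: values outside avg ± sigma × std_deviation are flagged as candidate outliers. The Python sketch below computes the bounds locally rather than reading them from a response; the function names and the sample data are hypothetical, and the population standard deviation is assumed.

```python
import math


def std_deviation_bounds(values, sigma=2.0):
    """Return (lower, upper) = avg -/+ sigma * population standard deviation."""
    avg = sum(values) / len(values)
    variance = sum((v - avg) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    return avg - sigma * std, avg + sigma * std


def find_outliers(values, sigma=2.0):
    """Return the values that fall outside the standard deviation bounds."""
    lower, upper = std_deviation_bounds(values, sigma)
    return [v for v in values if v < lower or v > upper]


# Hypothetical prices with one obviously unusual value.
prices = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 12.0, 11.0, 100.0]
outliers = find_outliers(prices)
```

Note that an extreme value also inflates the standard deviation itself, so on small samples a large sigma may flag nothing at all.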
Q: Is it possible to run Extended Stats Aggregation on multiple fields simultaneously?
A: While you can't run it on multiple fields in a single aggregation, you can define multiple Extended Stats Aggregations side by side under "aggs" in the same query, each with its own name and each targeting a different field.
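A minimal sketch of such a multi-field request, built as a Python dictionary with one named aggregation per field. The helper name, the "<field>_stats" naming scheme, and the "quantity" field are assumptions for illustration only.

```python
def multi_field_extended_stats(fields):
    """Build a search body with one extended_stats aggregation per field.

    Each aggregation is named "<field>_stats" (an arbitrary naming choice).
    """
    aggs = {
        f"{field}_stats": {"extended_stats": {"field": field}}
        for field in fields
    }
    return {"size": 0, "aggs": aggs}


# "quantity" is a hypothetical second numeric field.
multi_body = multi_field_extended_stats(["price", "quantity"])
```

The response would then carry one result object per aggregation name, so each field's statistics can be read independently.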
Q: How does Extended Stats Aggregation handle missing values?
A: By default, missing values are ignored. You can use the "missing" parameter to specify a default value for missing fields, e.g., "missing": 0 to treat missing values as zeros.