Elasticsearch Extended Stats Aggregation - Syntax, Example, and Tips

The Extended Stats Aggregation is a multi-value metrics aggregation that provides a comprehensive set of statistical measures for numeric fields. It extends the basic stats aggregation by including additional metrics such as variance, standard deviation, and sum of squares.

Syntax

{
  "aggs": {
    "extended_stats_agg": {
      "extended_stats": {
        "field": "field_name"
      }
    }
  }
}

For more details, refer to the official Elasticsearch documentation.

Example Usage

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price"
      }
    }
  }
}

This query returns extended statistics for the "price" field: count, min, max, avg, sum, sum_of_squares, variance, std_deviation, and std_deviation_bounds. Setting "size": 0 suppresses the search hits so that only the aggregation results are returned.
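As an illustration of the response shape, suppose the index held just three documents with hypothetical prices 10, 20, and 30. The aggregation section of the response would then look roughly like this (values are rounded for readability, and recent Elasticsearch versions also return separate population and sampling variants of the variance and standard deviation, which are omitted here):

```json
{
  "aggregations": {
    "price_stats": {
      "count": 3,
      "min": 10.0,
      "max": 30.0,
      "avg": 20.0,
      "sum": 60.0,
      "sum_of_squares": 1400.0,
      "variance": 66.667,
      "std_deviation": 8.165,
      "std_deviation_bounds": {
        "upper": 36.33,
        "lower": 3.67
      }
    }
  }
}
```

In this sketch, variance is the population variance (sum_of_squares / count - avg squared = 1400 / 3 - 400, about 66.667), and the bounds are avg plus or minus 2 times std_deviation, because sigma defaults to 2.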

Common Issues

  1. Missing values: By default, documents that have no value for the field are ignored. Use "missing": 0 to treat them as zeros.
  2. Non-numeric fields: Ensure the field is mapped as a numeric type.
  3. Performance: The aggregation reads the field value of every document matched by the query, so it can be slow on large datasets. Narrow the query where possible.
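For the missing-value case in item 1, a request body along these lines (sent to the same GET /my_index/_search endpoint as the example above) would count documents without a "price" as if their price were 0:

```json
{
  "size": 0,
  "aggs": {
    "price_stats": {
      "extended_stats": {
        "field": "price",
        "missing": 0
      }
    }
  }
}
```

Keep in mind that substituting 0 raises count and can pull min and avg down, so pick a substitute value that makes sense for the data, or leave "missing" unset to skip those documents.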

Best Practices

  1. Use the "sigma" parameter to control how many standard deviations from the mean the std_deviation_bounds are placed (the default is 2).
  2. Combine with other aggregations for more complex analysis.
  3. Consider using stats aggregation if you don't need the extended metrics.
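The first two practices can be combined in one request. The sketch below assumes a hypothetical keyword field named "category" and computes extended stats per category, with the bounds widened to three standard deviations:

```json
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "category" },
      "aggs": {
        "price_stats": {
          "extended_stats": {
            "field": "price",
            "sigma": 3
          }
        }
      }
    }
  }
}
```

Each bucket returned by the terms aggregation then carries its own "price_stats" object, so the spread of prices can be compared across categories.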

Frequently Asked Questions

Q: How does Extended Stats Aggregation differ from regular Stats Aggregation?
A: Extended Stats Aggregation provides additional metrics like variance, standard deviation, and sum of squares, which are not available in the regular Stats Aggregation.

Q: Can I use Extended Stats Aggregation on non-numeric fields?
A: No, Extended Stats Aggregation is designed for numeric fields only. Attempting to use it on non-numeric fields will result in an error.

Q: How can I handle outliers in my data when using Extended Stats Aggregation?
A: You can use the "sigma" parameter to calculate standard deviation bounds, which can help identify outliers. Alternatively, consider using a filter or range aggregation before applying the extended stats.
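As a sketch of the filtering approach, the request body below wraps the extended stats in a filter aggregation so that only prices inside a range contribute. The bounds 1 and 1000 are illustrative assumptions, not recommended values:

```json
{
  "size": 0,
  "aggs": {
    "typical_prices": {
      "filter": {
        "range": { "price": { "gte": 1, "lte": 1000 } }
      },
      "aggs": {
        "price_stats": {
          "extended_stats": { "field": "price" }
        }
      }
    }
  }
}
```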

Q: Is it possible to run Extended Stats Aggregation on multiple fields simultaneously?
A: A single Extended Stats Aggregation targets one field, but you can define several of them side by side under the same "aggs" object in one request, each with its own name and target field.
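For example, a request body like the following computes extended stats for two fields in one search. The "quantity" field is a hypothetical second numeric field used only for illustration:

```json
{
  "size": 0,
  "aggs": {
    "price_stats": { "extended_stats": { "field": "price" } },
    "quantity_stats": { "extended_stats": { "field": "quantity" } }
  }
}
```

The response then contains one result object per aggregation name, "price_stats" and "quantity_stats".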

Q: How does Extended Stats Aggregation handle missing values?
A: By default, missing values are ignored. You can use the "missing" parameter to specify a default value for missing fields, e.g., "missing": 0 to treat missing values as zeros.
