The Percentiles Bucket Aggregation is a sibling pipeline aggregation that calculates percentiles across all buckets of a specified metric in a sibling multi-bucket aggregation. The metric must be numeric. It shows how values are distributed across buckets, which is useful for distribution analysis and outlier detection.
Syntax
{
  "percentiles_bucket": {
    "buckets_path": "string",
    "percents": [number],
    "gap_policy": "string",
    "format": "string",
    "keyed": boolean
  }
}
For detailed information, refer to the official Elasticsearch documentation.
Example Usage
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    },
    "percentiles_monthly_sales": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>sales",
        "percents": [25, 50, 75]
      }
    }
  }
}
This example calculates the 25th, 50th, and 75th percentiles of monthly sales across all buckets.
Common Issues
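- Incorrect buckets_path: Ensure the path points to the metric inside the sibling multi-bucket aggregation, for example "sales_per_month>sales".
- Missing data: Buckets in which the metric is missing or null are skipped by default, which changes the set of values the percentiles are computed over.
- Performance impact: The percentiles are computed exactly from a sorted in-memory list of the bucket values, so a very large number of buckets can be resource-intensive.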
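As an illustration of the path syntax, the sketch below shows two less obvious forms of buckets_path: a dot selects one value from a multi-value metric, and the special _count path uses each bucket's document count instead of a metric. The aggregation names price_stats, percentiles_avg_price, and percentiles_doc_count are hypothetical and chosen only for this example.

```json
{
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "price_stats": {
          "stats": {
            "field": "price"
          }
        }
      }
    },
    "percentiles_avg_price": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>price_stats.avg",
        "percents": [25, 50, 75]
      }
    },
    "percentiles_doc_count": {
      "percentiles_bucket": {
        "buckets_path": "sales_per_month>_count",
        "percents": [25, 50, 75]
      }
    }
  }
}
```

If a path segment does not match an aggregation name exactly, the request is expected to fail with a validation error, so checking names against the request body is usually the quickest fix.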
Best Practices
- Use meaningful percentile values based on your data distribution and analysis needs.
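- Use the keyed parameter to choose the output shape. In current Elasticsearch versions it defaults to true, which returns the percentiles as an object keyed by percentile; set it to false to get an array of key/value pairs.
- Combine with other aggregations for comprehensive analysis.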
- Monitor performance when used on large datasets or with many buckets.
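A minimal sketch of the keyed option, assuming the same sales_per_month aggregation as in Example Usage. Only the pipeline entry is shown; it would replace the entry of the same name in that request.

```json
{
  "percentiles_monthly_sales": {
    "percentiles_bucket": {
      "buckets_path": "sales_per_month>sales",
      "percents": [25, 50, 75],
      "keyed": false
    }
  }
}
```

With "keyed": false the values should come back as an array of objects with a key and a value, which can be easier to iterate over in client code than an object keyed by percentile.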
Frequently Asked Questions
Q: How does the Percentiles Bucket Aggregation differ from the regular Percentiles Aggregation?
A: The Percentiles Bucket Aggregation is a pipeline aggregation: it calculates percentiles over the per-bucket results of another aggregation. The regular Percentiles Aggregation is a metric aggregation: it computes percentiles directly over the field values of the documents it is applied to.
Q: Can I use custom percentile values?
A: Yes, you can specify custom percentile values using the percents parameter. For example, "percents": [10, 30, 70, 90].
Q: How does the aggregation handle missing values?
A: By default, buckets with missing values are ignored (the skip gap policy). You can use the gap_policy parameter to control how missing values are handled.
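For illustration, here is a sketch that treats empty months as zero sales instead of skipping them, assuming the sales_per_month aggregation from Example Usage. Only the pipeline entry is shown. The commonly documented values for gap_policy are skip and insert_zeros; newer versions also list keep_values, so check the reference for your version.

```json
{
  "percentiles_monthly_sales": {
    "percentiles_bucket": {
      "buckets_path": "sales_per_month>sales",
      "percents": [25, 50, 75],
      "gap_policy": "insert_zeros"
    }
  }
}
```

Inserting zeros pulls the lower percentiles down when many buckets are empty, so the right policy depends on whether an empty bucket means "no sales" or "no data".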
Q: Is it possible to format the output of Percentiles Bucket Aggregation?
A: Yes, you can use the format parameter to specify a DecimalFormat pattern for the output values, such as "format": "0.00%" for percentage representation.
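A small sketch of the format parameter, again assuming the sales_per_month aggregation from Example Usage and showing only the pipeline entry. The pattern "#,##0.00" is an arbitrary choice for this example: it adds a thousands separator and two decimal places.

```json
{
  "percentiles_monthly_sales": {
    "percentiles_bucket": {
      "buckets_path": "sales_per_month>sales",
      "percents": [25, 50, 75],
      "format": "#,##0.00"
    }
  }
}
```

The formatted strings are expected to be returned alongside the numeric values rather than replacing them. Note that in a DecimalFormat pattern the % symbol multiplies the value by 100 before display, so "0.00%" only makes sense when the metric is a ratio.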
Q: Can Percentiles Bucket Aggregation be used with nested aggregations?
A: Yes, it can be used with nested aggregations by specifying the correct buckets_path to reach the desired metric within the nested structure.
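One way this can look is sketched below: the pipeline aggregation is placed inside a terms aggregation, next to the date histogram it reads from, so that percentiles of monthly sales are computed separately for each term. The by_category name and the category field are hypothetical. The buckets_path is relative to where the pipeline aggregation sits, so it stays "sales_per_month>sales".

```json
{
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "sales_per_month": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "month"
          },
          "aggs": {
            "sales": {
              "sum": {
                "field": "price"
              }
            }
          }
        },
        "percentiles_monthly_sales": {
          "percentiles_bucket": {
            "buckets_path": "sales_per_month>sales",
            "percents": [25, 50, 75]
          }
        }
      }
    }
  }
}
```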