The Sum Aggregation (sum) is a single-value metrics aggregation that calculates the sum of numeric values extracted from the aggregated documents.
Syntax
{
  "aggs": {
    "total_sales": {
      "sum": {
        "field": "sales"
      }
    }
  }
}
For more details, refer to the official Elasticsearch documentation on Sum Aggregation.
Example Usage
GET /sales/_search
{
  "size": 0,
  "aggs": {
    "total_revenue": {
      "sum": {
        "field": "price"
      }
    }
  }
}
This query calculates the sum of all "price" field values across documents in the "sales" index.
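The result is returned under the aggregation's name in the search response. As a rough sketch of the relevant fragment only (the number is invented for illustration, and the other top-level keys of the response are left out):

```json
{
  "aggregations": {
    "total_revenue": {
      "value": 1250.0
    }
  }
}
```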
Common Issues
- Non-numeric fields: Ensure the field you're summing contains numeric values.
- Missing values: By default, documents without the specified field are ignored. Use "missing": 0 to include them with a default value.
- Precision issues: For high-precision calculations, consider using scaled_float or double data types.
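The missing-value handling above can be sketched as a request body for the same GET /sales/_search endpoint. The default of 10 is an arbitrary placeholder chosen for illustration; a default of 0 would also include those documents but leave the total unchanged.

```json
{
  "size": 0,
  "aggs": {
    "total_revenue": {
      "sum": {
        "field": "price",
        "missing": 10
      }
    }
  }
}
```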
Best Practices
- Use the value_count metric alongside sum to calculate averages accurately.
- For large datasets, consider using sampling or date ranges to improve performance.
- Combine with other aggregations like terms or date_histogram for more insightful analytics.
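As one possible sketch of these practices together, the request body below groups documents with a terms aggregation and computes a sum and a value_count in each bucket. The field name category is hypothetical and is assumed to be a keyword field in the sales index.

```json
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "revenue": {
          "sum": {
            "field": "price"
          }
        },
        "priced_values": {
          "value_count": {
            "field": "price"
          }
        }
      }
    }
  }
}
```

Dividing revenue by priced_values on the client side then gives a per-category average based only on the values that were actually summed.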
Frequently Asked Questions
Q: Can Sum Aggregation be used on nested fields?
A: Yes, you can use Sum Aggregation on nested fields by wrapping it in a nested aggregation.
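A minimal sketch of that wrapping, assuming a hypothetical nested field named items whose objects each carry a numeric price:

```json
{
  "size": 0,
  "aggs": {
    "line_items": {
      "nested": {
        "path": "items"
      },
      "aggs": {
        "items_total": {
          "sum": {
            "field": "items.price"
          }
        }
      }
    }
  }
}
```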
Q: How does Sum Aggregation handle NaN or infinity values?
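A: They are not ignored. Elasticsearch numeric field types accept only finite values, so NaN and infinity cannot normally be indexed. If a non-finite value does reach the aggregation (for example, from a script), it propagates into the result instead of being skipped.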
Q: Is there a way to get the count of documents contributing to the sum?
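A: Yes. Run a value_count aggregation on the same field alongside the sum aggregation. It counts the values that were summed, which equals the number of contributing documents when the field is single-valued.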
Q: Can Sum Aggregation be used with a script?
A: Yes, you can use scripting with Sum Aggregation to sum a value computed per document instead of a stored field.
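As a hedged sketch, the request body below sums a value computed per document by a Painless script. The quantity field is hypothetical, and the script assumes every matching document has both price and quantity. Recent Elasticsearch documentation tends to show runtime fields for this kind of calculation, so check which form your version recommends.

```json
{
  "size": 0,
  "aggs": {
    "total_order_value": {
      "sum": {
        "script": {
          "source": "doc['price'].value * doc['quantity'].value"
        }
      }
    }
  }
}
```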
Q: How does Sum Aggregation perform on large datasets?
A: Sum Aggregation is generally efficient. For very large datasets, consider restricting the query to a date range, or sampling if an approximate result is acceptable.
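One way to narrow the work is a range query in front of the aggregation, so only recent documents are summed. This is a sketch only: the date field order_date and the 30-day window are assumptions for illustration.

```json
{
  "size": 0,
  "query": {
    "range": {
      "order_date": {
        "gte": "now-30d/d"
      }
    }
  },
  "aggs": {
    "recent_revenue": {
      "sum": {
        "field": "price"
      }
    }
  }
}
```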