The Elasticsearch sum aggregation is a single-value metric aggregation that adds up numeric values extracted from the documents in a bucket and returns the total. It works on any numeric field type (long, integer, double, float, scaled_float, half_float) and supports scripted values. It is the workhorse behind totals like revenue per category, bytes per host, and events per interval.
Syntax
GET /sales/_search
{
  "size": 0,
  "aggs": {
    "total_revenue": {
      "sum": {
        "field": "amount",
        "missing": 0
      }
    }
  }
}
The result is returned at aggregations.total_revenue.value. The value is always a double, even when the source field is an integer type.
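As a rough illustration of reading that value from client code, the sketch below parses a hand-written response body (the numbers are made up, not real cluster output). It also shows one consequence of the value being a double: whole numbers above 2^53 cannot all be represented exactly.

```python
import json

# Hypothetical response body, trimmed to the parts that matter here.
raw = (
    '{"took": 3, "hits": {"total": {"value": 4}}, '
    '"aggregations": {"total_revenue": {"value": 450.0}}}'
)

response = json.loads(raw)
total = response["aggregations"]["total_revenue"]["value"]
print(type(total).__name__, total)  # float 450.0

# A double has a 53-bit significand, so adding 1 to 2**53 is lost.
limit = float(2**53)
print(limit + 1 == limit)  # True
```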
Parameters
| Parameter | Default | Description |
|---|---|---|
| field | required (or script) | Numeric field whose values are summed. |
| missing | - | Substitute used when the field is absent. Without it, missing docs contribute nothing. |
| script | - | Use a Painless script. The script's emitted numeric value(s) are summed. |
| format | - | Numeric format string applied to value_as_string in the response. |
Examples
Total revenue across all documents matching a query:
GET /sales/_search
{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-30d" } } },
  "aggs": {
    "revenue_30d": { "sum": { "field": "amount" } }
  }
}
Total bytes per HTTP status, ordered by total descending:
"aggs": {
  "by_status": {
    "terms": {
      "field": "status",
      "size": 10,
      "order": { "bytes_total": "desc" }
    },
    "aggs": {
      "bytes_total": { "sum": { "field": "response.bytes" } }
    }
  }
}
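Scripted sum (apply a unit conversion at query time; the size() check skips documents that have no value for the field, which would otherwise make the script fail):
"sum": {
  "script": {
    "source": "doc['size_bytes'].size() == 0 ? 0.0 : doc['size_bytes'].value / 1024.0"
  }
}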
Performance and Precision Notes
The sum aggregation reads doc values for the target field and runs in linear time over the matched documents. The per-document cost is small, but the total grows with the number of matching documents and becomes significant at billions of docs. Nesting it inside date histogram, terms, or composite aggregations adds one running sum per bucket, so memory use and response size grow with the bucket count, which matters most for high-cardinality fields.
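Floating-point sums accumulate rounding error. For currency-style precision, store values as integers (cents) or as scaled_float with a fixed scaling_factor. A scaled_float field with scaling_factor: 100 stores 12.34 as 1234 internally, and the sum aggregation returns the un-scaled value (12.34). Note that aggregations read scaled_float values back as doubles and the sum itself is accumulated as a double, so scaled_float bounds the precision of each stored value rather than turning the addition into integer arithmetic. Summing a long field of minor units (cents) and dividing at display time is the more robust choice: whole-number doubles add exactly as long as the running total stays below 2^53, which avoids the typical 0.1 + 0.2 = 0.30000000000000004 issue.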
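The drift and the integer-cents workaround are easy to reproduce outside Elasticsearch. The following plain-Python sketch only illustrates double arithmetic on made-up amounts; it is not a model of how Elasticsearch accumulates sums internally.

```python
# Ten payments of 0.10, summed two ways.
amounts = [0.10] * 10

# Naive double accumulation (an explicit loop, so the result does not
# depend on how the built-in sum() is implemented).
float_total = 0.0
for a in amounts:
    float_total += a

# The same payments stored as integer cents, divided once at display time.
cents = [10] * 10
cents_total = 0
for c in cents:
    cents_total += c

print(float_total)        # 0.9999999999999999
print(cents_total / 100)  # 1.0
```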
Sum aggregations over runtime fields or scripted values cannot read a precomputed number straight from doc values; they execute the script for every matching document, which is typically far slower than summing an indexed numeric field. Pulse tracks slow aggregation patterns on Elasticsearch and OpenSearch and flags clusters where scripted sums are driving heap and CPU spikes.
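One way to avoid the per-document script is to do the conversion once, before indexing, and sum the stored field instead. A minimal sketch, assuming a client-side preparation step (an ingest pipeline could serve the same purpose); prepare_for_indexing and the size_kb field are hypothetical names chosen for illustration.

```python
def prepare_for_indexing(doc: dict) -> dict:
    """Return a copy of doc with a precomputed size_kb field.

    size_kb is a hypothetical field name; size_bytes is the field
    used by the scripted sum example earlier in this page.
    """
    enriched = dict(doc)
    if "size_bytes" in doc:
        enriched["size_kb"] = doc["size_bytes"] / 1024.0
    return enriched


# Made-up documents; the last one has no size_bytes field.
docs = [{"size_bytes": 2048}, {"size_bytes": 512}, {"host": "a"}]
prepared = [prepare_for_indexing(d) for d in docs]
print(prepared[0]["size_kb"])  # 2.0
```

A sum over the stored size_kb field then reads doc values directly, at the cost of a slightly larger index.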
Common Mistakes
- Running sum on a string field that "looks" numeric - the field must be mapped as a numeric type (scaled_float included). A mapping mismatch returns an error.
- Forgetting missing when computing averages from sum / value_count - documents lacking the field drop out of the denominator, which shifts the average if they were meant to count as zero.
- Summing floats in financial dashboards and getting fractional cent drift. Use scaled_float or integer cents.
- Adding scripted unit conversions per document instead of pre-computing them at index time.
- Relying on bucket doc_count as a sum - doc_count is a count of documents, not a total of a field.
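The effect of missing on a sum / value_count average, and the difference between doc_count and a sum, can be sketched with a small in-memory model. The functions below are plain Python stand-ins written for this illustration, not Elasticsearch APIs, and the documents are made up.

```python
from typing import Optional

# Hypothetical documents; the third has no "amount" field.
docs = [{"amount": 10.0}, {"amount": 20.0}, {"sku": "x"}]


def sum_agg(docs, field: str, missing: Optional[float] = None) -> float:
    """Model of sum: absent fields are skipped unless missing is given."""
    total = 0.0
    for d in docs:
        if field in d:
            total += d[field]
        elif missing is not None:
            total += missing
    return total


def value_count(docs, field: str, missing: Optional[float] = None) -> int:
    """Model of value_count: counts values, plus substitutes if missing is given."""
    return sum(1 for d in docs if field in d or missing is not None)


# Without missing: the third document is ignored by both aggregations.
avg_skipping = sum_agg(docs, "amount") / value_count(docs, "amount")
# With missing=0 on both: the third document counts as a zero.
avg_with_zero = sum_agg(docs, "amount", 0) / value_count(docs, "amount", 0)

print(avg_skipping)   # 15.0
print(avg_with_zero)  # 10.0
print(len(docs))      # 3 (the doc_count analogue), not the total of 30.0
```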
Frequently Asked Questions
Q: Can the sum aggregation handle currency values precisely?
A: Use scaled_float with an appropriate scaling_factor (e.g. 100 for cents) or, preferably, store values as integer minor units in a long field. The sum is accumulated and returned as a double, so integer minor units stay exact while the total is below 2^53; divide by the scaling factor at display time.
Q: How does sum handle missing or null values?
A: Documents without the field contribute zero unless missing is set, in which case the substitute value is added once per missing document.
Q: Can I sum across nested fields?
A: Yes - wrap the sum in a nested aggregation so it iterates the inner documents rather than treating the array as a flat field on the parent.
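A minimal sketch of such a request body, built as a Python dictionary so it can be serialized and sent with any HTTP client. The line_items path and the price field are assumptions about a hypothetical mapping, and the aggregation names are arbitrary.

```python
import json

# Hypothetical mapping: each order has a nested "line_items" array
# whose objects carry a numeric "price" field.
body = {
    "size": 0,
    "aggs": {
        "items": {
            "nested": {"path": "line_items"},
            "aggs": {
                "items_total": {"sum": {"field": "line_items.price"}}
            },
        }
    },
}

print(json.dumps(body, indent=2))
# The total would be read from aggregations.items.items_total.value
```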
Q: How is sum different from sum_bucket?
A: The sum aggregation totals raw field values across documents. The sum_bucket pipeline aggregation totals the output of another aggregation across its sibling buckets. Different inputs, different stages.
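The two stages can be mimicked in a few lines of plain Python. This is only a model of the data flow on made-up documents, not of the Elasticsearch implementation.

```python
from collections import defaultdict

# Hypothetical documents with integer amounts so the sums are exact.
docs = [
    {"category": "books", "amount": 12},
    {"category": "books", "amount": 8},
    {"category": "games", "amount": 30},
]

# Stage 1: a sum aggregation inside a terms aggregation totals raw
# field values per bucket.
per_bucket = defaultdict(int)
for d in docs:
    per_bucket[d["category"]] += d["amount"]

# Stage 2: sum_bucket totals the per-bucket outputs, not the documents.
sum_bucket = sum(per_bucket.values())

print(dict(per_bucket))  # {'books': 20, 'games': 30}
print(sum_bucket)        # 50
```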
Q: Does sum work on unsigned_long fields?
A: Yes, with a caveat: metric aggregations convert unsigned_long values to double, so the sum is returned as a double rather than as an unsigned_long. Totals above 2^53 lose integer precision instead of overflowing. For very large counters, expect rounding in the low-order digits.
Q: Is the sum aggregation deterministic across shards?
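A: For integer fields, yes, as long as the total stays below 2^53, because every partial sum is then exact. For floating-point fields (and scaled_float values, which aggregations read back as doubles), partial sums per shard are merged at the coordinator, and floating-point addition is not associative, so the last bits of the result can vary across runs.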
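The order dependence is a property of double arithmetic itself and can be seen with three numbers. The sketch below uses plain IEEE 754 addition on made-up partial sums; it is not a model of Elasticsearch's internal summation, which may reduce the effect.

```python
# Three per-shard partial sums (made-up values).
shard_a, shard_b, shard_c = 0.1, 0.2, 0.3

# The same partial sums merged in two different orders.
merged_abc = (shard_a + shard_b) + shard_c
merged_bca = (shard_b + shard_c) + shard_a

print(merged_abc)                # 0.6000000000000001
print(merged_bca)                # 0.6
print(merged_abc == merged_bca)  # False

# Integer partial sums merge to the same total in any order.
print((10 + 20) + 30 == (20 + 30) + 10)  # True
```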
Related Reading
- Sum Bucket Aggregation: total a metric across sibling buckets, not raw documents.
- Max Aggregation: largest value, complement to sum.
- Value Count Aggregation: count of values, denominator for an average.
- Date Histogram Aggregation: bucket time-series sums.
- Elasticsearch Query Language: the DSL sum runs inside.