Elasticsearch Composite Aggregation - Syntax, Example, and Tips

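The composite aggregation is a multi-bucket aggregation that builds composite buckets from one or more value sources (terms, histogram, date_histogram, or geotile_grid). Each bucket represents one combination of source values. Buckets are returned in a stable, sorted order, so a client can page through all of them with the after parameter instead of requesting every bucket in a single response.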

Syntax

{
  "composite": {
    "size": 10,
    "sources": [
      { "SOURCE_1": { SOURCE_1_DEFINITION } },
      { "SOURCE_2": { SOURCE_2_DEFINITION } },
      ...
    ]
  }
}
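
The template above can also be assembled programmatically. The following Python sketch is only an illustration: the helper name composite_agg and its (name, source type, definition) triples are assumptions made for this example, not part of any Elasticsearch client.

```python
from typing import Any, Optional


def composite_agg(
    sources: list[tuple[str, str, dict[str, Any]]],
    size: int = 10,
    after: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a composite aggregation body from (name, source_type, definition) triples."""
    if not sources:
        raise ValueError("a composite aggregation needs at least one source")
    names = [name for name, _, _ in sources]
    # Source names become the keys of every bucket, so they have to be unique.
    if len(set(names)) != len(names):
        raise ValueError("source names must be unique")
    body: dict[str, Any] = {
        "size": size,
        # Each source is a single-key object: {name: {source_type: definition}}.
        "sources": [{name: {stype: dict(defn)}} for name, stype, defn in sources],
    }
    if after is not None:
        # `after` is only needed when requesting a page other than the first.
        body["after"] = dict(after)
    return {"composite": body}


agg = composite_agg(
    [
        ("date", "date_histogram", {"field": "date", "calendar_interval": "1d"}),
        ("product", "terms", {"field": "product"}),
    ]
)
```

The resulting dict has the same shape as the JSON template and can be placed under a named entry in the aggs section of a search request.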

For detailed syntax and options, refer to the official Elasticsearch documentation.

Example Usage

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_by_date_product": {
      "composite": {
        "size": 10,
        "sources": [
          { "date": { "date_histogram": { "field": "date", "calendar_interval": "1d" } } },
          { "product": { "terms": { "field": "product" } } }
        ]
      }
    }
  }
}

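This request creates one bucket for each combination of calendar day and product value found in the sales index. It returns the first 10 buckets, sorted by date and then by product, and "size": 0 suppresses the search hits so that only aggregation results come back.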
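
To show how a client might consume the result, here is a minimal Python sketch that flattens the buckets of a composite response into rows. The helper name iter_rows is hypothetical and the sample response is made up for illustration; it only mirrors the general shape of a composite response (a buckets list whose entries carry a key object and a doc_count, plus an after_key).

```python
from typing import Any, Iterator


def iter_rows(response: dict[str, Any], agg_name: str) -> Iterator[dict[str, Any]]:
    """Yield one flat dict per composite bucket: the key fields plus doc_count."""
    for bucket in response["aggregations"][agg_name]["buckets"]:
        row = dict(bucket["key"])  # one entry per source name
        row["doc_count"] = bucket["doc_count"]
        yield row


# Made-up response fragment; date_histogram keys are epoch milliseconds
# unless a format is set on the source.
sample_response = {
    "aggregations": {
        "sales_by_date_product": {
            "after_key": {"date": 1704153600000, "product": "widget"},
            "buckets": [
                {"key": {"date": 1704067200000, "product": "gadget"}, "doc_count": 3},
                {"key": {"date": 1704153600000, "product": "widget"}, "doc_count": 5},
            ],
        }
    }
}

rows = list(iter_rows(sample_response, "sales_by_date_product"))
```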

Common Issues

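Exceeding the search.max_buckets limit (10,000 in early 7.x releases; 65,536 by default in more recent versions)
Slow performance on large datasets, particularly when the index is not sorted in an order that matches the sources
Unexpected bucket order or broken pagination: buckets are sorted by the sources in the order they are listed, so the source list must stay the same across every page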

Best Practices

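Set the size parameter to the number of buckets you want per page; it controls the page size, not the total number of buckets
If the order of sources does not matter for your use case, follow the Elasticsearch documentation: put the highest-cardinality fields first, match the index sort order where one exists, and put multi-valued fields last
Paginate by passing the after_key from each response as the after parameter of the next request
Set missing_bucket: true on a source to include documents that have no value for it; by default such documents are ignored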
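
As an illustration of the missing_bucket option, the following Python sketch enables it on every source of an existing source list. When the option is on, documents without a value land in a bucket whose key for that source is null (None once parsed in Python). The helper names with_missing_bucket and label are assumptions made for this example.

```python
from typing import Any


def with_missing_bucket(sources: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of `sources` with missing_bucket enabled on every source."""
    updated = []
    for source in sources:
        # Each source is a single-key object: {name: {source_type: definition}}.
        ((name, spec),) = source.items()
        ((source_type, definition),) = spec.items()
        updated.append({name: {source_type: {**definition, "missing_bucket": True}}})
    return updated


def label(value: Any, placeholder: str = "(missing)") -> str:
    """Render a bucket key value, substituting a placeholder for the null key."""
    return placeholder if value is None else str(value)


sources = with_missing_bucket([{"product": {"terms": {"field": "product"}}}])
```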

Frequently Asked Questions

Q: How does the Composite Aggregation differ from other multi-bucket aggregations?
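A: A terms aggregation returns only the top buckets, whereas the composite aggregation returns buckets in a deterministic key order and lets you page through every bucket with the after parameter. This makes it well suited to exporting all value combinations from a large dataset. The trade-off is that buckets are ordered by their keys only, not by doc_count or by a sub-aggregation.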

Q: Can I use script-based sources in a Composite Aggregation?
A: Yes, you can use script-based sources in addition to field-based sources in a Composite Aggregation.

Q: How can I implement pagination with Composite Aggregation?
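A: Each response includes an after_key. Pass it as the after parameter of the next request to fetch the next page, and repeat until a response contains no buckets. The after_key is usually the key of the last bucket returned, but this is not guaranteed, so always use the returned after_key rather than deriving it from the buckets.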
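
The pagination loop can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: search stands for any callable that takes an index name and a request body and returns the parsed response, and make_fake_search is a made-up in-memory stand-in so the example runs without a cluster.

```python
from typing import Any, Callable, Iterator

SearchFn = Callable[[str, dict[str, Any]], dict[str, Any]]


def iter_composite_buckets(
    search: SearchFn,
    index: str,
    agg_name: str,
    sources: list[dict[str, Any]],
    page_size: int = 1000,
) -> Iterator[dict[str, Any]]:
    """Yield every bucket of a composite aggregation, one page at a time."""
    composite: dict[str, Any] = {"size": page_size, "sources": sources}
    while True:
        body = {"size": 0, "aggs": {agg_name: {"composite": composite}}}
        result = search(index, body)["aggregations"][agg_name]
        yield from result["buckets"]
        after_key = result.get("after_key")
        # Stop on an empty page or when no after_key is returned.
        if not result["buckets"] or after_key is None:
            return
        # Resume from the after_key the server returned, keeping the sources unchanged.
        composite = {**composite, "after": after_key}


def make_fake_search(all_keys: list[dict[str, Any]]) -> SearchFn:
    """In-memory stand-in for a cluster; `all_keys` must already be sorted."""

    def search(index: str, body: dict[str, Any]) -> dict[str, Any]:
        ((name, agg),) = body["aggs"].items()
        comp = agg["composite"]
        start = all_keys.index(comp["after"]) + 1 if "after" in comp else 0
        page = all_keys[start : start + comp["size"]]
        result: dict[str, Any] = {"buckets": [{"key": k, "doc_count": 1} for k in page]}
        if page:
            result["after_key"] = page[-1]
        return {"aggregations": {name: result}}

    return search


fake = make_fake_search([{"product": p} for p in ["a", "b", "c", "d", "e"]])
buckets = list(
    iter_composite_buckets(
        fake, "sales", "by_product", [{"product": {"terms": {"field": "product"}}}], page_size=2
    )
)
```

With a real cluster, search would be a thin wrapper around the client's search call; the exact keyword arguments depend on the client version in use.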

Q: Is there a limit to the number of sources I can use in a Composite Aggregation?
A: While there's no hard limit, it's recommended to keep the number of sources reasonable (typically under 10) for performance reasons.

Q: Can Composite Aggregation be used with nested fields?
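A: Yes. Composite sources have no nested path option of their own; instead, wrap the composite aggregation in a nested aggregation that sets the path, and reference the nested fields by their full names in the sources.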
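
One way to combine the two is to place the composite aggregation as a sub-aggregation of a nested aggregation, which Elasticsearch versions with support for a nested parent accept. The Python sketch below builds such a request body; the field names (items, items.product) and the aggregation names (by_nested, by_key) are hypothetical.

```python
from typing import Any


def nested_composite_request(
    path: str,
    sources: list[dict[str, Any]],
    size: int = 10,
) -> dict[str, Any]:
    """Build a search body that runs a composite aggregation inside a nested aggregation."""
    return {
        "size": 0,
        "aggs": {
            "by_nested": {
                # The nested aggregation switches the context to the nested documents.
                "nested": {"path": path},
                "aggs": {
                    "by_key": {"composite": {"size": size, "sources": sources}}
                },
            }
        },
    }


# Hypothetical mapping: `items` is a nested field with a keyword sub-field `items.product`.
request = nested_composite_request(
    "items", [{"product": {"terms": {"field": "items.product"}}}]
)
```

In the response, the buckets and the after_key then sit one level deeper, under aggregations.by_nested.by_key.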