Elasticsearch Multi Terms Aggregation - Syntax, Example, and Tips

This aggregation groups documents into buckets based on unique combinations of terms across multiple specified fields. It's particularly useful when you need to analyze data across multiple dimensions simultaneously.

Syntax and Documentation

{
  "aggs": {
    "my_multi_terms_agg": {
      "multi_terms": {
        "terms": [
          { "field": "field1" },
          { "field": "field2" }
        ]
      }
    }
  }
}

For detailed information, refer to the official Elasticsearch documentation on Multi Terms Aggregation.

Example Usage

Consider an e-commerce dataset where you want to analyze sales by both product category and customer location:

{
  "aggs": {
    "sales_by_category_and_location": {
      "multi_terms": {
        "terms": [
          { "field": "product.category" },
          { "field": "customer.country" }
        ]
      },
      "aggs": {
        "total_sales": {
          "sum": { "field": "sales_amount" }
        }
      }
    }
  }
}

This aggregation will create buckets for each unique combination of product category and customer country, with a nested aggregation to sum the sales amount for each bucket.
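
To make the bucket structure concrete, here is a sketch of the general shape of the response for the request above. The category, country, counts, and sums are made-up placeholder values, not real results. The point is that each bucket's key is an array with one value per field, in the order the fields were listed, and key_as_string joins those values with a pipe character.

```json
{
  "aggregations": {
    "sales_by_category_and_location": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": ["electronics", "US"],
          "key_as_string": "electronics|US",
          "doc_count": 3,
          "total_sales": { "value": 1200.0 }
        },
        {
          "key": ["books", "DE"],
          "key_as_string": "books|DE",
          "doc_count": 2,
          "total_sales": { "value": 45.5 }
        }
      ]
    }
  }
}
```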

Common Issues

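High cardinality: The number of possible buckets grows with the product of the distinct values in each field, so combining fields that have many unique values can consume significant memory and processing power.
Order matters: The order of fields in the terms array determines the order of the values in each bucket's key array and in key_as_string. It does not change which combinations are found.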
Missing values: By default, documents with missing values for any of the specified fields are excluded from the aggregation.

Best Practices

Use the size parameter (default 10) to limit the number of returned buckets and manage resource usage.
Consider setting the missing parameter on each terms entry so that documents lacking a field are counted under a placeholder value instead of being dropped.
For high-cardinality fields, consider using filters or other aggregations to reduce the dataset before applying multi_terms.
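
The three practices above can be combined in one request. The following is a minimal sketch, not a tuned configuration: the order_date field, the 30-day window, the bucket size of 20, and the "unknown" placeholder are all assumptions chosen for illustration. The query narrows the dataset before aggregation, size caps the number of returned buckets, and missing keeps documents that lack one of the fields. The top-level "size": 0 only suppresses search hits so that the response contains just the aggregation.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "order_date": { "gte": "now-30d/d" } } }
      ]
    }
  },
  "aggs": {
    "sales_by_category_and_location": {
      "multi_terms": {
        "size": 20,
        "terms": [
          { "field": "product.category", "missing": "unknown" },
          { "field": "customer.country", "missing": "unknown" }
        ]
      }
    }
  }
}
```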

Frequently Asked Questions

Q: How does Multi Terms Aggregation differ from nested Terms Aggregations?
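A: Multi Terms Aggregation returns a flat list of buckets, one per unique combination of values, so you can sort and take the top N combinations across the whole dataset. Nested Terms Aggregations return a hierarchy instead: the top values of the first field, and within each of those the top values of the second field. Multi Terms is not automatically the cheaper option. The official documentation notes that when sorting is not required and all values are expected to be retrieved, nested terms or composite aggregations are faster and more memory efficient.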
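
For comparison, here is a sketch of the nested Terms Aggregation form of the earlier e-commerce example, reusing the same field names. The aggregation names by_category and by_country are made up for illustration. Each category bucket in the response contains its own list of country buckets, rather than one flat list of category and country pairs.

```json
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "product.category" },
      "aggs": {
        "by_country": {
          "terms": { "field": "customer.country" },
          "aggs": {
            "total_sales": {
              "sum": { "field": "sales_amount" }
            }
          }
        }
      }
    }
  }
}
```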

Q: Can I use script fields in Multi Terms Aggregation?
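A: Yes, script-generated terms are supported, but how you supply the script depends on your Elasticsearch version. The current documentation does this with a runtime field: define the script under runtime_mappings and reference the runtime field by name in the terms array. Check the reference for your release before relying on an inline script in place of a field.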
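
One way to bucket on a computed value is a runtime field. This is a sketch under assumptions: country_normalized is a hypothetical runtime field name, customer.country is assumed to be mapped as a keyword, and lower-casing is just an example transformation. The size check skips documents with no value, so they emit nothing for that field.

```json
{
  "size": 0,
  "runtime_mappings": {
    "country_normalized": {
      "type": "keyword",
      "script": {
        "source": "if (doc['customer.country'].size() != 0) { emit(doc['customer.country'].value.toLowerCase()); }"
      }
    }
  },
  "aggs": {
    "sales_by_category_and_location": {
      "multi_terms": {
        "terms": [
          { "field": "product.category" },
          { "field": "country_normalized" }
        ]
      }
    }
  }
}
```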

Q: How can I control the order of buckets in the result?
A: You can use the order parameter to specify the sorting of buckets, either by count, a custom metric, or multiple criteria.
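
As an illustration, the e-commerce example can be sorted by its total_sales sub-aggregation so that the highest-revenue combinations come first. This is a sketch that reuses the field names from the earlier example. The size of 5 is an arbitrary choice.

```json
{
  "size": 0,
  "aggs": {
    "sales_by_category_and_location": {
      "multi_terms": {
        "size": 5,
        "terms": [
          { "field": "product.category" },
          { "field": "customer.country" }
        ],
        "order": { "total_sales": "desc" }
      },
      "aggs": {
        "total_sales": {
          "sum": { "field": "sales_amount" }
        }
      }
    }
  }
}
```

To sort by document count instead, use "order": { "_count": "asc" } or "desc".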

Q: Is there a limit to the number of fields I can use in a Multi Terms Aggregation?
A: The terms array needs at least two fields. There is no hard upper limit, but each additional field multiplies the number of possible combinations, which can lead to performance issues and very large result sets. As a rule of thumb, keep the number of fields small, typically no more than 3-5.

Q: Can Multi Terms Aggregation be used with nested objects?
A: Yes, Multi Terms Aggregation can be used with nested objects, but you need to wrap it in a nested aggregation to properly access the nested fields.
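
A minimal sketch of that wrapping, assuming a hypothetical nested field named items whose sub-fields items.category and items.brand are mapped as keywords. The aggregation names are also made up. The nested aggregation switches the context to the nested documents under the given path, and multi_terms then runs on fields inside that path.

```json
{
  "size": 0,
  "aggs": {
    "line_items": {
      "nested": { "path": "items" },
      "aggs": {
        "by_category_and_brand": {
          "multi_terms": {
            "terms": [
              { "field": "items.category" },
              { "field": "items.brand" }
            ]
          }
        }
      }
    }
  }
}
```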