Elasticsearch Terms Aggregation - Syntax, Example, and Tips

The Terms Aggregation is a multi-bucket aggregation that groups documents based on field values. It creates a bucket for each unique term in a specified field, allowing you to analyze the distribution of values within your dataset.

Syntax

"aggs": {
  "NAME": {
    "terms": {
      "field": "FIELD_NAME",
      "size": 10
    }
  }
}

For more details, refer to the official Elasticsearch documentation.

Example Usage

GET /my-index/_search
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "color",
        "size": 5
      }
    }
  }
}

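This example returns the five most common values of the "color" field in the "my-index" index. Setting the top-level "size" to 0 omits the search hits, so the response contains only the aggregation results.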
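To make the shape of the result concrete, here is a minimal Python sketch that reads the buckets out of a terms aggregation response. The sample_response dict is invented for illustration (the counts are not real data), and the helper names are hypothetical. Only the keys buckets, key, doc_count, doc_count_error_upper_bound, and sum_other_doc_count reflect the Elasticsearch response format.

```python
# Hypothetical response for the popular_colors request; the numbers are made up.
sample_response = {
    "aggregations": {
        "popular_colors": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 12,
            "buckets": [
                {"key": "red", "doc_count": 40},
                {"key": "blue", "doc_count": 25},
                {"key": "green", "doc_count": 9},
            ],
        }
    }
}


def top_terms(response, agg_name):
    """Return (term, count) pairs in the order the buckets were returned."""
    agg = response["aggregations"][agg_name]
    return [(bucket["key"], bucket["doc_count"]) for bucket in agg["buckets"]]


def other_doc_count(response, agg_name):
    """Return the number of documents whose terms fell outside the returned buckets."""
    return response["aggregations"][agg_name]["sum_other_doc_count"]


print(top_terms(sample_response, "popular_colors"))
print(other_doc_count(sample_response, "popular_colors"))
```

A non-zero sum_other_doc_count is a hint that more terms exist than the requested size allowed into the response.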

Common Issues

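High-cardinality fields: Terms aggregation can be memory-intensive for fields with many unique values. Because each shard returns only its own top terms, the document counts for such fields can also be approximate.
Incorrect field mapping: Aggregate on a keyword field (numeric, date, boolean, and ip fields also work). A text field works only with fielddata enabled, which increases heap usage and buckets on analyzed tokens rather than whole values.
Missing values: By default, documents without the specified field are ignored. Use the missing parameter to count them under a value of your choice.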
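As a rough sketch of the last two points, the Python snippet below builds a request body that aggregates on a keyword sub-field and uses the missing parameter so that documents without a value are counted under a placeholder. The sub-field name color.keyword, the aggregation name by_value, and the "N/A" label are assumptions for illustration; your mapping may use different names.

```python
import json


def terms_with_missing(field, missing_label, size=10):
    """Build a terms aggregation body that also buckets documents lacking `field`."""
    return {
        "size": 0,  # skip search hits, return aggregations only
        "aggs": {
            "by_value": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "size": size,
                    # Documents without the field are counted under this value.
                    "missing": missing_label,
                }
            }
        },
    }


# Assumes `color` is a text field with a keyword sub-field named `color.keyword`.
body = terms_with_missing("color.keyword", "N/A", size=5)
print(json.dumps(body, indent=2))
```

The printed JSON is the body you would send to the _search endpoint with whichever client you use.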

Best Practices

Use the size parameter to limit the number of buckets returned.
Consider using the min_doc_count parameter to exclude rare terms.
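For high-cardinality fields, consider the composite aggregation to page through all terms, or the cardinality aggregation if you only need the number of unique values. The Significant Terms aggregation serves a different purpose: it finds terms that are unusually frequent in a subset of documents compared with the index as a whole.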
Use the order parameter to sort buckets by a specific metric or by key.
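The parameters mentioned above can be combined in a single request. The following Python sketch assembles such a body; the function name, the aggregation names top_terms and avg_metric, and the price field are hypothetical and stand in for whatever your index contains.

```python
import json


def tuned_terms_request(field, size, min_doc_count, metric_field):
    """Build a terms aggregation sorted by the average of a numeric field."""
    return {
        "size": 0,
        "aggs": {
            "top_terms": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "size": size,                    # cap the number of buckets
                    "min_doc_count": min_doc_count,  # drop rare terms
                    "order": {"avg_metric": "desc"},  # sort by the sub-aggregation
                },
                # Single-value metric that the order clause refers to by name.
                "aggs": {"avg_metric": {"avg": {"field": metric_field}}},
            }
        },
    }


# `price` is an assumed numeric field, used here only for illustration.
request_body = tuned_terms_request("color", size=5, min_doc_count=10, metric_field="price")
print(json.dumps(request_body, indent=2))
```

To sort alphabetically by term instead, the order clause could be replaced with {"_key": "asc"}, in which case the sub-aggregation is not needed.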

Frequently Asked Questions

Q: How can I get the total count of unique terms?
A: Use the cardinality aggregation instead of terms aggregation for an approximate count of unique values.
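A minimal Python sketch of such a request follows. The function name and the aggregation name unique_values are hypothetical. The optional precision_threshold parameter trades memory for accuracy, and the count is returned under the value key of the aggregation in the response.

```python
import json


def unique_count_request(field, precision_threshold=None):
    """Build a cardinality aggregation body for an approximate distinct count."""
    cardinality = {"field": field}
    if precision_threshold is not None:
        # Below this many distinct values, counts are expected to be close to exact.
        cardinality["precision_threshold"] = precision_threshold
    return {
        "size": 0,
        "aggs": {"unique_values": {"cardinality": cardinality}},  # hypothetical name
    }


count_body = unique_count_request("color", precision_threshold=1000)
print(json.dumps(count_body, indent=2))
```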

Q: Can I use Terms Aggregation on numeric fields?
A: Yes. Each distinct number gets its own bucket, which suits discrete values such as status codes or IDs. For continuous values, Range or Histogram aggregations are usually more appropriate.
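For comparison, here is a short Python sketch of a histogram request, which groups a numeric field into fixed-width buckets rather than one bucket per distinct value. The price field, the interval of 50, and the aggregation name are assumptions chosen for illustration.

```python
import json


def numeric_histogram_request(field, interval):
    """Build a histogram aggregation body with fixed-width numeric buckets."""
    return {
        "size": 0,
        "aggs": {
            "value_distribution": {  # hypothetical aggregation name
                "histogram": {"field": field, "interval": interval}
            }
        },
    }


# `price` is an assumed numeric field; each bucket spans 50 units.
histogram_body = numeric_histogram_request("price", 50)
print(json.dumps(histogram_body, indent=2))
```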

Q: How does Terms Aggregation handle case sensitivity?
A: On a keyword field it is case-sensitive by default, so "Red" and "red" end up in separate buckets. For case-insensitive buckets, define a lowercase normalizer on the keyword field in the index mapping.
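One way to set this up is sketched below in Python: an index-creation body that defines a custom normalizer with the lowercase filter and applies it to a keyword field. The normalizer name lowercase_normalizer is an arbitrary choice, and the Counter lines are only a local illustration of how lowercasing merges bucket keys; they do not call Elasticsearch.

```python
import json
from collections import Counter

# Body for creating an index whose `color` field is normalized to lowercase.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {  # arbitrary, user-chosen name
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "color": {"type": "keyword", "normalizer": "lowercase_normalizer"}
        }
    },
}
print(json.dumps(index_body, indent=2))

# Local illustration only: the effect of lowercasing on bucket keys.
raw_values = ["Red", "red", "RED", "Blue"]
case_sensitive = Counter(raw_values)
case_insensitive = Counter(value.lower() for value in raw_values)
print(case_sensitive)
print(case_insensitive)
```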

Q: What's the maximum number of buckets Terms Aggregation can return?
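A: The total number of buckets in a search response is capped by the search.max_buckets cluster setting, which defaults to 65,536 in recent releases (10,000 in some earlier 7.x versions). The index.max_terms_count setting is unrelated: it limits the number of terms allowed in a terms query.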

Q: How can I get terms that don't meet a minimum document count?
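A: Set min_doc_count to 0 and specify a size large enough to include them. This also returns terms that exist in the index but match no documents for the current query. If you specifically want terms that occur only a few times, consider the rare_terms aggregation rather than sorting by ascending count, which can give inaccurate results.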
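The two request shapes can be sketched in Python as follows: a terms aggregation with min_doc_count set to 0, and, as an alternative worth considering for infrequent values, a rare_terms aggregation with a max_doc_count threshold. The function names, the aggregation names, and the chosen sizes are assumptions for illustration.

```python
import json


def zero_count_terms_request(field, size):
    """Build a terms aggregation that also returns terms with no matching documents."""
    return {
        "size": 0,
        "aggs": {
            "all_terms": {  # hypothetical aggregation name
                "terms": {"field": field, "size": size, "min_doc_count": 0}
            }
        },
    }


def rare_terms_request(field, max_doc_count=1):
    """Build a rare_terms aggregation for terms seen at most `max_doc_count` times."""
    return {
        "size": 0,
        "aggs": {
            "rare_values": {  # hypothetical aggregation name
                "rare_terms": {"field": field, "max_doc_count": max_doc_count}
            }
        },
    }


zero_body = zero_count_terms_request("color", size=100)
rare_body = rare_terms_request("color", max_doc_count=2)
print(json.dumps(zero_body, indent=2))
print(json.dumps(rare_body, indent=2))
```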