Elasticsearch Cardinality Aggregation - Syntax, Example, and Tips


The Cardinality Aggregation in Elasticsearch is a single-value metrics aggregation that calculates an approximate count of distinct values in a field. It is particularly useful for counting unique items in large datasets where an exact count is not required.

Syntax

{
  "aggs": {
    "unique_count": {
      "cardinality": {
        "field": "field_name"
      }
    }
  }
}

For more details, refer to the official Elasticsearch documentation on Cardinality Aggregation.

Example Usage

Here's an example of using the Cardinality Aggregation to count unique user IDs:

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}

Common Issues

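  1. High memory usage: Memory depends on precision_threshold rather than on dataset size, at roughly precision_threshold * 8 bytes per aggregation instance. It becomes significant when the aggregation runs inside many buckets, for example under a high-cardinality terms aggregation.
  2. Precision vs. Performance: The default precision_threshold of 3000 may not be suitable for all use cases. Counts above the threshold become less accurate, and the maximum supported value is 40000.
  3. Missing values: By default, documents that have no value (or a null value) for the field are ignored; they are not counted as a distinct value.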
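
As a minimal sketch of the precision trade-off, the request body below sets precision_threshold explicitly for a search such as GET /my_index/_search. The index name, the user_id field, and the value 10000 are illustrative assumptions, not recommendations.

```json
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id",
        "precision_threshold": 10000
      }
    }
  }
}
```

At roughly precision_threshold * 8 bytes, this setting would use about 80 KB per aggregation instance, so the cost mainly matters when the aggregation is repeated across many buckets.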

Best Practices

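  1. Adjust the precision_threshold parameter based on your needs for accuracy vs. performance: raise it (up to 40000) for near-exact counts, lower it to save memory.
  2. For cardinality calculations involving multiple fields, use a runtime field that combines them (or the script parameter on older versions). Scripts run per document and slow the aggregation down, so precompute a combined field at index time where possible.
  3. The aggregation is already based on the HyperLogLog++ algorithm, so nothing extra needs to be enabled for extremely large datasets. Its memory usage is bounded by precision_threshold regardless of how many distinct values exist.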
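
One way to count distinct combinations of several fields is a runtime field defined in the search request, sketched below for a search such as GET /my_index/_search. The first_name and last_name fields and the full_name runtime field are hypothetical; the sketch assumes both source fields are keyword fields with a value on every document and that the cluster supports runtime_mappings (Elasticsearch 7.11 or later).

```json
{
  "size": 0,
  "runtime_mappings": {
    "full_name": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['first_name'].value + ' ' + doc['last_name'].value)"
      }
    }
  },
  "aggs": {
    "unique_full_names": {
      "cardinality": {
        "field": "full_name"
      }
    }
  }
}
```

Because the script is evaluated for every matching document, this is likely to be noticeably slower than aggregating on a field that was combined at index time.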

Frequently Asked Questions

Q: How accurate is the Cardinality Aggregation?
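A: The Cardinality Aggregation provides an approximate count based on the HyperLogLog++ algorithm. Counts below the precision_threshold value (default 3000, maximum 40000) are expected to be close to exact. Above the threshold the error grows, although it typically stays within a few percent, depending on the data.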

Q: Can I use Cardinality Aggregation on nested fields?
A: Yes, you can use Cardinality Aggregation on nested fields by wrapping it in a nested aggregation.
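
A minimal sketch of that wrapping, as a request body for a search such as GET /my_index/_search. It assumes a hypothetical field comments mapped with type nested, containing a keyword sub-field author.

```json
{
  "size": 0,
  "aggs": {
    "comments": {
      "nested": {
        "path": "comments"
      },
      "aggs": {
        "unique_authors": {
          "cardinality": {
            "field": "comments.author"
          }
        }
      }
    }
  }
}
```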

Q: How does Cardinality Aggregation handle null values?
A: By default, documents with a null or missing value for the field are ignored. To count them as one distinct value, set the missing parameter to a placeholder value.
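
The missing parameter controls how documents without a value are treated: they are handled as if they had the given value. The sketch below is a request body for a search such as GET /my_index/_search; the placeholder "N/A" and the user_id field are illustrative assumptions.

```json
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id",
        "missing": "N/A"
      }
    }
  }
}
```

With this setting, all documents lacking user_id fall into a single "N/A" value and add at most one to the count.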

Q: What's the difference between Cardinality Aggregation and Value Count Aggregation?
A: Cardinality Aggregation counts unique values, while Value Count Aggregation counts all values, including duplicates.
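
To see the difference side by side, both aggregations can run on the same field in one request. This is a sketch of a request body for a search such as GET /my_index/_search; the aggregation names are arbitrary and user_id is assumed to exist in the index.

```json
{
  "size": 0,
  "aggs": {
    "total_user_id_values": {
      "value_count": {
        "field": "user_id"
      }
    },
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}
```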

Q: Can Cardinality Aggregation be used with other aggregations?
A: Yes, Cardinality Aggregation can be combined with other aggregations like terms or date histogram for more complex analytics.
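
As a sketch of one such combination, the request body below (for a search such as GET /my_index/_search) nests a cardinality aggregation under a date histogram to estimate unique users per day. The timestamp date field and the user_id field are assumptions, and calendar_interval requires Elasticsearch 7.2 or later.

```json
{
  "size": 0,
  "aggs": {
    "per_day": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day"
      },
      "aggs": {
        "unique_users": {
          "cardinality": {
            "field": "user_id"
          }
        }
      }
    }
  }
}
```

Each bucket gets its own estimate, so memory use scales with the number of buckets. The per-day counts cannot be summed to obtain the overall number of unique users, because the same user may appear on several days.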
