The Cardinality Aggregation in Elasticsearch is used to calculate the approximate count of unique or distinct values in a field. It's particularly useful when you need to count the number of unique items in large datasets without the need for exact precision.
Syntax
{
  "aggs": {
    "unique_count": {
      "cardinality": {
        "field": "field_name"
      }
    }
  }
}
For more details, refer to the official Elasticsearch documentation on Cardinality Aggregation.
Example Usage
Here's an example of using the Cardinality Aggregation to count unique user IDs:
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}
Common Issues
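- High memory usage: Memory use is driven by precision_threshold rather than by dataset size (about precision_threshold * 8 bytes per aggregation instance), so a high threshold inside a parent aggregation with many buckets can consume significant memory.
- Precision vs. Performance: The default precision_threshold of 3000 may not be suitable for all use cases; it can be raised up to a maximum of 40000 at the cost of more memory.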
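- Missing values: By default, documents that have no value for the field are ignored; they are not counted as a distinct value. Use the missing parameter if they should be treated as having a value.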
Best Practices
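- Adjust the precision_threshold parameter based on your needs for accuracy vs. performance.
- Use the script parameter for complex cardinality calculations involving multiple fields, keeping in mind that scripted values are slower to process than a plain field.
- Remember that the aggregation is already based on the HyperLogLog++ algorithm; for extremely large datasets, precision_threshold is the setting that trades memory for accuracy.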
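As a rough sketch of the multi-field case, the request below combines two fields in a Painless script so that each distinct pair of values is counted once. The field names first_name and last_name are hypothetical; both are assumed to be single-valued keyword fields that are present on every document, since the script does not guard against missing values. Recent Elasticsearch versions recommend defining a runtime field for this instead of an inline script.

```console
# Hypothetical fields: first_name and last_name (assumed keyword, present on every document)
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_full_names": {
      "cardinality": {
        "script": {
          "lang": "painless",
          "source": "doc['first_name'].value + ' ' + doc['last_name'].value"
        }
      }
    }
  }
}
```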
Frequently Asked Questions
Q: How accurate is the Cardinality Aggregation?
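A: The Cardinality Aggregation provides an approximate count based on the HyperLogLog++ algorithm. Counts are expected to be close to exact while the true cardinality stays below the precision_threshold value (default 3000, maximum 40000). Above the threshold the error grows but usually remains within a few percent, although the exact accuracy depends on the dataset.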
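A minimal sketch of raising the threshold for the unique-user count from the earlier example. The value 10000 is an arbitrary illustration, not a recommendation; any value up to the supported maximum of 40000 is accepted, with memory use growing accordingly.

```console
# 10000 is an illustrative value; tune it to the cardinality you expect
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id",
        "precision_threshold": 10000
      }
    }
  }
}
```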
Q: Can I use Cardinality Aggregation on nested fields?
A: Yes, you can use Cardinality Aggregation on nested fields by wrapping it in a nested aggregation.
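For illustration, assuming a hypothetical nested field named comments with a keyword sub-field comments.author, the wrapping could look like this:

```console
# Hypothetical mapping: "comments" is a nested field, "comments.author" a keyword sub-field
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "comments": {
      "nested": {
        "path": "comments"
      },
      "aggs": {
        "unique_authors": {
          "cardinality": {
            "field": "comments.author"
          }
        }
      }
    }
  }
}
```

The distinct count is then returned under the nested aggregation's name in the response, alongside its doc_count.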
Q: How does Cardinality Aggregation handle null values?
A: By default, documents that have no value for the field are ignored and do not contribute to the count. If they should count as one additional distinct value, set the missing parameter to a placeholder value.
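A minimal sketch of controlling how documents without a user_id are treated: with the missing parameter set, such documents are counted as if they held the given value. The placeholder "N/A" is arbitrary, and user_id is assumed here to be a keyword field.

```console
# "N/A" is an arbitrary placeholder; user_id is assumed to be a keyword field
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_users_including_missing": {
      "cardinality": {
        "field": "user_id",
        "missing": "N/A"
      }
    }
  }
}
```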
Q: What's the difference between Cardinality Aggregation and Value Count Aggregation?
A: Cardinality Aggregation counts unique values, while Value Count Aggregation counts all values, including duplicates.
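To see the difference side by side, both aggregations can be requested on the same field in one search. This sketch reuses the user_id field from the earlier example; the aggregation names are arbitrary.

```console
# total_values counts every user_id value; unique_users counts distinct ones (approximately)
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "total_values": {
      "value_count": {
        "field": "user_id"
      }
    },
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}
```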
Q: Can Cardinality Aggregation be used with other aggregations?
A: Yes, Cardinality Aggregation can be combined with other aggregations like terms or date histogram for more complex analytics.
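One possible combination, sketched under the assumption that the index has a hypothetical keyword field named country: a terms aggregation creates one bucket per country, and a cardinality sub-aggregation counts the distinct users in each bucket.

```console
# Hypothetical field: country (assumed keyword); user_id as in the earlier example
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "by_country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "unique_users": {
          "cardinality": {
            "field": "user_id"
          }
        }
      }
    }
  }
}
```

Because each bucket holds its own cardinality state, memory use scales with the number of buckets times the configured precision_threshold.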