The "DB::Exception: Illegal aggregation" error in ClickHouse is raised when an aggregate function is used in a context where it is not permitted. The error code is ILLEGAL_AGGREGATION. The most common trigger is nesting aggregate functions (e.g., sum(count(*))) or using aggregate functions in clauses where they are not allowed, such as WHERE or JOIN ON.
Impact
The query fails immediately at the analysis stage without executing. No data is processed. This is a query structure issue that requires rewriting the SQL to properly separate aggregation levels.
Common Causes
- Nesting aggregate functions, such as
max(sum(column)), which ClickHouse does not allow in a single query level. - Using an aggregate function in a
WHEREclause instead ofHAVING. - Referencing an aggregate function inside a
JOINcondition. - Using aggregate functions in a
PREWHEREclause. - Combining aggregate and non-aggregate expressions in a
SELECTwithout aGROUP BYclause.
Troubleshooting and Resolution Steps
If you have nested aggregation, restructure the query using a subquery:
-- Wrong: nested aggregation -- SELECT max(sum(amount)) FROM orders GROUP BY customer_id; -- Right: use a subquery SELECT max(total_amount) FROM ( SELECT sum(amount) AS total_amount FROM orders GROUP BY customer_id );If an aggregate function is in a
WHEREclause, move it toHAVING:-- Wrong -- SELECT customer_id, sum(amount) FROM orders WHERE sum(amount) > 100 GROUP BY customer_id; -- Right SELECT customer_id, sum(amount) FROM orders GROUP BY customer_id HAVING sum(amount) > 100;If you need to filter rows based on aggregated values before joining, compute the aggregation in a subquery first:
SELECT o.customer_id, o.total_amount, c.name FROM ( SELECT customer_id, sum(amount) AS total_amount FROM orders GROUP BY customer_id ) AS o JOIN customers AS c ON o.customer_id = c.id;When mixing aggregate and non-aggregate columns in
SELECT, ensure all non-aggregated columns are in theGROUP BYclause:SELECT customer_id, status, count() FROM orders GROUP BY customer_id, status;For complex multi-level aggregations, use CTEs (Common Table Expressions) to make the query more readable:
WITH customer_totals AS ( SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id ) SELECT max(total) AS max_customer_total FROM customer_totals;
Best Practices
- Break complex aggregation queries into subqueries or CTEs, with each level handling one aggregation step.
- Use
HAVINGfor post-aggregation filtering rather than attempting to use aggregates inWHERE. - Review ClickHouse's SQL reference for which clauses accept aggregate functions and which do not.
- Test aggregate queries incrementally, starting with the innermost aggregation level.
Frequently Asked Questions
Q: Can I nest aggregate functions in ClickHouse?
A: No. ClickHouse does not support directly nesting aggregate functions like max(sum(x)). You need to use a subquery or CTE to perform multi-level aggregation.
Q: What is the difference between WHERE and HAVING for aggregate filtering?
A: WHERE filters individual rows before aggregation and cannot reference aggregate functions. HAVING filters groups after aggregation and can reference aggregate expressions.
Q: Can I use aggregate functions in window function expressions?
A: Yes. ClickHouse supports aggregate functions within window functions, such as sum(amount) OVER (PARTITION BY customer_id). This is different from illegal nesting of aggregate functions.
Q: Why does ClickHouse not allow nested aggregation when some other databases seem to?
A: Most SQL databases also prohibit nested aggregation at the same query level. Some may appear to allow it through implicit query rewriting, but the standard SQL approach is to use subqueries for multi-level aggregation, which ClickHouse follows.