The avgIf function in ClickHouse calculates the arithmetic mean of a column only over the rows that satisfy a condition you provide. It's the standard avg aggregate combined with ClickHouse's -If aggregate combinator, which gives you a one-line conditional average without subqueries or CASE WHEN gymnastics.
avgIf belongs to a family of conditional aggregates: sumIf, countIf, avgIf, minIf, maxIf, quantileIf, and so on. They share the same shape: aggregate function name plus If, with a condition as an extra argument.
Syntax
avgIf(value, condition)
value: the numeric expression you're averaging.condition: aUInt8(boolean) expression. Rows where this evaluates to 1 are included; 0 means skip.
The combinator form is roughly equivalent to:
avg(if(condition, value, NULL))
Same result, but avgIf lets ClickHouse skip rows directly during aggregation rather than materializing intermediate NULLs, which is faster on large data.
Practical Examples
Average response time, only for successful requests
SELECT
toStartOfHour(timestamp) AS hour,
avgIf(response_ms, status_code < 400) AS p_avg_ok
FROM requests
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour;
Hourly average response time for non-error requests, ignoring 4xx and 5xx responses in a single pass.
Multiple conditional metrics in one scan
SELECT
region,
countIf(status_code >= 500) AS server_errors,
countIf(status_code BETWEEN 400 AND 499) AS client_errors,
avgIf(response_ms, status_code < 400) AS avg_success_ms,
avgIf(response_ms, status_code >= 500) AS avg_5xx_ms,
avgIf(bytes_sent, status_code = 200) AS avg_200_bytes
FROM requests
GROUP BY region;
This is where the combinators really pay off. ClickHouse runs all the conditional aggregates in a single pass over the data, which is dramatically faster than running multiple subqueries with different WHERE clauses and joining them.
Combining multiple conditions
SELECT
avgIf(price, category = 'electronics' AND in_stock = 1) AS avg_in_stock_electronics
FROM products;
The condition is a regular boolean expression. Use AND, OR, NOT, BETWEEN, IN, or anything that returns UInt8.
Using avgIf in a HAVING clause
SELECT category, avgIf(price, in_stock = 1) AS avg_in_stock_price
FROM products
GROUP BY category
HAVING avgIf(price, in_stock = 1) > 100;
Conditional aggregates work in HAVING exactly like any other aggregate.
NULL Handling
By default avgIf ignores NULL values in the value column. The average is computed over non-NULL, condition-matching rows only. Two cases worth knowing:
No rows match the condition.
avgIfreturnsNaN, notNULL. If you needNULLinstead, stack the-OrNullcombinator. Note that combinators have a defined order:-OrNullprecedes-If, so the function isavgOrNullIf, notavgIfOrNull:SELECT avgOrNullIf(price, in_stock = 1) FROM products WHERE category = 'unicorn'; -- returns NULL if no in-stock unicorns exist (rather than NaN)The condition itself evaluates to
NULL. ANULLcondition is treated as false. Rows where the condition isNULL(for example, comparing against aNullablecolumn where the value happens to beNULL) are excluded.
avgIf vs. avg(if(...)) Performance
The two forms are semantically equivalent:
avgIf(value, cond) -- combinator form
avg(if(cond, value, NULL)) -- expression form
Both produce the same result, but avgIf is generally faster because:
- The combinator skips non-matching rows directly inside the aggregation accumulator.
- The expression form creates an intermediate
Nullablecolumn, which adds memory and CPU overhead.
For one-off queries the difference is negligible. On wide aggregations over billions of rows, the combinator form noticeably reduces query time and memory usage.
Related Conditional Aggregates
The -If combinator works with virtually every aggregate function:
| Function | Purpose |
|---|---|
countIf(cond) |
Count rows matching the condition |
sumIf(value, cond) |
Sum values where condition is true |
avgIf(value, cond) |
Average values where condition is true |
minIf(value, cond) |
Minimum of matching values |
maxIf(value, cond) |
Maximum of matching values |
uniqIf(value, cond) |
Approximate distinct count of matching values |
quantileIf(value, cond) |
Quantile (e.g., p95) of matching values |
groupArrayIf(value, cond) |
Collect matching values into an array |
You can also stack combinators, but the order matters. ClickHouse applies combinators in a fixed sequence: -Array first, then -If, then -State/-Merge, then -OrNull/-OrDefault. So the valid stacked forms are avgOrNullIf (not avgIfOrNull), avgIfState and avgIfMerge (paired for the aggregating-table pattern), and quantilesArrayIf (not quantilesIfArray). See the ClickHouse aggregate-combinators reference for the full grammar.
Common Mistakes
- Passing the wrong type as the condition.
avgIf(price, category)fails becausecategory(a String) isn't aUInt8. The second argument has to be a boolean expression, not a column. - Expecting
NULLfrom no-matches. UseavgOrNullIfif you needNULLinstead ofNaN. Some visualization tools choke onNaNand silently drop the row. - Confusing
avgIfwithavg(...) WHERE ....WHEREfilters before grouping;avgIffilters per aggregate. If you want different aggregates with different filters in one query (success-rate, error-count, average-success-latency), only the combinator form lets you do it in a single pass. - Floating-point overflow on integer averages.
avgIfreturnsFloat64. For exact arithmetic on currency, average an integer cents column and divide on the client, or useDecimalaggregations.
Best Practices
- Use it for multi-conditional metrics. Wherever you'd be tempted to write multiple
WHERE-filtered subqueries and join them, a singleSELECTwith several*Ifaggregates is faster and cleaner. - Combine with
-Stateand-Mergefor materialized views.avgIfStateproduces an aggregating-state column you can store in a materialized view and merge later. This is the right pattern for pre-aggregated dashboards. - Order tables by the columns you'll filter on. ClickHouse uses the
MergeTreeprimary key to prune partitions and granules beforeavgIfruns. The query plan does the heavy lifting here. - Watch the cardinality of the condition. If your condition is highly selective (1% of rows),
avgIfis exactly what you want. If it matches 95% of rows, a plainavgwith aWHEREis usually fine and avoids the combinator overhead.
ClickHouse Cluster Health Affects Aggregate Performance
avgIf and other aggregate combinators are CPU-bound. When dashboards using these aggregates suddenly slow down, the cause is rarely the SQL itself. It's usually the cluster: hot replicas, lagging part merges, memory pressure, or skewed shard distribution.
Pulse provides AI-powered monitoring for ClickHouse clusters. It surfaces slow merges, memory and disk pressure, replication lag, and skewed query distribution, with alerts that fire before dashboards start timing out. Start a free trial to see what it looks like on your own cluster.
Frequently Asked Questions
Q: Can I use multiple conditions inside avgIf?
A: Yes. The condition is any boolean expression: avgIf(price, category = 'electronics' AND in_stock = 1 AND price > 0).
Q: What does avgIf return when no rows match?
A: NaN. Wrap with -OrNull (avgOrNullIf — the -OrNull combinator goes before -If) if you'd prefer NULL for cleaner downstream handling.
Q: How does avgIf handle NULL values in the value column?
A: It ignores them. Only rows where the condition is true and the value is non-NULL contribute to the average.
Q: Is avgIf faster than avg(if(condition, value, NULL))?
A: Yes. The -If combinator skips non-matching rows during aggregation rather than building an intermediate Nullable column. On large tables the combinator form is meaningfully faster and uses less memory.
Q: Can I use avgIf with subqueries?
A: Yes. The condition can include subqueries: avgIf(price, id IN (SELECT id FROM whitelist)). ClickHouse executes the subquery once and treats the result as a constant for the aggregation.
Q: Can avgIf be used in a HAVING clause?
A: Yes. HAVING avgIf(price, category = 'electronics') > 100 works exactly like any other aggregate in HAVING.
Q: What's the difference between avgIf and avg with a WHERE clause?
A: WHERE filters rows globally before any aggregation. avgIf filters per aggregate. The big advantage of avgIf is that you can compute multiple aggregates with different filters in one query (avg latency for successes, count of 5xx errors, etc.), which WHERE can't do.
Q: Are there other conditional aggregates besides avgIf?
A: Yes. countIf, sumIf, minIf, maxIf, uniqIf, quantileIf, groupArrayIf, and many more. The -If combinator works with virtually every aggregate function in ClickHouse.
Related Reading
- ClickHouse sumIf Function: sister combinator for conditional sums
- ClickHouse countIf Function: conditional row counting
- ClickHouse avg Function: the unconditional average
- ClickHouse documentation: full ClickHouse knowledge base