NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse avgIf Function: Conditional Average with Examples

The avgIf function in ClickHouse calculates the arithmetic mean of a column only over the rows that satisfy a condition you provide. It's the standard avg aggregate combined with ClickHouse's -If aggregate combinator, which gives you a one-line conditional average without subqueries or CASE WHEN gymnastics.

avgIf belongs to a family of conditional aggregates: sumIf, countIf, avgIf, minIf, maxIf, quantileIf, and so on. They share the same shape: aggregate function name plus If, with a condition as an extra argument.

Syntax

avgIf(value, condition)
  • value: the numeric expression you're averaging.
  • condition: a UInt8 (boolean) expression. Rows where this evaluates to 1 are included; 0 means skip.

The combinator form is roughly equivalent to:

avg(if(condition, value, NULL))

Same result, but avgIf lets ClickHouse skip rows directly during aggregation rather than materializing intermediate NULLs, which is faster on large data.

Practical Examples

Average response time, only for successful requests

SELECT
    toStartOfHour(timestamp) AS hour,
    avgIf(response_ms, status_code < 400) AS p_avg_ok
FROM requests
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour;

Hourly average response time for non-error requests, ignoring 4xx and 5xx responses in a single pass.

Multiple conditional metrics in one scan

SELECT
    region,
    countIf(status_code >= 500)              AS server_errors,
    countIf(status_code BETWEEN 400 AND 499) AS client_errors,
    avgIf(response_ms, status_code < 400)    AS avg_success_ms,
    avgIf(response_ms, status_code >= 500)   AS avg_5xx_ms,
    avgIf(bytes_sent,  status_code = 200)    AS avg_200_bytes
FROM requests
GROUP BY region;

This is where the combinators really pay off. ClickHouse runs all the conditional aggregates in a single pass over the data, which is dramatically faster than running multiple subqueries with different WHERE clauses and joining them.

Combining multiple conditions

SELECT
    avgIf(price, category = 'electronics' AND in_stock = 1) AS avg_in_stock_electronics
FROM products;

The condition is a regular boolean expression. Use AND, OR, NOT, BETWEEN, IN, or anything that returns UInt8.

Using avgIf in a HAVING clause

SELECT category, avgIf(price, in_stock = 1) AS avg_in_stock_price
FROM products
GROUP BY category
HAVING avgIf(price, in_stock = 1) > 100;

Conditional aggregates work in HAVING exactly like any other aggregate.

NULL Handling

By default avgIf ignores NULL values in the value column. The average is computed over non-NULL, condition-matching rows only. Two cases worth knowing:

  1. No rows match the condition. avgIf returns NaN, not NULL. If you need NULL instead, stack the -OrNull combinator. Note that combinators have a defined order: -OrNull precedes -If, so the function is avgOrNullIf, not avgIfOrNull:

    SELECT avgOrNullIf(price, in_stock = 1) FROM products WHERE category = 'unicorn';
    -- returns NULL if no in-stock unicorns exist (rather than NaN)
    
  2. The condition itself evaluates to NULL. A NULL condition is treated as false. Rows where the condition is NULL (for example, comparing against a Nullable column where the value happens to be NULL) are excluded.

avgIf vs. avg(if(...)) Performance

The two forms are semantically equivalent:

avgIf(value, cond)             -- combinator form
avg(if(cond, value, NULL))     -- expression form

Both produce the same result, but avgIf is generally faster because:

  • The combinator skips non-matching rows directly inside the aggregation accumulator.
  • The expression form creates an intermediate Nullable column, which adds memory and CPU overhead.

For one-off queries the difference is negligible. On wide aggregations over billions of rows, the combinator form noticeably reduces query time and memory usage.

The -If combinator works with virtually every aggregate function:

Function Purpose
countIf(cond) Count rows matching the condition
sumIf(value, cond) Sum values where condition is true
avgIf(value, cond) Average values where condition is true
minIf(value, cond) Minimum of matching values
maxIf(value, cond) Maximum of matching values
uniqIf(value, cond) Approximate distinct count of matching values
quantileIf(value, cond) Quantile (e.g., p95) of matching values
groupArrayIf(value, cond) Collect matching values into an array

You can also stack combinators, but the order matters. ClickHouse applies combinators in a fixed sequence: -Array first, then -If, then -State/-Merge, then -OrNull/-OrDefault. So the valid stacked forms are avgOrNullIf (not avgIfOrNull), avgIfState and avgIfMerge (paired for the aggregating-table pattern), and quantilesArrayIf (not quantilesIfArray). See the ClickHouse aggregate-combinators reference for the full grammar.

Common Mistakes

  1. Passing the wrong type as the condition. avgIf(price, category) fails because category (a String) isn't a UInt8. The second argument has to be a boolean expression, not a column.
  2. Expecting NULL from no-matches. Use avgOrNullIf if you need NULL instead of NaN. Some visualization tools choke on NaN and silently drop the row.
  3. Confusing avgIf with avg(...) WHERE .... WHERE filters before grouping; avgIf filters per aggregate. If you want different aggregates with different filters in one query (success-rate, error-count, average-success-latency), only the combinator form lets you do it in a single pass.
  4. Floating-point overflow on integer averages. avgIf returns Float64. For exact arithmetic on currency, average an integer cents column and divide on the client, or use Decimal aggregations.

Best Practices

  1. Use it for multi-conditional metrics. Wherever you'd be tempted to write multiple WHERE-filtered subqueries and join them, a single SELECT with several *If aggregates is faster and cleaner.
  2. Combine with -State and -Merge for materialized views. avgIfState produces an aggregating-state column you can store in a materialized view and merge later. This is the right pattern for pre-aggregated dashboards.
  3. Order tables by the columns you'll filter on. ClickHouse uses the MergeTree primary key to prune partitions and granules before avgIf runs. The query plan does the heavy lifting here.
  4. Watch the cardinality of the condition. If your condition is highly selective (1% of rows), avgIf is exactly what you want. If it matches 95% of rows, a plain avg with a WHERE is usually fine and avoids the combinator overhead.

ClickHouse Cluster Health Affects Aggregate Performance

avgIf and other aggregate combinators are CPU-bound. When dashboards using these aggregates suddenly slow down, the cause is rarely the SQL itself. It's usually the cluster: hot replicas, lagging part merges, memory pressure, or skewed shard distribution.

Pulse provides AI-powered monitoring for ClickHouse clusters. It surfaces slow merges, memory and disk pressure, replication lag, and skewed query distribution, with alerts that fire before dashboards start timing out. Start a free trial to see what it looks like on your own cluster.

Frequently Asked Questions

Q: Can I use multiple conditions inside avgIf?
A: Yes. The condition is any boolean expression: avgIf(price, category = 'electronics' AND in_stock = 1 AND price > 0).

Q: What does avgIf return when no rows match?
A: NaN. Wrap with -OrNull (avgOrNullIf — the -OrNull combinator goes before -If) if you'd prefer NULL for cleaner downstream handling.

Q: How does avgIf handle NULL values in the value column?
A: It ignores them. Only rows where the condition is true and the value is non-NULL contribute to the average.

Q: Is avgIf faster than avg(if(condition, value, NULL))?
A: Yes. The -If combinator skips non-matching rows during aggregation rather than building an intermediate Nullable column. On large tables the combinator form is meaningfully faster and uses less memory.

Q: Can I use avgIf with subqueries?
A: Yes. The condition can include subqueries: avgIf(price, id IN (SELECT id FROM whitelist)). ClickHouse executes the subquery once and treats the result as a constant for the aggregation.

Q: Can avgIf be used in a HAVING clause?
A: Yes. HAVING avgIf(price, category = 'electronics') > 100 works exactly like any other aggregate in HAVING.

Q: What's the difference between avgIf and avg with a WHERE clause?
A: WHERE filters rows globally before any aggregation. avgIf filters per aggregate. The big advantage of avgIf is that you can compute multiple aggregates with different filters in one query (avg latency for successes, count of 5xx errors, etc.), which WHERE can't do.

Q: Are there other conditional aggregates besides avgIf?
A: Yes. countIf, sumIf, minIf, maxIf, uniqIf, quantileIf, groupArrayIf, and many more. The -If combinator works with virtually every aggregate function in ClickHouse.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.