ClickHouse NULL Handling Functions: coalesce, ifNull, assumeNotNull, nullIf

ClickHouse provides a small family of functions for inspecting and manipulating NULL values: isNull, isNotNull, ifNull, nullIf, coalesce, and assumeNotNull. They look interchangeable at a glance, but they differ in semantics, in what they do to a column's nullability, and - importantly - in performance. Picking the wrong one can silently corrupt results (assumeNotNull on a real NULL) or quietly slow a query (coalesce in a hot path).

This guide unifies all of them: what each does, the type and nullability it returns, and when to reach for which. It complements the dedicated coalesce function page, which goes deeper on coalesce and ifNull specifically.

Why NULL Is Special in ClickHouse

A Nullable(T) column is not stored like a plain T. ClickHouse keeps the actual values in one subcolumn and a separate NULL mask (a UInt8 per row, exposed as the .null subcolumn) that records which rows are NULL. This is why the official docs warn that "using Nullable almost always negatively affects performance" - every read touches two subcolumns instead of one, and many optimizations are weaker on nullable data.

Two consequences flow from this design and explain most NULL "gotchas":

  1. Comparisons with NULL return NULL, not false. country = 'us' skips rows where country is NULL, because NULL = 'us' is NULL. Use IS NULL / IS NOT NULL or the functions below for NULL-aware logic.
  2. The underlying value still exists even when the mask says NULL. When the mask marks a row NULL, the value subcolumn typically holds the type's default (0, '', the first enum member). Functions that ignore the mask - namely assumeNotNull - expose that raw default, which is where undefined behavior comes from.

Before reaching for NULL-handling functions everywhere, consider whether the column needs to be Nullable at all. A non-nullable column with a sentinel default (0, -1, '') is often faster and removes an entire class of bugs.

The Functions at a Glance

Function Signature Returns Purpose
isNull isNull(x) UInt8 1 if x is NULL, else 0 (same as x IS NULL)
isNotNull isNotNull(x) UInt8 1 if x is not NULL, else 0 (same as x IS NOT NULL)
isNullable isNullable(x) UInt8 1 if x's type is Nullable, else 0 (type-level, not value-level)
ifNull ifNull(x, alt) common supertype x if not NULL, otherwise alt
nullIf nullIf(x, y) Nullable(x) NULL if x = y, otherwise x
coalesce coalesce(x[, y, ...]) common supertype leftmost non-NULL argument, or NULL if all are NULL
assumeNotNull assumeNotNull(x) non-Nullable T strips nullability; arbitrary result if x is actually NULL
toNullable toNullable(x) Nullable(T) adds nullability without changing the value

All of these are documented in the official functions-for-nulls reference.

isNull and isNotNull: Testing for NULL

isNull(x) and isNotNull(x) are the function forms of the SQL IS NULL / IS NOT NULL operators and return UInt8. They are NULL-aware - unlike =, they never return NULL themselves.

SELECT count() FROM events WHERE isNotNull(user_id);
-- identical to:
SELECT count() FROM events WHERE user_id IS NOT NULL;

For pure NULL counting on a large table, the .null subcolumn is the cheapest path because ClickHouse reads only the mask, not the values:

SELECT sum(user_id.null) AS null_count FROM events;

Note isNullable(x) is different: it answers a type question ("is this column declared Nullable?"), not a value question. It is rarely needed in application queries.

ifNull: Two-Argument Fallback

ifNull(x, alt) returns x when it is not NULL, otherwise alt. It is the two-argument special case of coalesce and reads more clearly when you have exactly one fallback.

SELECT ifNull(nickname, 'guest') AS name FROM users;

If alt is a non-nullable literal, the result is non-nullable - a clean way to drop nullability and supply a meaningful default in one step.

nullIf: Turning a Sentinel Back Into NULL

nullIf(x, y) returns NULL when x = y, otherwise x. It is the inverse of ifNull and is most useful for converting a magic sentinel value (an empty string, -1, 0) back into a real NULL.

-- Treat empty strings as missing
SELECT nullIf(email, '') AS email FROM contacts;

-- Avoid division-by-zero by nulling the denominator
SELECT revenue / nullIf(orders, 0) AS avg_order_value FROM daily_stats;

The division example is the classic idiom: x / nullIf(y, 0) yields NULL instead of raising or returning inf/nan when y is zero.

coalesce: Multi-Argument Fallback

coalesce(x[, y, ...]) returns the leftmost non-NULL argument, or NULL if every argument is NULL. Unlike some SQL databases, ClickHouse's coalesce evaluates all arguments — it does not short-circuit once a non-NULL is found. Use it when you have a chain of fallbacks; for a single fallback, prefer ifNull for readability.

SELECT coalesce(display_name, first_name, 'Anonymous') AS name FROM users;

The coalesce function page covers type-compatibility rules and WHERE-clause caveats in detail.

assumeNotNull: Strip Nullability (Carefully)

assumeNotNull(x) returns the value of a Nullable(T) column as a plain, non-Nullable T. Critically, it does not check the NULL mask. The official documentation states plainly: "If the original value is NULL, an arbitrary result can be returned." In practice that arbitrary result is usually the type's default (0, '', the first enum member), but it is not contractual - it has changed across versions and after intermediate calculations, so you must never rely on it.

Use assumeNotNull only when you can guarantee the input has no NULLs, and you want to remove the Nullable wrapper to unlock faster, simpler downstream processing. A common safe pattern is after an explicit filter:

SELECT assumeNotNull(user_id) AS user_id
FROM events
WHERE user_id IS NOT NULL;   -- guarantees no NULLs reach assumeNotNull

The Altinity Knowledge Base's assumeNotNull and friends article demonstrates the failure mode: applied to a genuinely NULL enum, assumeNotNull can surface an arbitrary enum member rather than anything meaningful. Their guidance is direct: "If it's possible to have Null values, it's better to use ifNull function instead."

The difference is intent. assumeNotNull is a promise you make to the engine ("trust me, no NULLs here") with no fallback. ifNull/coalesce are defensive - they define exactly what happens when a NULL appears.

ifNull / coalesce vs. assumeNotNull

Aspect ifNull / coalesce assumeNotNull
On a real NULL Returns your defined fallback Arbitrary/undefined value
NULL mask checked? Yes No
Result nullability Non-nullable if a non-nullable fallback is given Always non-nullable
Performance Vectorized, but heavier than a raw read Near-zero overhead (ignores the mask)
Use when NULLs are possible and you want a safe default You can prove there are no NULLs

Performance Notes

  • Nullable itself has a cost. The separate NULL-mask subcolumn means every read of a nullable column touches extra data, and the docs explicitly warn it "almost always negatively affects performance." Prefer non-nullable columns with sentinel defaults where the semantics allow.
  • assumeNotNull is essentially free, because it skips the mask entirely and just reinterprets the value subcolumn. That is exactly why it is dangerous on real NULLs - the speed comes from not checking.
  • coalesce/ifNull are vectorized but do real work per row; benchmarks on the Altinity KB show coalesce running far slower than a raw nullable scan (hundreds of millions vs. tens of billions of rows/second). In tight inner loops this matters - hoist the fallback out, or eliminate the Nullable upstream.
  • NULL functions in WHERE can defeat index pruning. Wrapping an indexed column - e.g. WHERE coalesce(country, 'us') = 'us' - blocks predicate pushdown to the MergeTree primary index and may force a full scan. Rewrite as WHERE country = 'us' OR country IS NULL.

Common Issues and Gotchas

  • Comparing with NULL silently drops rows. x = NULL and x <> 'a' both evaluate to NULL for NULL rows, so those rows vanish from a filter. Use IS NULL / IS NOT NULL or wrap with coalesce/ifNull first.
  • assumeNotNull on a NULL is a correctness bug, not an error. It does not throw - it returns garbage. The result can also differ from a plain default once intermediate computations are involved. Always gate it behind a proven no-NULL guarantee.
  • Nullable is contagious. Any expression involving a Nullable operand is itself Nullable. End fallback chains with a non-nullable literal (coalesce(a, b, 0)) when you need a non-nullable result downstream.
  • JOINs introduce NULLs you did not declare. LEFT/FULL joins fill unmatched rows with NULLs (or defaults) depending on the join_use_nulls setting. A column that is non-nullable in storage can still arrive NULL after an outer join, so do not assumeNotNull a joined column without checking this.
  • Aggregate -If combinators already ignore NULL conditions correctly, but the aggregated column's NULLs still follow standard aggregate rules (e.g. sum skips NULLs). See countIf for how conditional aggregation interacts with nullability.

Best Practices

  1. Avoid Nullable unless you genuinely need to distinguish "missing" from "zero/empty." A sentinel default is usually faster and simpler.
  2. Use ifNull for one fallback, coalesce for a chain. They are explicit and safe.
  3. Use nullIf to convert sentinels to NULL - especially nullIf(denominator, 0) to dodge division errors.
  4. Reserve assumeNotNull for proven-no-NULL contexts, ideally immediately after an IS NOT NULL filter or where the schema guarantees it. When in doubt, use ifNull instead.
  5. Keep NULL functions out of WHERE clauses on indexed columns. Rewrite to OR-style predicates so the primary index still prunes.
  6. Strip nullability early (via ifNull/assumeNotNull right after the source/JOIN) so the bulk of your pipeline runs on cheaper non-nullable columns.

How Pulse Helps

NULL-handling bugs rarely announce themselves - they show up as a query that got 10x slower after an ETL change pushed a column's NULL ratio up, or as a result that is subtly wrong because an assumeNotNull met an unexpected NULL. Pulse provides AI-powered monitoring for ClickHouse that correlates query regressions to data-shape changes, including spikes in NULL ratios that defeat predicate pushdown or quietly degrade coalesce-heavy queries. It surfaces the offending query and the data change behind it, so you can fix the root cause instead of guessing.

Frequently Asked Questions

Q: What is the difference between assumeNotNull and ifNull?

ifNull(x, alt) checks the NULL mask and returns your defined alt when x is NULL. assumeNotNull(x) skips the mask entirely - if x is actually NULL it returns an arbitrary value the docs describe as undefined. Use assumeNotNull only when you can guarantee no NULLs; otherwise use ifNull.

Q: What does assumeNotNull return for a NULL value?

Officially, "an arbitrary result." In practice it is often the type's default (0, '', first enum member), but this is not contractual and has varied across versions and after intermediate calculations. Never depend on it - filter NULLs out first or use ifNull/coalesce.

Q: How do I avoid division by zero in ClickHouse?

Wrap the denominator in nullIf: revenue / nullIf(orders, 0). When orders is 0, nullIf returns NULL, so the division yields NULL instead of an error or inf/nan.

Q: Why is my query slow after adding a Nullable column?

Nullable stores a separate NULL-mask subcolumn, so every read touches extra data and several optimizations weaken. The docs warn it "almost always negatively affects performance." Consider a non-nullable column with a sentinel default, or strip nullability early with ifNull/assumeNotNull.

Q: How do I count NULLs in a column efficiently?

Use the .null subcolumn, which reads only the mask: SELECT sum(col.null) FROM table. This is cheaper than countIf(col IS NULL) because it never reads the value subcolumn.

Q: Does coalesce evaluate all of its arguments?

Yes. Unlike some SQL databases, ClickHouse's coalesce evaluates all arguments regardless of whether an earlier one is non-NULL. There is no short-circuit optimization. If you are passing expensive expressions as fallbacks, be aware they will always be computed. This is a known ClickHouse behavior stemming from its vectorized execution model.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.