ClickHouse provides a small family of functions for inspecting and manipulating NULL values: isNull, isNotNull, ifNull, nullIf, coalesce, and assumeNotNull. They look interchangeable at a glance, but they differ in semantics, in what they do to a column's nullability, and - importantly - in performance. Picking the wrong one can silently corrupt results (assumeNotNull on a real NULL) or quietly slow a query (coalesce in a hot path).
This guide unifies all of them: what each does, the type and nullability it returns, and when to reach for which. It complements the dedicated coalesce function page, which goes deeper on coalesce and ifNull specifically.
Why NULL Is Special in ClickHouse
A Nullable(T) column is not stored like a plain T. ClickHouse keeps the actual values in one subcolumn and a separate NULL mask (a UInt8 per row, exposed as the .null subcolumn) that records which rows are NULL. This is why the official docs warn that "using Nullable almost always negatively affects performance" - every read touches two subcolumns instead of one, and many optimizations are weaker on nullable data.
Two consequences flow from this design and explain most NULL "gotchas":
- Comparisons with
NULLreturnNULL, notfalse.country = 'us'skips rows wherecountryisNULL, becauseNULL = 'us'isNULL. UseIS NULL/IS NOT NULLor the functions below for NULL-aware logic. - The underlying value still exists even when the mask says NULL. When the mask marks a row NULL, the value subcolumn typically holds the type's default (
0,'', the first enum member). Functions that ignore the mask - namelyassumeNotNull- expose that raw default, which is where undefined behavior comes from.
Before reaching for NULL-handling functions everywhere, consider whether the column needs to be Nullable at all. A non-nullable column with a sentinel default (0, -1, '') is often faster and removes an entire class of bugs.
The Functions at a Glance
| Function | Signature | Returns | Purpose |
|---|---|---|---|
isNull |
isNull(x) |
UInt8 |
1 if x is NULL, else 0 (same as x IS NULL) |
isNotNull |
isNotNull(x) |
UInt8 |
1 if x is not NULL, else 0 (same as x IS NOT NULL) |
isNullable |
isNullable(x) |
UInt8 |
1 if x's type is Nullable, else 0 (type-level, not value-level) |
ifNull |
ifNull(x, alt) |
common supertype | x if not NULL, otherwise alt |
nullIf |
nullIf(x, y) |
Nullable(x) |
NULL if x = y, otherwise x |
coalesce |
coalesce(x[, y, ...]) |
common supertype | leftmost non-NULL argument, or NULL if all are NULL |
assumeNotNull |
assumeNotNull(x) |
non-Nullable T |
strips nullability; arbitrary result if x is actually NULL |
toNullable |
toNullable(x) |
Nullable(T) |
adds nullability without changing the value |
All of these are documented in the official functions-for-nulls reference.
isNull and isNotNull: Testing for NULL
isNull(x) and isNotNull(x) are the function forms of the SQL IS NULL / IS NOT NULL operators and return UInt8. They are NULL-aware - unlike =, they never return NULL themselves.
SELECT count() FROM events WHERE isNotNull(user_id);
-- identical to:
SELECT count() FROM events WHERE user_id IS NOT NULL;
For pure NULL counting on a large table, the .null subcolumn is the cheapest path because ClickHouse reads only the mask, not the values:
SELECT sum(user_id.null) AS null_count FROM events;
Note isNullable(x) is different: it answers a type question ("is this column declared Nullable?"), not a value question. It is rarely needed in application queries.
ifNull: Two-Argument Fallback
ifNull(x, alt) returns x when it is not NULL, otherwise alt. It is the two-argument special case of coalesce and reads more clearly when you have exactly one fallback.
SELECT ifNull(nickname, 'guest') AS name FROM users;
If alt is a non-nullable literal, the result is non-nullable - a clean way to drop nullability and supply a meaningful default in one step.
nullIf: Turning a Sentinel Back Into NULL
nullIf(x, y) returns NULL when x = y, otherwise x. It is the inverse of ifNull and is most useful for converting a magic sentinel value (an empty string, -1, 0) back into a real NULL.
-- Treat empty strings as missing
SELECT nullIf(email, '') AS email FROM contacts;
-- Avoid division-by-zero by nulling the denominator
SELECT revenue / nullIf(orders, 0) AS avg_order_value FROM daily_stats;
The division example is the classic idiom: x / nullIf(y, 0) yields NULL instead of raising or returning inf/nan when y is zero.
coalesce: Multi-Argument Fallback
coalesce(x[, y, ...]) returns the leftmost non-NULL argument, or NULL if every argument is NULL. Unlike some SQL databases, ClickHouse's coalesce evaluates all arguments — it does not short-circuit once a non-NULL is found. Use it when you have a chain of fallbacks; for a single fallback, prefer ifNull for readability.
SELECT coalesce(display_name, first_name, 'Anonymous') AS name FROM users;
The coalesce function page covers type-compatibility rules and WHERE-clause caveats in detail.
assumeNotNull: Strip Nullability (Carefully)
assumeNotNull(x) returns the value of a Nullable(T) column as a plain, non-Nullable T. Critically, it does not check the NULL mask. The official documentation states plainly: "If the original value is NULL, an arbitrary result can be returned." In practice that arbitrary result is usually the type's default (0, '', the first enum member), but it is not contractual - it has changed across versions and after intermediate calculations, so you must never rely on it.
Use assumeNotNull only when you can guarantee the input has no NULLs, and you want to remove the Nullable wrapper to unlock faster, simpler downstream processing. A common safe pattern is after an explicit filter:
SELECT assumeNotNull(user_id) AS user_id
FROM events
WHERE user_id IS NOT NULL; -- guarantees no NULLs reach assumeNotNull
The Altinity Knowledge Base's assumeNotNull and friends article demonstrates the failure mode: applied to a genuinely NULL enum, assumeNotNull can surface an arbitrary enum member rather than anything meaningful. Their guidance is direct: "If it's possible to have Null values, it's better to use ifNull function instead."
The difference is intent. assumeNotNull is a promise you make to the engine ("trust me, no NULLs here") with no fallback. ifNull/coalesce are defensive - they define exactly what happens when a NULL appears.
ifNull / coalesce vs. assumeNotNull
| Aspect | ifNull / coalesce |
assumeNotNull |
|---|---|---|
| On a real NULL | Returns your defined fallback | Arbitrary/undefined value |
| NULL mask checked? | Yes | No |
| Result nullability | Non-nullable if a non-nullable fallback is given | Always non-nullable |
| Performance | Vectorized, but heavier than a raw read | Near-zero overhead (ignores the mask) |
| Use when | NULLs are possible and you want a safe default | You can prove there are no NULLs |
Performance Notes
Nullableitself has a cost. The separate NULL-mask subcolumn means every read of a nullable column touches extra data, and the docs explicitly warn it "almost always negatively affects performance." Prefer non-nullable columns with sentinel defaults where the semantics allow.assumeNotNullis essentially free, because it skips the mask entirely and just reinterprets the value subcolumn. That is exactly why it is dangerous on real NULLs - the speed comes from not checking.coalesce/ifNullare vectorized but do real work per row; benchmarks on the Altinity KB showcoalescerunning far slower than a raw nullable scan (hundreds of millions vs. tens of billions of rows/second). In tight inner loops this matters - hoist the fallback out, or eliminate theNullableupstream.- NULL functions in
WHEREcan defeat index pruning. Wrapping an indexed column - e.g.WHERE coalesce(country, 'us') = 'us'- blocks predicate pushdown to the MergeTree primary index and may force a full scan. Rewrite asWHERE country = 'us' OR country IS NULL.
Common Issues and Gotchas
- Comparing with
NULLsilently drops rows.x = NULLandx <> 'a'both evaluate toNULLfor NULL rows, so those rows vanish from a filter. UseIS NULL/IS NOT NULLor wrap withcoalesce/ifNullfirst. assumeNotNullon a NULL is a correctness bug, not an error. It does not throw - it returns garbage. The result can also differ from a plain default once intermediate computations are involved. Always gate it behind a proven no-NULL guarantee.Nullableis contagious. Any expression involving aNullableoperand is itselfNullable. End fallback chains with a non-nullable literal (coalesce(a, b, 0)) when you need a non-nullable result downstream.- JOINs introduce NULLs you did not declare.
LEFT/FULLjoins fill unmatched rows with NULLs (or defaults) depending on the join_use_nulls setting. A column that is non-nullable in storage can still arrive NULL after an outer join, so do notassumeNotNulla joined column without checking this. - Aggregate
-Ifcombinators already ignore NULL conditions correctly, but the aggregated column's NULLs still follow standard aggregate rules (e.g.sumskips NULLs). See countIf for how conditional aggregation interacts with nullability.
Best Practices
- Avoid
Nullableunless you genuinely need to distinguish "missing" from "zero/empty." A sentinel default is usually faster and simpler. - Use
ifNullfor one fallback,coalescefor a chain. They are explicit and safe. - Use
nullIfto convert sentinels to NULL - especiallynullIf(denominator, 0)to dodge division errors. - Reserve
assumeNotNullfor proven-no-NULL contexts, ideally immediately after anIS NOT NULLfilter or where the schema guarantees it. When in doubt, useifNullinstead. - Keep NULL functions out of
WHEREclauses on indexed columns. Rewrite to OR-style predicates so the primary index still prunes. - Strip nullability early (via
ifNull/assumeNotNullright after the source/JOIN) so the bulk of your pipeline runs on cheaper non-nullable columns.
How Pulse Helps
NULL-handling bugs rarely announce themselves - they show up as a query that got 10x slower after an ETL change pushed a column's NULL ratio up, or as a result that is subtly wrong because an assumeNotNull met an unexpected NULL. Pulse provides AI-powered monitoring for ClickHouse that correlates query regressions to data-shape changes, including spikes in NULL ratios that defeat predicate pushdown or quietly degrade coalesce-heavy queries. It surfaces the offending query and the data change behind it, so you can fix the root cause instead of guessing.
Frequently Asked Questions
Q: What is the difference between assumeNotNull and ifNull?
ifNull(x, alt) checks the NULL mask and returns your defined alt when x is NULL. assumeNotNull(x) skips the mask entirely - if x is actually NULL it returns an arbitrary value the docs describe as undefined. Use assumeNotNull only when you can guarantee no NULLs; otherwise use ifNull.
Q: What does assumeNotNull return for a NULL value?
Officially, "an arbitrary result." In practice it is often the type's default (0, '', first enum member), but this is not contractual and has varied across versions and after intermediate calculations. Never depend on it - filter NULLs out first or use ifNull/coalesce.
Q: How do I avoid division by zero in ClickHouse?
Wrap the denominator in nullIf: revenue / nullIf(orders, 0). When orders is 0, nullIf returns NULL, so the division yields NULL instead of an error or inf/nan.
Q: Why is my query slow after adding a Nullable column?
Nullable stores a separate NULL-mask subcolumn, so every read touches extra data and several optimizations weaken. The docs warn it "almost always negatively affects performance." Consider a non-nullable column with a sentinel default, or strip nullability early with ifNull/assumeNotNull.
Q: How do I count NULLs in a column efficiently?
Use the .null subcolumn, which reads only the mask: SELECT sum(col.null) FROM table. This is cheaper than countIf(col IS NULL) because it never reads the value subcolumn.
Q: Does coalesce evaluate all of its arguments?
Yes. Unlike some SQL databases, ClickHouse's coalesce evaluates all arguments regardless of whether an earlier one is non-NULL. There is no short-circuit optimization. If you are passing expensive expressions as fallbacks, be aware they will always be computed. This is a known ClickHouse behavior stemming from its vectorized execution model.