ClickHouse ships with defaults that prioritize raw speed and the original column-store semantics. A few of those defaults regularly catch users off guard once they hit production workloads: queries return zeros where they expect NULLs, TTL expiration becomes the dominant source of I/O, joins behave subtly differently than ANSI SQL. This article covers three settings worth flipping in most setups, what they actually do, and when to leave them alone.
ttl_only_drop_parts
Default: 0 (off)
Recommended for most production: 1 (on)
Scope: MergeTree table-level setting.
Without this setting, when TTL fires, ClickHouse rewrites affected parts to remove expired rows row-by-row. That is fine for small tables but expensive at scale: a part with 100 million rows, of which 1 million expired, gets entirely rewritten just to drop the expired fraction.
With ttl_only_drop_parts = 1, ClickHouse only drops a part when every row in it has expired. Parts with a mix of expired and live rows are left alone until they are fully expired or get rewritten by an unrelated merge.
The implications:
- Drastically lower TTL I/O on tables partitioned in time-aligned units (daily partitions, monthly partitions). Whole parts age out and get dropped, no rewriting required.
- You can lower
merge_with_ttl_timeoutbecause the TTL pass is now cheap. - Expired rows linger longer if a part has a mix of TTL-expired and not-yet-expired data. If you depend on TTL for hard data deletion (GDPR, compliance), test the eventual deletion window carefully.
Enable on the table:
ALTER TABLE analytics.events
MODIFY SETTING ttl_only_drop_parts = 1;
Or set on creation:
CREATE TABLE analytics.events (...) ENGINE = MergeTree
ORDER BY (...)
TTL event_time + INTERVAL 90 DAY
SETTINGS ttl_only_drop_parts = 1;
This setting works best when the table's partition key aligns with the TTL period. A daily partition on a 90-day TTL means every day a whole partition becomes fully expired and gets dropped efficiently. A monolithic table with no time-based partitioning still benefits, but less.
join_use_nulls
Default: 0
Recommended in most cases: 1
Scope: Session/profile setting.
ClickHouse's default JOIN behavior fills missing right-side columns with the default value for their type, not NULL. So an unmatched row from a LEFT JOIN returns 0 for Int64, '' for String, 1970-01-01 for Date, and so on.
This is a deliberate optimization (NULL handling adds overhead and storage), but it routinely surprises analysts coming from PostgreSQL or MySQL. A LEFT JOIN against a missing key looks like a real zero match, hiding bugs.
With join_use_nulls = 1, ClickHouse promotes joined columns to Nullable and fills unmatched rows with proper NULLs:
SET join_use_nulls = 1;
SELECT
a.user_id,
b.country
FROM users a
LEFT JOIN user_country b ON a.user_id = b.user_id
WHERE a.user_id = 999;
-- With default: country = '' for missing match
-- With join_use_nulls=1: country = NULL for missing match
This is closer to ANSI SQL and what most BI tools expect. The cost is a small amount of additional storage and CPU due to Nullable handling, usually negligible for analytical workloads.
Set globally in the user's profile or at session level. Some teams set it cluster-wide to standardize behavior across all queries.
Caveat: if existing dashboards depend on the default-value semantics (filtering WHERE country != '' to mean "joined successfully"), turning this on changes their results. Audit before flipping at scale.
aggregate_functions_null_for_empty
Default: 0
Recommended for ANSI compatibility: 1
Scope: Session/profile setting.
By default, aggregate functions over empty result sets return type-specific defaults rather than NULL:
SELECT sum(x), avg(x) FROM (SELECT 1 x WHERE 0);
-- Default: 0, nan
sum returns 0, avg returns NaN, min/max return type defaults. ANSI SQL says these should all return NULL.
With the setting flipped:
SET aggregate_functions_null_for_empty = 1;
SELECT sum(x), avg(x) FROM (SELECT 1 x WHERE 0);
-- Returns: NULL, NULL
Why it matters: dashboards and downstream systems frequently special-case NULL ("no data") differently from 0 ("zero result"). A nightly metric that says "0 sales today" when the data simply has not arrived yet is much worse than "NULL, please retry."
ClickHouse leaves this off by default because Nullable aggregate states are slightly more expensive. The performance difference is small in practice; enable it unless you have profiled a hot path that depends on the default behavior.
How to Roll These Out
For session-scoped settings (join_use_nulls, aggregate_functions_null_for_empty), put them in a profile rather than relying on individual queries:
<!-- /etc/clickhouse-server/users.d/sane_defaults.xml -->
<clickhouse>
<profiles>
<default>
<join_use_nulls>1</join_use_nulls>
<aggregate_functions_null_for_empty>1</aggregate_functions_null_for_empty>
</default>
</profiles>
</clickhouse>
For table-scoped (ttl_only_drop_parts), apply per-table via ALTER TABLE ... MODIFY SETTING or set in CREATE TABLE for new tables.
Roll out incrementally:
- Set on a staging cluster, replay representative queries, compare results to current production.
- Identify dashboards or pipelines that change behavior. Decide whether the new behavior is correct or whether to special-case them.
- Flip on production during a low-traffic window. None of these settings change on-disk data, so revert is fast.
Other Settings Worth Considering
Beyond the three above, a few more defaults are commonly retuned:
max_bytes_before_external_group_byandmax_bytes_before_external_sort: enable spilling to disk for very large group-bys and sorts. Default0(disabled) can lead to OOM on edge cases.optimize_aggregation_in_orderandoptimize_read_in_order: usually safe to enable; can dramatically speed up queries whoseORDER BYmatches the table sort key.allow_experimental_*: leave off in production unless you have a specific reason.do_not_merge_across_partitions_select_final: set to1to makeSELECT ... FINALcheaper at the cost of slightly different semantics. Useful for ReplacingMergeTree heavy workloads.
Test changes on staging. Settings interact, and what speeds up one query class can slow down another.
Common Pitfalls
- Flipping settings cluster-wide without auditing dashboards.
join_use_nullsin particular changes existing query results. Run a diff before committing. - Forgetting that table-level settings need re-application after
RESTORE. Backup/restore can drop non-default settings depending on the format. Re-check after major operational events. - Assuming
ttl_only_drop_partsdeletes data faster. It does not. It defers deletion until the whole part qualifies. For hard deletion deadlines, monitor actual data age, not policy age. - Combining
join_use_nulls = 1with implicit type coercion. Operators against Nullable types can yield Nullable results in unexpected places. Be explicit withcoalescewhere downstream consumers do not handle NULL.
How Pulse Helps Tune ClickHouse Defaults
ClickHouse settings tuning is part performance engineering and part archeology: defaults drift across versions, profiles get layered, and what's actually applied to a session is often different from what's in the config. Pulse compares actual session settings to known-good profiles, flags clusters running with defaults that conflict with the workload (heavy joins without join_use_nulls, TTL-heavy tables without ttl_only_drop_parts), and surfaces drift across replicas. Connect your ClickHouse cluster to Pulse and turn settings hygiene into a checklist.
Frequently Asked Questions
Q: Where do I see the current value of a setting?
SELECT name, value, changed, description
FROM system.settings
WHERE name IN ('join_use_nulls', 'ttl_only_drop_parts', 'aggregate_functions_null_for_empty');
For table-level settings, query system.merge_tree_settings or look at SHOW CREATE TABLE.
Q: Can I set ttl_only_drop_parts for some tables and not others?
Yes. It is a table-level MergeTree setting, applied via CREATE TABLE ... SETTINGS or ALTER TABLE ... MODIFY SETTING. Apply per table based on partition layout and TTL urgency.
Q: Will enabling join_use_nulls break existing materialized views?
If the MV uses joins and the surrounding logic depends on default-value semantics, possibly. Most MVs use simple aggregations and are unaffected. Audit your MV definitions before flipping cluster-wide.
Q: Is aggregate_functions_null_for_empty safe for materialized views?
It changes the type of aggregate function results to Nullable. If a downstream table has non-Nullable columns receiving those values, you need to wrap them in coalesce() or change the column types. Test before deploying.
Q: Are there settings I should explicitly leave at default?
The MergeTree merging settings (max_bytes_to_merge_at_max_space_in_pool, parts_to_throw_insert, etc.) are well-tuned for most workloads. Change them only with a specific problem in mind and a benchmark on representative data.
Q: How do I roll back a setting if it causes problems?
For profile settings, remove from the users.d file and reload (or restart). For table settings, ALTER TABLE ... RESET SETTING name. None of these settings change on-disk data, so rollback is metadata-only.