ClickHouse FINAL Clause: Performance Trade-offs and Optimization Strategies

The FINAL modifier tells ClickHouse to fully merge data at query time so you read a deduplicated, fully-collapsed view of a MergeTree-family table. It is the correct answer for "give me the latest version of each row" on a ReplacingMergeTree, but it shifts work that normally happens in background merges into the hot path of every query — which is why it has a reputation for being slow.

This guide explains why FINAL is expensive, the settings and engine layout that make it cheap, and the alternatives (GROUP BY with argMax, LIMIT 1 BY) that often beat it. For the related error and timeout cases, see Illegal FINAL and OPTIMIZE FINAL timeouts.

What FINAL Actually Does

FINAL applies to the MergeTree engines that collapse or merge rows: ReplacingMergeTree, CollapsingMergeTree, VersionedCollapsingMergeTree, SummingMergeTree, and AggregatingMergeTree. With FINAL, ClickHouse runs the engine's merge logic across all active parts at read time, before returning rows. This guarantees correctness regardless of whether background merges have caught up.

The cost has three components:

  1. Query-time merging. Rows with the same sorting key must be gathered across every part and reconciled. This is CPU-bound merge work that would otherwise be amortized across background merges.
  2. Extra column reads. Queries with FINAL may read the primary/sorting key columns even when you did not request them, because those columns are needed to identify duplicates. Wide or heavy ORDER BY columns directly inflate this cost.
  3. Reduced parallelism / pipeline overhead. Merging is inherently more serial than a plain scan, though modern versions parallelize it substantially.

FINAL is not the same as OPTIMIZE TABLE ... FINAL. The former is a read-time modifier that changes one query's results; the latter is a write-time command that rewrites parts on disk and should be used sparingly — see OPTIMIZE FINAL timeouts and the section below.

How FINAL Performance Has Improved

FINAL is far faster on modern ClickHouse than its old reputation suggests. The key milestones, per the Altinity Knowledge Base:

Version Improvement
Pre-20.5 Single-threaded deduplication — genuinely slow
20.5+ Parallel execution of FINAL becomes possible
20.10+ do_not_merge_across_partitions_select_final setting added
22.6 / 22.8 Better parallelization; reads less data
23.5+ Lower memory usage
24.1+ Vertical algorithm (enable_vertical_final) introduced as opt-in — non-key columns from distinct parts are read in parallel and duplicates are marked rather than eagerly merged
24.6+ enable_vertical_final enabled by default
25.6+ Skip indexes can be used under FINAL (use_skip_indexes_if_final)

If you are on a recent release (24.x/25.x), assume FINAL is much cheaper than older blog posts claim — but still not free.

The Big Lever: Partitioning + Pre-Optimization

The single most effective optimization is to ensure ClickHouse never has to merge across partitions at query time.

The do_not_merge_across_partitions_select_final setting (default 0) merges parts only within the same partition, and skips merging a partition entirely when it already contains a single part with level > 0 (i.e. it has been optimized). Pattern:

-- 1. Partition so that "closed" data lives in its own partitions
CREATE TABLE events
(
    event_date Date,
    user_id    UInt64,
    payload    String,
    version    UInt64
)
ENGINE = ReplacingMergeTree(version)
PARTITION BY toYYYYMMDD(event_date)
ORDER BY (user_id);

-- 2. Pre-optimize completed (older, immutable) partitions
OPTIMIZE TABLE events PARTITION '20260601' FINAL;

-- 3. Query with the setting enabled
SELECT *
FROM events FINAL
WHERE event_date >= '2026-06-01'
SETTINGS do_not_merge_across_partitions_select_final = 1;

Once a partition is collapsed to a single part, FINAL over it does almost no work — the cost approaches that of a plain SELECT. Combine this with aggressive merge tuning so background merges keep recent partitions compact.

A caveat for newer releases: from 26.2 ClickHouse can make this decision automatically via enable_automatic_decision_for_merging_across_partitions_for_final when the partition key is part of the primary key, so check your version's defaults before setting the flag manually.

Tuning FINAL Directly

When you cannot avoid query-time merging, a few settings and design choices matter:

  • max_final_threads — caps the threads used for FINAL deduplication. On modern versions it defaults to using available cores; raising it does nothing once you are core-bound, but lowering it is useful to limit FINAL's impact on a busy cluster.
  • Keep ORDER BY columns small. Because the sorting key is always read during deduplication, lighter key columns (and fewer of them) directly speed up FINAL. A String business key is far more expensive here than a compact integer.
  • Filter on primary key columns. FINAL is most acceptable when the query also filters on the sorting key, so ClickHouse can prune granules before merging.
  • Select fewer columns. Every extra wide column read participates in the FINAL pipeline.
-- Narrow, PK-filtered FINAL is the cheap case
SELECT user_id, status
FROM events FINAL
WHERE user_id BETWEEN 1000 AND 2000;

Alternative 1: GROUP BY with argMax

If your query already needs a GROUP BY, fold deduplication into it with argMax instead of paying for FINAL. argMax(value, version) returns the value from the row with the highest version — exactly the ReplacingMergeTree semantics — in a single aggregation pass, without sorting across parts.

-- Instead of: SELECT user_id, status FROM events FINAL
SELECT
    user_id,
    argMax(status, version)  AS status,
    argMax(payload, version) AS payload
FROM events
GROUP BY user_id;

This is frequently faster than FINAL for analytical, already-aggregating queries because it avoids the merge/sort machinery. The trade-offs:

FINAL GROUP BY + argMax
Query change Add one keyword Rewrite as aggregation; one argMax per column
Best when You want raw rows, no aggregation needed You already need a GROUP BY
Wide tables Reads all selected + key columns One argMax per column gets verbose and memory-heavy
Deleted rows Honors is_deleted with ReplacingMergeTree(ver, is_deleted) You must filter deletes manually
Correctness Always matches engine semantics You implement the semantics yourself

See argMax for signature details and edge cases.

Alternative 2: LIMIT 1 BY

When you control the query and want raw rows (not aggregates), LIMIT 1 BY keeps one row per key. Combined with an explicit ORDER BY on the version column, it reproduces ReplacingMergeTree dedup and can parallelize better than FINAL:

SELECT *
FROM events
ORDER BY user_id, version DESC
LIMIT 1 BY user_id;

This requires a full sort, so it is best on filtered subsets. Like argMax, it does not automatically handle the is_deleted flag — you filter that yourself.

Choosing an Approach

Approach Use when Watch out for
FINAL You want correct, latest rows with minimal query rewrite; partitions are pre-optimized Cost scales with un-merged parts and key width
GROUP BY + argMax The query is already aggregating Verbose and memory-heavy on wide rows; you reimplement deletes
LIMIT 1 BY You want raw latest rows on a filtered set Full sort; manual delete handling
Accept eventual consistency A few stale duplicates between merges are tolerable Counts/sums will be slightly off until merges run

Avoid OPTIMIZE FINAL as a Performance Fix

A common anti-pattern is running OPTIMIZE TABLE ... FINAL to "make FINAL queries fast." Per the official guidance, OPTIMIZE ... FINAL decompresses, merges, recompresses, and rewrites every active part, ignores the ~150 GB max-merge safety limit, and can create parts so large they never merge again. Scheduling it cluster-wide leads to the timeouts covered in OPTIMIZE FINAL timeouts.

The supported pattern is targeted: OPTIMIZE TABLE t PARTITION <id> FINAL on closed partitions only, paired with do_not_merge_across_partitions_select_final = 1. Let background merges handle hot data.

Common Issues

  • FINAL is slow even on a small result set. You are likely merging across many un-merged parts or many partitions. Check system.parts for active part count, reduce parts via merge tuning, and enable do_not_merge_across_partitions_select_final with pre-optimized partitions.
  • Duplicates still appear without FINAL. Expected — ReplacingMergeTree only deduplicates on merge, which is asynchronous and never guaranteed to fully complete. See the ReplacingMergeTree duplicate gotcha.
  • Illegal FINAL error. FINAL was applied to an engine or context that does not support it; see Illegal FINAL.
  • High memory during FINAL. Wide ORDER BY keys and many parts inflate memory. Narrow the key, filter on the PK, or switch heavy aggregating queries to argMax.

Best Practices

  1. Measure before assuming. On 24.x/25.x, benchmark FINAL against argMax for your actual query — the winner depends on width, cardinality, and part count.
  2. Partition for cheap FINAL. Time-partition, pre-optimize closed partitions, and enable do_not_merge_across_partitions_select_final.
  3. Keep sorting keys lean. Small, integer keys make every FINAL query cheaper.
  4. Use argMax when you are already grouping, and LIMIT 1 BY when you want raw latest rows on a filtered set.
  5. Don't fight async merges with OPTIMIZE ... FINAL on whole tables — target closed partitions only.
  6. Verify with EXPLAIN/system.query_log to confirm part pruning and to compare data read across approaches.

How Pulse Helps

Diagnosing whether a slow query is paying for FINAL, lagging background merges, an oversized sorting key, or an avoidable cross-partition merge usually requires correlating system.query_log, system.parts, and merge metrics over time. Pulse monitors ClickHouse part counts, merge backlog, and query patterns, and flags when FINAL-heavy workloads or un-optimized partitions are driving latency — so you can decide between tuning FINAL, restructuring partitions, or moving to an argMax query with evidence rather than guesswork.

Frequently Asked Questions

Q: Is SELECT FINAL always slow?

No. On modern ClickHouse (24.x+) with the vertical algorithm and parallel execution, FINAL over pre-optimized partitions can approach the cost of a plain SELECT. It is slow mainly when there are many un-merged parts, wide sorting keys, or cross-partition merging.

Q: What is the difference between FINAL and OPTIMIZE FINAL?

FINAL is a read-time query modifier that produces deduplicated results for one query without changing data on disk. OPTIMIZE TABLE ... FINAL is a write command that physically rewrites parts and is resource-intensive; avoid running it on entire tables.

Q: Is argMax faster than FINAL?

Often, when your query already needs a GROUP BYargMax does deduplication and aggregation in one pass without merge-style sorting. For wide tables where you want raw rows, FINAL or LIMIT 1 BY is usually simpler and can be faster. Benchmark both.

Q: How do I make FINAL nearly free?

Partition the table (commonly by time), run OPTIMIZE TABLE t PARTITION <id> FINAL on closed partitions, and query with SETTINGS do_not_merge_across_partitions_select_final = 1. Single-part partitions require almost no query-time merging.

Q: Does max_final_threads make FINAL faster?

It controls how many threads FINAL uses. On recent versions it already defaults to using available cores, so raising it rarely helps once you are core-bound; lowering it is useful to cap FINAL's impact on a shared cluster.

Q: Does FINAL work on ClickHouse Cloud / SharedMergeTree?

Yes. FINAL and the related settings behave the same on SharedMergeTree as on standard MergeTree-family engines.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.