MergeTree Part Naming and Multiversion Concurrency Control (MVCC)

Every piece of data in a ClickHouse MergeTree table lives in an immutable, on-disk part, and each part has a structured name that encodes its entire history. Reading those names — and understanding why parts are immutable — is the key to reasoning about how ClickHouse stays consistent while inserts, merges, and mutations all happen at once.

This guide breaks down the MergeTree part name format component by component, then explains how immutable parts implement multiversion concurrency control (MVCC) and snapshot isolation. For background on the storage engine itself, see the MergeTree overview and how ClickHouse indexes work.

The Part Name Format

A MergeTree part directory name follows this format:

<partition_id>_<min_block_number>_<max_block_number>_<level>_<data_version>

For example, a part named 202203_1_5_1_7 decodes as:

Field Value Meaning
partition_id 202203 The partition this part belongs to (here, March 2022 from toYYYYMM(date))
min_block_number 1 Lowest block number covered by this part
max_block_number 5 Highest block number covered by this part
level 1 Merge depth — how many rounds of merging produced this part
data_version 7 Mutation version applied to this part

The fifth field, data_version, is only present once a part has been involved in a mutation; freshly inserted, never-mutated parts often appear with just four fields (e.g., 202203_1_1_0). All of these values are exposed as columns in `system.parts`.

Partition ID

The partition ID is derived from the table's PARTITION BY expression. With PARTITION BY toYYYYMM(date), a row from March 2022 lands in partition 202203. With no PARTITION BY clause, all data goes into a single partition named all.

Block Numbers

ClickHouse assigns each INSERT (more precisely, each inserted block of rows) a monotonically increasing block number. For a non-replicated table these counters are local to the server (a single global atomic counter across all partitions); for a ReplicatedMergeTree table they are coordinated through ClickHouse Keeper/ZooKeeper on a per-partition basis — each partition maintains its own counter under /block_numbers/<partition_id>/ in Keeper — so all replicas agree on the sequence within each partition.

A brand-new part from a single insert covers exactly one block, so its min_block_number and max_block_number are equal — for example 202203_1_1_0. When parts are merged, the new part spans the full range of block numbers from its inputs. Merging 202203_1_1_0, 202203_2_2_0, and 202203_3_3_0 produces a part covering blocks 1 through 3: 202203_1_3_1.

Merge Level

level is the merge depth. A part created directly by an INSERT has level 0. Each time ClickHouse merges parts, the result takes a level one higher than the maximum level of its inputs. Level is a useful signal: a high level means a part has already been merged many times and is unlikely to be merged again soon.

Data Version

data_version tracks mutations. When you run an ALTER TABLE ... UPDATE or ALTER TABLE ... DELETE, ClickHouse assigns the mutation a block number and rewrites affected parts with that number as their new data_version. The rule ClickHouse follows is simple: a mutation with version N must be applied to any part whose data_version is lower than N. This is how ClickHouse knows which parts still need a pending mutation and which have already absorbed it. For the cost implications, see mutation performance impact.

Why Parts Are Immutable

The central design choice behind all of this is that parts are never modified in place. ClickHouse does not update, delete, or rewrite data inside an existing part. Instead:

  • An INSERT writes a brand-new part.
  • A merge reads several existing parts and writes one new, larger part.
  • A mutation reads affected parts and writes new parts with the changes applied.

In every case the original parts are left untouched on disk and simply marked inactive, while the new part becomes active. The active column in system.parts is the flag that distinguishes the two: active parts make up the current logical state of the table; inactive parts are old versions waiting to be cleaned up.

-- See active vs. inactive parts for a table
SELECT
    name,
    active,
    level,
    data_version,
    rows,
    formatReadableSize(bytes_on_disk) AS size
FROM system.parts
WHERE table = 'events'
ORDER BY partition_id, min_block_number;

Immutability is what makes ClickHouse's write path fast (append-only, no in-place locking) and what makes concurrency control possible without row-level locks.

How Parts Implement MVCC

Because parts are immutable and versioned by their block-number ranges, ClickHouse gets multiversion concurrency control almost for free.

Snapshot Isolation for Queries

When a query begins, ClickHouse takes a snapshot of all active parts at that moment. The query reads only those parts for its entire duration. Any new parts created afterward — by a concurrent INSERT, a background merge, or a mutation — are simply not part of the snapshot and do not affect the running query. This gives readers a consistent view without blocking writers.

To make sure the parts in a snapshot are not deleted out from under a running query, ClickHouse reference-counts them: the refcount column in system.parts increases while a part is in use by a query or a merge. A part is only physically removed once its reference count drops and it is no longer active.

-- Parts currently held by queries or merges have refcount > 2
SELECT name, active, refcount, remove_time
FROM system.parts
WHERE table = 'events' AND refcount > 1;

Multiple Versions Coexisting During a Merge

During a background merge, both the old (input) parts and the new (output) part exist on disk at the same time:

  1. The merge reads the input parts and writes a new part covering their combined block range.
  2. Once the new part is fully written and committed, it is marked active.
  3. The input parts are atomically marked inactive in the same step.
  4. Queries that started before the swap keep reading the old parts (held by refcount); queries that start after read the new part.

Because the new part's block range fully covers the ranges of the parts it replaced, ClickHouse can always tell which inactive parts are superseded. The remove_time column records when a part became inactive; a cleanup thread later deletes inactive parts from disk once they are unreferenced (default retention is governed by old_parts_lifetime, 8 minutes by default).

Mutations and Versioning

Mutations follow the same pattern but advance data_version. A DELETE or UPDATE does not touch rows in place — it schedules a rewrite of every part whose data_version is below the mutation's version, producing new parts with the higher version. Until the rewrite finishes, the old parts remain active and serve the previous state, so queries never see a half-applied mutation. You can watch progress in system.mutations.

-- Track in-flight mutations and which parts still need rewriting
SELECT
    mutation_id,
    command,
    parts_to_do,
    is_done,
    latest_fail_reason
FROM system.mutations
WHERE table = 'events' AND is_done = 0;

Common Issues

"Too many parts" and high part counts

Each INSERT creates at least one part. Inserting many small batches produces many small level-0 parts, which the merge scheduler must constantly combine. If inserts outpace merges you hit the too_many_parts threshold. The fix is to batch inserts into larger blocks (thousands to hundreds of thousands of rows per insert) rather than streaming row-by-row.

Inactive parts not disappearing

Seeing inactive parts in system.parts is normal — they are retained for old_parts_lifetime (8 minutes) and only removed once unreferenced. If inactive parts pile up far beyond that, check for long-running queries holding references (refcount) or stuck merges in system.merges.

Detached parts

A part with a corrupt or unexpected name, or one that fails a consistency check, may be moved to a detached/ directory instead of being loaded. These are excluded from queries entirely. See cleaning up detached parts for how to inspect and reattach or drop them. Errors like Bad data part name usually mean a directory name does not match the expected partition_min_max_level_version format.

Stuck or slow mutations

A mutation rewrites whole parts, so it is far heavier than an OLTP update. If parts_to_do in system.mutations is not decreasing, the mutation may be blocked behind merges or failing repeatedly (latest_fail_reason). For deduplication-style updates, consider ReplacingMergeTree instead of frequent mutations.

Best Practices

  1. Read part names to diagnose, don't parse them in code. Use system.parts columns (min_block_number, max_block_number, level, data_version, active) rather than string-splitting the name.
  2. Insert in large batches. Fewer, larger parts mean less merge pressure and lower part counts.
  3. Treat mutations as bulk operations. They rewrite entire parts; avoid them in hot paths. Use specialized engines (Replacing/Collapsing/VersionedCollapsing) for upsert and dedup patterns.
  4. Monitor active vs. inactive parts and merge activity via system.parts, system.merges, and `system.part_log` to catch merge backlogs early.
  5. Don't manually delete part directories. Let ClickHouse manage immutability and cleanup; manual deletion corrupts the active-part set.

How Pulse Helps

Decoding part lifecycles by hand across many tables and replicas is tedious, and the symptoms — climbing part counts, lagging merges, stuck mutations, ballooning inactive parts — often surface only once they start hurting query latency. Pulse continuously monitors ClickHouse part and merge activity, surfaces tables with unhealthy part counts or stalled mutations, and ties those signals back to the inserts and queries that caused them, so you can act before a merge backlog turns into a "too many parts" outage. Learn more at pulse.support.

Frequently Asked Questions

Q: What does a part name like 202203_1_5_1 mean?

It is partition_id=202203, min_block_number=1, max_block_number=5, level=1. This part covers blocks 1 through 5 and was produced by one round of merging (level 1). With no fifth field, it has never been mutated.

Q: When does the fifth field (data_version) appear in a part name?

After a mutation. A freshly inserted or merged part that has never been mutated typically shows four fields. Once an ALTER TABLE ... UPDATE/DELETE rewrites the part, the mutation's block number becomes the part's data_version, shown as the fifth segment.

Q: Why do I see both active and inactive copies of the same data?

ClickHouse never modifies parts in place. After a merge or mutation, the new part is marked active and the originals are marked inactive (active = 0). Inactive parts are kept briefly (default old_parts_lifetime = 8 minutes) and removed once no query or merge references them.

Q: How does ClickHouse keep queries consistent while merges run?

Each query operates on a snapshot of the active parts taken when it starts, and reference-counts those parts so they cannot be deleted mid-query. New parts from concurrent inserts, merges, or mutations are not in the snapshot, giving snapshot isolation without locking writers.

Q: Are block numbers global or per-partition?

It depends on the engine. For plain MergeTree, the counter is a single global atomic increment local to the server — block numbers are unique across all partitions. For ReplicatedMergeTree, the counter is maintained per partition in Keeper/ZooKeeper (under /block_numbers/<partition_id>/), so each partition has its own independent sequence and all replicas agree on that per-partition sequence. In both cases, merged parts span the full min–max range of the block numbers they absorbed.

Q: How do I find parts that still have a mutation pending?

Compare each part's data_version against pending mutation versions, or simply query system.mutations where is_done = 0 and inspect parts_to_do. Parts with a data_version lower than a pending mutation's version are the ones still awaiting rewrite.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.