ClickHouse UPDATE/DELETE Mutations Performance Impact

ALTER TABLE ... UPDATE and ALTER TABLE ... DELETE are the standard mechanisms for modifying existing data in ClickHouse. Unlike in a traditional RDBMS where a row is updated in place, ClickHouse implements these operations as mutations — background jobs that rewrite entire data parts on disk. A mutation that touches a single row still rewrites every row in the data part that contains it. On large tables with many parts, this results in substantial I/O and CPU load that competes directly with merges and queries.

What This Means

Mutations in ClickHouse are not transactional row updates. They are deferred, destructive part-rewriting operations with the following properties:

  • Only available for MergeTree family engines. Log, Memory, and other engine families do not support ALTER TABLE UPDATE/DELETE.
  • Asynchronous by default. The ALTER TABLE ... UPDATE/DELETE statement returns immediately; the actual rewrite happens in the background. The mutations_sync setting (default: 0) controls this.
  • Parts are rewritten, not rows. Each affected data part is fully read, transformed, and written as a new temporary part. For Wide format parts (each column in a separate file), unaffected columns are hardlinked rather than rewritten — only mutated columns get new files. For Compact format parts (all columns in one file per part), all columns must be rewritten regardless of how many columns the mutation touches.
  • Primary key and sorting key columns cannot be updated. Columns in the ORDER BY expression are immutable via ALTER TABLE UPDATE. Attempting this throws an error.
  • Mutations execute sequentially per table. Multiple queued mutations run one after another, though ClickHouse may merge consecutive pending mutations into a single rewrite pass.
  • Mutations share the background thread pool with merges. While mutations are running, merge throughput drops and part counts can grow, potentially triggering insert throttling.

Common Causes of Performance Problems

  1. Frequent small mutations. Issuing many separate ALTER TABLE UPDATE or ALTER TABLE DELETE statements creates many queued mutations. Even though each may seem small, they serialize and each one triggers its own part-rewrite cycle.

  2. Mutations on wide tables with many parts. The more parts a table has (due to frequent small inserts or slow merges), the more individual rewrite jobs a mutation spawns.

  3. Compact format parts. Tables using Compact format (common for smaller parts) require full part rewrites on mutation even when only one column changes.

  4. Mutations running on replicated tables. In ReplicatedMergeTree, each mutation must complete on every replica, multiplying total I/O. The max_replicated_mutations_in_queue setting (default: 8) limits simultaneous mutation tasks in the replica queue.

  5. Concurrent reads during mutation. A SELECT running while a mutation is in progress may read a mix of mutated and unmutated parts, producing an inconsistent view. Mutations do not block reads.

  6. Background pool exhaustion. The number_of_free_entries_in_pool_to_execute_mutation setting (default: 20) prevents mutation scheduling when the background pool has fewer than this many free threads. If the pool is busy with merges, mutations stall.

How to Fix

1. Combine multiple column updates into a single mutation

Multiple column assignments in one ALTER TABLE UPDATE statement create a single mutation. Separate statements each create their own mutation.

-- Inefficient: two mutations, two rewrite passes
ALTER TABLE orders UPDATE status = 'closed' WHERE created_at < '2023-01-01';
ALTER TABLE orders UPDATE updated_at = now() WHERE created_at < '2023-01-01';

-- Efficient: one mutation, one rewrite pass
ALTER TABLE orders UPDATE status = 'closed', updated_at = now() WHERE created_at < '2023-01-01';

2. Wait for mutation completion when needed

By default, mutations are async. Use mutations_sync when downstream steps depend on the mutation finishing.

-- Wait for the mutation to complete on the current replica
ALTER TABLE orders UPDATE status = 'closed' WHERE id = 42 SETTINGS mutations_sync = 1;

-- Wait for the mutation to complete on all replicas
ALTER TABLE orders UPDATE status = 'closed' WHERE id = 42 SETTINGS mutations_sync = 2;

-- Set for the session
SET mutations_sync = 1;
ALTER TABLE orders UPDATE status = 'closed' WHERE created_at < '2023-01-01';

3. Use lightweight DELETE for row removal (ClickHouse 23.3+)

Lightweight DELETE (DELETE FROM syntax) was GA since ClickHouse 23.3 (it required allow_experimental_lightweight_delete = 1 before 23.3). Instead of rewriting parts immediately, it marks rows deleted via a hidden _row_exists column (UInt8: 1 = visible, 0 = deleted). SELECT queries are automatically rewritten to add PREWHERE _row_exists = 1, so deleted rows are invisible immediately. Physical removal from disk happens lazily during background merges.

-- Lightweight DELETE: marks rows deleted immediately, cleans up during merges
DELETE FROM events WHERE user_id = 12345;

-- Lightweight DELETE targeting a specific partition
DELETE FROM events IN PARTITION '202301' WHERE user_id = 12345;

-- Make lightweight DELETE asynchronous (default is synchronous, lightweight_deletes_sync = 2)
DELETE FROM events WHERE user_id = 12345 SETTINGS lightweight_deletes_sync = 0;

Note: lightweight_deletes_sync defaults to 2 (wait all replicas synchronously), which is the opposite of heavy mutation behavior. Set it to 0 for fire-and-forget deletes.

Lightweight DELETE does not work by default on tables with projections. Use the lightweight_mutation_projection_mode setting (throw / drop / rebuild) to control this.

4. Use TTL for time-based deletion

TTL-based expiration removes data during normal background merges without triggering explicit mutations:

ALTER TABLE events MODIFY TTL event_date + INTERVAL 90 DAY;

5. Redesign for append-only patterns

ClickHouse is optimized for append-only workloads. Several engine variants handle mutable data without explicit mutations:

  • ReplacingMergeTree: deduplicates rows with the same sorting key, keeping the row with the highest version value. "Updates" are inserts with a higher version.
  • CollapsingMergeTree / VersionedCollapsingMergeTree: sign-based row collapsing for streaming update patterns.
CREATE TABLE orders (
    id UInt64,
    version UInt64,
    status String,
    updated_at DateTime
) ENGINE = ReplacingMergeTree(version)
ORDER BY id;

-- "Update" by inserting a new version
INSERT INTO orders VALUES (42, 2, 'closed', now());

6. Run large mutations during low-traffic windows

Use ALTER TABLE ... ON CLUSTER during maintenance windows when query load is minimal:

ALTER TABLE events ON CLUSTER my_cluster DELETE WHERE event_date < '2022-01-01';

7. Force physical cleanup after lightweight deletes

Lightweight deletes do not immediately free disk space. To reclaim storage, trigger a merge:

OPTIMIZE TABLE events FINAL;

Use with caution on large tables — OPTIMIZE TABLE FINAL can be expensive.

8. Use a soft-delete tombstone column instead of deleting

When the requirement is "hide these rows from queries" rather than "free the disk space immediately", a user-defined tombstone column avoids the cost of rewriting parts entirely. The idea is to keep a boolean flag (commonly named is_active or is_deleted) and flip it instead of physically removing rows.

CREATE TABLE test_delete (
    key       UInt32,
    ts        UInt32,
    value_a   String,
    is_active UInt8 DEFAULT 1
) ENGINE = MergeTree ORDER BY key;

-- "Delete" a row: a single-column mutation, far cheaper than a heavy DELETE
ALTER TABLE test_delete UPDATE is_active = 0 WHERE key = 400000 SETTINGS mutations_sync = 2;

-- Application queries filter on the flag
SELECT * FROM test_delete WHERE is_active = 1;

To avoid sprinkling WHERE is_active = 1 through every query, you can enforce visibility centrally with a row policy:

CREATE ROW POLICY pol1 ON test_delete USING is_active = 1 TO all;

Tombstone column vs. lightweight DELETE. Both are "logical" deletes — neither frees disk space until a merge runs — but they differ in important ways:

  • A tombstone column is a single-column mutation (ALTER TABLE ... UPDATE is_active = 0). On Wide-format parts only the one column file is rewritten, so the operation is dramatically cheaper than a heavy ALTER TABLE DELETE, and Altinity benchmarks it at roughly 20x faster (≈0.058s vs ≈1.1s on the same data; the gap is far wider on S3-backed storage).
  • Recovery is trivial with a tombstone column — un-delete is just ALTER TABLE test_delete UPDATE is_active = 1 WHERE key = 400000. With lightweight DELETE (or any real delete) the only way back is restoring from backup.
  • The cost is query-time: every SELECT must filter the flag, and the tombstoned rows still occupy storage, so scans read more data. Row policies that inject the filter add their own overhead. Lightweight DELETE, by contrast, is invisible to queries (its _row_exists filter is applied automatically) and does not require schema or query changes.

Use a tombstone column when auditability, fast/cheap "deletes", and easy undo matter more than disk savings and query simplicity (for example, GDPR-style "deactivate" flows that may later be reversed). Use lightweight DELETE when you want deletions to disappear from queries automatically with no application awareness.

Built-in tombstone for ReplacingMergeTree (is_deleted)

If you already use ReplacingMergeTree for an append/update workload, you can attach the tombstone to its is_deleted parameter (added in ClickHouse 23.2, requires the ver column). A UInt8 column marks the latest version of a key as deleted, and the row is dropped during cleanup merges:

CREATE TABLE example (
    key         Int64,
    some_col    String,
    event_time  DateTime,
    is_deleted  UInt8
) ENGINE = ReplacingMergeTree(event_time, is_deleted)
ORDER BY key;

-- "Delete" by inserting a newer version flagged as deleted
INSERT INTO example VALUES (1, 'x', now(), 1);

Physically purging the flagged rows requires a cleanup merge: OPTIMIZE TABLE example FINAL CLEANUP (gated behind allow_experimental_replacing_merge_with_cleanup). Until then, query with FINAL so the deleted version wins. See the ClickHouse CREATE TABLE guide for engine selection and the ALTER TABLE guide for adding a tombstone column to an existing table.

Root-Cause Analysis

Check all pending mutations

SELECT
    database,
    table,
    mutation_id,
    command,
    create_time,
    parts_to_do,
    is_done,
    latest_fail_reason
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time;

parts_to_do shows how many data parts still need processing. Note: for replicated tables, parts_to_do = 0 with is_done = 0 can occur when a long-running INSERT is creating a new part that the mutation still needs to process.

Check mutations for a specific table

SELECT
    mutation_id,
    command,
    create_time,
    parts_to_do,
    parts_to_do_names,
    latest_fail_reason
FROM system.mutations
WHERE database = 'mydb' AND table = 'orders' AND is_done = 0;

Check mutations across all cluster replicas

SELECT
    hostName() AS host,
    database,
    table,
    mutation_id,
    parts_to_do,
    is_done,
    latest_fail_reason
FROM clusterAllReplicas('my_cluster', system.mutations)
WHERE is_done = 0;

Find stuck or failed mutations

SELECT
    database,
    table,
    mutation_id,
    command,
    latest_fail_reason,
    latest_fail_time
FROM system.mutations
WHERE is_done = 0 AND latest_fail_reason != '';

Check current mutation-related settings

-- Session-level settings
SELECT name, value, default
FROM system.settings
WHERE name LIKE '%mutation%' OR name LIKE '%lightweight_delete%';

-- MergeTree-level settings
SELECT name, value, default
FROM system.merge_tree_settings
WHERE name LIKE '%mutation%';

Detect lightweight deletes in specific parts

SELECT part_name, column_bytes_on_disk
FROM system.parts_columns
WHERE table = 'events' AND column = '_row_exists';

Parts without any _row_exists entry have no lightweight deletes applied.

Kill a Mutation

KILL MUTATION stops further processing of a mutation. Parts already rewritten keep their changes. Parts not yet processed are skipped. This is not a rollback — the operation is irreversible for completed parts.

-- Kill all pending mutations for a table
KILL MUTATION WHERE database = 'mydb' AND table = 'orders';

-- Kill a specific mutation by ID
KILL MUTATION WHERE database = 'mydb' AND table = 'orders' AND mutation_id = 'mutation_3.txt';

-- Preview without executing
KILL MUTATION WHERE database = 'mydb' AND table = 'orders' TEST;

The mutation_id format differs: replicated tables use a znode name (e.g., 0000000003); non-replicated tables use a filename (e.g., mutation_3.txt).

On ClickHouse Cloud, killed mutations show is_killed = 1 in system.mutations. This column is not present on self-hosted deployments.

Preventive Measures

  • Design schemas around append-only inserts using ReplacingMergeTree, CollapsingMergeTree, or VersionedCollapsingMergeTree to minimize the need for explicit mutations.
  • Use TTL expressions for data expiration rather than periodic DELETE mutations.
  • Never issue mutations in application hot paths — mutations are not suitable as row-level update mechanisms.
  • Batch mutations: one mutation affecting many rows is far cheaper than many mutations each affecting a few rows.
  • Monitor system.mutations continuously and alert when parts_to_do is large or latest_fail_reason is non-empty.
  • Tune number_of_free_entries_in_pool_to_execute_mutation and max_replicated_mutations_in_queue if mutations are stalling due to pool saturation.
  • On ClickHouse 25.1+, consider apply_mutations_on_fly = 1 for workloads that issue mutations and then immediately query the affected data. This applies pending mutations in memory during SELECT execution before background materialization completes (must be explicitly enabled; not on by default).

Resolve Mutation Problems Automatically with Pulse

Pulse monitors system.mutations continuously across all nodes and replicas, alerting you when mutations are stuck, failing, or accumulating. Pulse surfaces mutation backlog, failed-part errors, and pool saturation in a single dashboard, so you can act before a mutation queue turns into a prolonged performance incident. Learn more at pulse.support.

Frequently Asked Questions

Q: Does ALTER TABLE UPDATE modify rows in place like PostgreSQL?
A: No. ClickHouse rewrites entire data parts. Even updating one row causes the entire part containing that row to be rewritten. This is fundamentally different from row-level update mechanisms in traditional RDBMS.

Q: What is the default behavior — does ALTER TABLE UPDATE block?
A: No. mutations_sync defaults to 0, meaning the statement returns immediately and the mutation runs in the background. Set mutations_sync = 1 or 2 to wait for completion.

Q: Is lightweight DELETE (DELETE FROM) the same as ALTER TABLE DELETE?
A: No. Lightweight DELETE marks rows with a hidden _row_exists column and provides immediate visibility of deletions, but does not rewrite parts immediately. ALTER TABLE DELETE is a heavy mutation that rewrites parts right away. Lightweight DELETE is generally preferred for targeted row removal on ClickHouse 23.3+.

Q: Does KILL MUTATION undo changes already applied?
A: No. KILL MUTATION only cancels processing for parts not yet mutated. Parts that have already been rewritten keep the mutation applied. There is no rollback. If you need to revert, you must restore from backup.

Q: Can I update columns that are part of the primary key?
A: No. Columns in the ORDER BY (sorting key) expression cannot be updated via ALTER TABLE UPDATE. This is a hard constraint and throws an error. If you need to change primary key values, you must delete and reinsert the rows.

Q: When should I use a soft-delete tombstone column instead of lightweight DELETE?
A: Use a tombstone column (e.g. is_active UInt8) when you need cheap, reversible "deletes" and don't care about reclaiming disk immediately — flipping a single-column flag with ALTER TABLE ... UPDATE is far cheaper than a heavy delete and is trivially undone by setting the flag back. Use lightweight DELETE when you want deleted rows to vanish from queries automatically with no application-side filtering and no schema changes. Neither frees disk until a merge runs. For ReplacingMergeTree workloads, the built-in is_deleted parameter (ClickHouse 23.2+) gives you a tombstone that is purged during OPTIMIZE TABLE ... FINAL CLEANUP.

Q: Why does parts_to_do = 0 not always mean the mutation is done?
A: For replicated tables, a mutation can show parts_to_do = 0 with is_done = 0 when a long-running INSERT is creating a new part that the mutation still needs to process. Wait until is_done = 1 before concluding the mutation has fully completed.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.