ClickHouse: Source Parts Size Greater Than Current Maximum

The message source parts size (X) is greater than the current maximum (Y) appears in the ClickHouse log or in the postpone_reason column of system.replication_queue when the merge scheduler declines to start a merge because the combined size of its source parts exceeds a dynamically computed ceiling. It is not an error in the strict sense but a back-pressure mechanism: the merge is postponed, not failed, and it runs once enough pool entries are free for the ceiling to rise above the merge's size. Understanding the formula behind the dynamic ceiling tells you whether to wait, raise a limit, or fix an upstream problem.

What Triggers the Message

ClickHouse applies two limits to the total size of the source parts of a merge:

max_bytes_to_merge_at_max_space_in_pool: the absolute ceiling, which applies when the merge pool is mostly idle (default 150 GiB).
A dynamic ceiling that shrinks as the pool fills up. It takes effect once the number of free pool entries drops below number_of_free_entries_in_pool_to_lower_max_size_of_merge (default 8).

When the pool has fewer free entries than the threshold, ClickHouse progressively lowers the maximum mergeable size. The intent is that when many merges are already running, the remaining slots go to small, quick merges rather than being tied up by a few very large ones.

If a candidate merge has source parts whose total size exceeds the current (possibly lowered) ceiling, the scheduler logs:

source parts size (X) is greater than the current maximum (Y)

The merge stays in system.replication_queue (for Replicated tables) or is simply skipped this cycle (for non-replicated tables).

How the Dynamic Ceiling Works

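Roughly, the effective maximum mergeable size is an exponential interpolation between a floor, max_bytes_to_merge_at_min_space_in_pool (default 1 MiB), and the absolute ceiling:

if free_pool_entries >= number_of_free_entries_in_pool_to_lower_max_size_of_merge:
    effective_max = max_bytes_to_merge_at_max_space_in_pool
else:
    ratio = free_pool_entries / number_of_free_entries_in_pool_to_lower_max_size_of_merge
    effective_max = max_bytes_to_merge_at_min_space_in_pool *
                    (max_bytes_to_merge_at_max_space_in_pool / max_bytes_to_merge_at_min_space_in_pool) ^ ratio

With defaults, when the pool has 8 or more free entries, the full 150 GiB ceiling applies. As free entries fall toward zero, the ceiling shrinks quickly: at 4 free entries it is already only about 392 MiB, and at zero free entries it reaches the 1 MiB floor, so only the smallest merges proceed. The result is additionally capped by the free space available on the table's disks.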
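The ceiling can be made concrete with a short Python sketch. It assumes the lowering is an exponential interpolation between max_bytes_to_merge_at_min_space_in_pool (default 1 MiB) and max_bytes_to_merge_at_max_space_in_pool, which is how recent ClickHouse sources appear to implement it. The function name and parameters are illustrative, not a ClickHouse API, and the extra cap imposed by free disk space is left out.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def effective_max_merge_bytes(
    free_entries,
    threshold=8,                 # number_of_free_entries_in_pool_to_lower_max_size_of_merge
    max_at_max_space=150 * GIB,  # max_bytes_to_merge_at_max_space_in_pool
    max_at_min_space=1 * MIB,    # max_bytes_to_merge_at_min_space_in_pool (assumed floor)
):
    """Approximate the dynamic merge-size ceiling for a given number of free pool entries."""
    free_entries = max(0, free_entries)
    if free_entries >= threshold:
        return max_at_max_space
    ratio = free_entries / threshold
    # Exponential interpolation: floor * (ceiling / floor) ** ratio
    return int(max_at_min_space * (max_at_max_space / max_at_min_space) ** ratio)


if __name__ == "__main__":
    for free in (8, 6, 4, 2, 0):
        size_mib = effective_max_merge_bytes(free) / MIB
        print(f"{free} free entries -> ceiling of about {size_mib:,.0f} MiB")
```

Under these assumptions the ceiling collapses quickly: halving the free entries from 8 to 4 takes it from 150 GiB to a few hundred MiB, which is why a busy pool postpones even moderately sized merges.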

Diagnose Why the Pool Is Full

Inspect current merge activity:

SELECT
    database,
    table,
    elapsed,
    progress,
    num_parts,
    formatReadableSize(total_size_bytes_compressed) AS size,
    is_mutation
FROM system.merges
ORDER BY elapsed DESC;

Check the pool itself:

SELECT metric, value
FROM system.metrics
WHERE metric IN (
    'BackgroundMergesAndMutationsPoolSize',
    'BackgroundMergesAndMutationsPoolTask'
);

Postponed entries in the replicated queue:

SELECT
    database,
    table,
    type,
    new_part_name,
    num_postponed,
    postpone_reason
FROM system.replication_queue
WHERE postpone_reason ILIKE '%greater than the current maximum%'
ORDER BY num_postponed DESC
LIMIT 20;

If num_postponed keeps growing on the same entry, you have a persistent problem and not just a transient backlog.
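The check can be sketched in Python as a small helper. It assumes you rerun the query at fixed intervals and collect the num_postponed value of one queue entry, recording None once the entry has disappeared. The function name and the three labels are hypothetical, chosen only for illustration.

```python
def classify_postponement(samples):
    """Classify num_postponed readings for ONE queue entry, oldest first.

    A reading of None means the entry was no longer in system.replication_queue.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    if samples[-1] is None:
        return "transient"  # the entry was executed and left the queue
    readings = [s for s in samples if s is not None]
    if len(readings) < 2:
        return "inconclusive"
    deltas = [later - earlier for earlier, later in zip(readings, readings[1:])]
    if all(d > 0 for d in deltas):
        return "persistent"  # postponed again in every interval
    if all(d == 0 for d in deltas):
        return "transient"   # counter is flat, the entry is no longer being retried and refused
    return "inconclusive"    # mixed signal, keep sampling


if __name__ == "__main__":
    print(classify_postponement([3, 7, 12, 19]))
    print(classify_postponement([3, 3, 3]))
    print(classify_postponement([3, 5, None]))
```

The samples should span several minutes; a couple of readings taken seconds apart say little about the trend.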

When It Is Normal

Stale replicas catching up after a restart, or a node recovering from a network blip, frequently log this message. The pool is busy fetching missed parts and running smaller merges; bigger merges wait. Once the queue drains the message stops.

Treat it as transient when:

The same merge is not postponed indefinitely (num_postponed stabilizes and the entry eventually clears).
The pool is actually busy (system.merges shows real work happening).
There is no spike in ingest rate and no schema-driven part fragmentation.

When It Indicates a Real Problem

Treat it as persistent when one or more of the following is true.

High insert throughput producing many small parts faster than the pool can merge them. Symptom: the number of active parts per partition in system.parts keeps growing, and Too many parts errors are approaching.
Disk slowness or a full disk. Merges run slowly or fail, so the pool stays saturated. Check dmesg, disk I/O metrics, and system.errors.
Insufficient CPU on the host. Merges are scheduled but cannot finish in time: CPU is pinned and the queue grows.
Schema problems. Excessive partitions, too many tables sharing the same merge pool, or runaway materialized views that fan out writes.
Stuck mutations. Long-running mutations occupy pool slots and keep the free entries count low. Check system.mutations WHERE NOT is_done.

Fix Strategy

Pick the cause first, then the remedy.

1. If the pool really is undersized

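Raise background_pool_size to give the scheduler more concurrent slots. If you also want the dynamic ceiling to stay high for longer, leave number_of_free_entries_in_pool_to_lower_max_size_of_merge at its default or lower it. The ceiling starts shrinking as soon as the free entries drop below this threshold, so raising the threshold makes the ceiling shrink sooner, not later:

<clickhouse>
  <background_pool_size>36</background_pool_size>
  <merge_tree>
    <number_of_free_entries_in_pool_to_lower_max_size_of_merge>4</number_of_free_entries_in_pool_to_lower_max_size_of_merge>
  </merge_tree>
</clickhouse>

Depending on the ClickHouse version, the server-level setting may require a restart to take effect.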
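How the pool size and the lowering threshold interact can be sketched in Python. This assumes the pool capacity is background_pool_size multiplied by background_merges_mutations_concurrency_ratio (default 2), and that ClickHouse rejects a threshold larger than that capacity. The helper name is hypothetical.

```python
def busy_tasks_at_full_ceiling(background_pool_size, lower_threshold=8, concurrency_ratio=2):
    """Return how many pool tasks can be busy while the full merge ceiling still applies."""
    max_tasks = background_pool_size * concurrency_ratio  # assumed pool capacity
    if lower_threshold > max_tasks:
        # With such a threshold the ceiling would always be lowered.
        raise ValueError("threshold exceeds pool capacity")
    # The full ceiling applies while free entries >= threshold,
    # that is, while busy tasks <= capacity - threshold.
    return max_tasks - lower_threshold


if __name__ == "__main__":
    for pool, threshold in ((16, 8), (36, 8), (36, 32), (36, 4)):
        busy = busy_tasks_at_full_ceiling(pool, threshold)
        print(f"pool_size={pool} threshold={threshold}: full ceiling up to {busy} busy tasks")
```

Under these assumptions a larger pool buys headroom, while a larger threshold gives it back.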

2. If the target part size is genuinely too small

Raise the absolute ceiling. This only helps when free disk and merge time per part are not the bottleneck:

ALTER TABLE events
MODIFY SETTING max_bytes_to_merge_at_max_space_in_pool = 322122547200;  -- 300 GiB
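The setting takes a plain byte count, so it is worth computing the value rather than typing it by hand. A minimal Python sketch, with a hypothetical helper name:

```python
def gib_to_bytes(gib):
    """Convert a size in GiB (binary gigabytes) to bytes."""
    return gib * 1024 ** 3


if __name__ == "__main__":
    print(gib_to_bytes(150))  # the default ceiling
    print(gib_to_bytes(300))  # the value used in the ALTER TABLE statement
```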

3. If mutations are blocking the pool

Find them:

SELECT database, table, mutation_id, command, parts_to_do, is_done, latest_fail_reason
FROM system.mutations
WHERE NOT is_done;

Either let them finish or, if they are stuck on a known failure, kill them:

KILL MUTATION WHERE database = 'mydb' AND table = 'events' AND mutation_id = '...';

4. If ingest is the cause

Batch writes more aggressively at the client. Each INSERT should create one or a few parts; never one part per row. Use async inserts or batching frameworks if you cannot control the writer.
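Client-side batching can be sketched in Python as a small buffer that collects rows and hands them over in large batches. The send_batch callable is a placeholder for whatever your client library uses to run one INSERT with many rows. The class name and the max_rows default are assumptions for illustration, not recommendations from ClickHouse.

```python
class BatchBuffer:
    """Collect rows and pass them to send_batch in batches of up to max_rows."""

    def __init__(self, send_batch, max_rows=10_000):
        if max_rows < 1:
            raise ValueError("max_rows must be at least 1")
        self._send_batch = send_batch  # hypothetical: performs one INSERT per call
        self._max_rows = max_rows
        self._rows = []

    def add(self, row):
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call this on shutdown or on a timer."""
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self._send_batch(batch)


if __name__ == "__main__":
    sent = []  # stands in for the database: each element is one INSERT
    buffer = BatchBuffer(sent.append, max_rows=10)
    for i in range(25):
        buffer.add({"id": i})
    buffer.flush()
    print([len(batch) for batch in sent])
```

A production writer would usually also flush on a time limit so that a slow trickle of rows does not sit in memory indefinitely.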

Common Pitfalls

Raising max_bytes_to_merge_at_max_space_in_pool without fixing the pool. The absolute ceiling does not matter when the dynamic one is dragging it down to near zero.
Ignoring system.mutations. A stuck mutation can keep the pool full forever and produce this message on every other merge.
Restarting ClickHouse to "clear" the backlog. It does not clear; the queue is in Keeper for Replicated tables. The problem reappears.
Treating one log line as an emergency. This message is informational. Look at num_postponed over time before reacting.

Frequently Asked Questions

Q: Is "source parts size is greater than the current maximum" an error? A: No. It is a postponement message. The merge is deferred until pool capacity frees up. ClickHouse will retry on the next scheduling cycle.

Q: Why does the current maximum change? A: ClickHouse lowers the effective maximum mergeable size when the merge pool runs low on free entries. The point at which lowering starts is set by number_of_free_entries_in_pool_to_lower_max_size_of_merge, and the upper bound by max_bytes_to_merge_at_max_space_in_pool.

Q: Should I just raise max_bytes_to_merge_at_max_space_in_pool? A: Only if the pool actually has free entries and disk space. If the pool is saturated, the dynamic ceiling has already been lowered far below the absolute one, so raising the absolute one has little effect.

Q: How do I tell if this is temporary or persistent? A: Watch num_postponed in system.replication_queue for the same entry over a few minutes. If it stays constant or drains, it is transient. If it grows steadily, you have a real backlog.

Q: Does this affect data correctness? A: No. Postponed merges do not lose data; they only delay the consolidation of parts. Sustained postponement, however, lets parts accumulate and eventually causes Too many parts rejections on inserts into the table.