NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse: Remove Empty Partitions from Replication Queue

When ClickHouse fails to fetch a part on a replica, the corresponding task stays in system.replication_queue and retries forever. If the partition behind that task is empty everywhere in the cluster, there is nothing useful to fetch, but the queue keeps growing and pollutes monitoring. The recipe below cross-references system.replication_queue against active parts on every replica, then prints the exact ALTER TABLE ... DROP PARTITION commands needed to clear the stuck entries.

Run this as a periodic operational check on ReplicatedMergeTree clusters that have experienced ZooKeeper outages, replica resyncs, or aborted mutations.

Generate the DROP PARTITION Statements

Replace {cluster} with your cluster name from system.clusters before executing. The output is a list of DDL statements you can review and run.

SELECT 'ALTER TABLE ' || database || '.' || table || ' DROP PARTITION ID \''|| partition_id || '\';'
FROM
(
    SELECT DISTINCT database, table, extract(new_part_name, '^[^_]+') AS partition_id
    FROM clusterAllReplicas('{cluster}', system.replication_queue)
) AS rq
LEFT JOIN
(
    SELECT database, table, partition_id, sum(rows) AS rows_count, count() AS part_count
    FROM clusterAllReplicas('{cluster}', system.parts)
    WHERE active
    GROUP BY database, table, partition_id
) AS p
USING (database, table, partition_id)
WHERE p.rows_count = 0 AND p.part_count = 0
FORMAT TSVRaw;

How the Query Works

The query runs in two stages and joins the results:

  1. The first subquery scans system.replication_queue on every replica via clusterAllReplicas. It extracts the partition ID from new_part_name using the regex ^[^_]+, which captures everything before the first underscore (the partition prefix of the part name).
  2. The second subquery aggregates system.parts cluster-wide, summing rows and counting parts per partition. Only active parts are considered.
  3. The LEFT JOIN keeps every queue entry, then the WHERE clause filters down to partitions where both rows_count and part_count are zero, meaning no replica holds data for that partition.
  4. FORMAT TSVRaw prints the generated SQL without quoting, so you can pipe it directly into clickhouse-client.

Reviewing and Running the Output

Step Action
1 Save the output to a file: clickhouse-client -q "..." > drops.sql
2 Open the file and confirm each table and partition ID is expected
3 Run statements one at a time, or in batches, against a single replica
4 Re-check system.replication_queue to confirm entries cleared

The DROP PARTITION command is replicated automatically through ZooKeeper, so you only execute it on one node. After replication catches up, the matching queue tasks disappear on every replica.

Verifying the Queue is Clean

After dropping the empty partitions, re-run a short diagnostic to confirm the queue is shrinking and no new errors appeared:

SELECT database, table, count() AS pending_tasks, max(num_tries) AS max_retries
FROM clusterAllReplicas('{cluster}', system.replication_queue)
GROUP BY database, table
ORDER BY pending_tasks DESC;

A healthy ReplicatedMergeTree cluster should report low pending_tasks and max_retries close to zero. Sustained growth indicates a different problem, such as a missing source part or a ZooKeeper session issue.

Common Pitfalls

  • Running DROP PARTITION on a partition that is empty on most replicas but still holds rows on one is destructive. The cluster-wide aggregation in the query protects against this, but only if every replica is reachable when the SELECT runs.
  • Using extract(new_part_name, '^[^_]+') assumes the standard part naming convention. Tables created with custom partition keys that include underscores may need a different regex.
  • The cluster name in {cluster} is case-sensitive and must match system.clusters.cluster exactly.
  • If you have multiple ClickHouse clusters defined, point the query at the cluster that owns the affected tables.

Frequently Asked Questions

Q: Why are these partitions empty but still queued? A: Common causes are a DROP PARTITION that ran on the source replica before the target finished fetching, an aborted mutation that left a queue entry pointing to a part that was never produced, or a ZooKeeper rollback that referenced a part no replica retained.

Q: Will dropping a partition that does not exist cause an error? A: ClickHouse treats DROP PARTITION for a non-existent partition as a no-op on most modern versions. The replicated DDL still flushes the related queue entries.

Q: Can I automate this in a cron job? A: Yes, but keep a manual review step. Pipe the SELECT output to a file, sanity-check it, then apply. Automating both the generation and execution risks dropping partitions that briefly appear empty during a fetch or merge window.

Q: Should I use SYSTEM DROP REPLICA instead? A: No. SYSTEM DROP REPLICA removes an entire replica from ZooKeeper. Use it when a replica server is permanently gone, not when individual partitions are empty.

Q: What if new_part_name is empty in the queue row? A: Some task types (such as MUTATE_PART) populate different columns. Filter by type if you only want GET_PART entries: add WHERE type = 'GET_PART' to the inner SELECT.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.