DDLWorker is the background thread inside clickhouse-server that executes ON CLUSTER DDL statements on each cluster node. When the queue backs up or the worker thread dies, schema changes stop propagating across the cluster, replicas drift out of sync, and queries start failing with timeout errors. This guide covers how DDLWorker uses ZooKeeper, the most common failure modes, and the queries needed to recover a stuck queue.
How DDLWorker Works
When you run a DDL query with ON CLUSTER cluster_name, the initiator node writes a task into the ZooKeeper path /clickhouse/task_queue/ddl/query-NNNNNNNNNN. Every node in the cluster runs a DDLWorker thread that watches that path, picks up tasks intended for itself, executes them locally, and writes results into the finished/ child znode. A cleaner thread (DDLWorkerClnup) removes expired tasks based on the task_max_lifetime setting.
You can verify both threads are alive with:
ps -eL | grep DDL
The output should list DDLWorker and DDLWorkerClnup. If either is missing, restart clickhouse-server and inspect the server log for stack traces around the crash.
Distributed DDL Configuration
The relevant block in config.xml:
<distributed_ddl>
<path>/clickhouse/task_queue/ddl</path>
<cleanup_delay_period>60</cleanup_delay_period>
<task_max_lifetime>604800</task_max_lifetime>
<max_tasks_in_queue>1000</max_tasks_in_queue>
<pool_size>1</pool_size>
</distributed_ddl>
| Setting | Default | Purpose |
|---|---|---|
cleanup_delay_period |
60s | How often the cleaner removes old tasks |
task_max_lifetime |
604800s (7d) | How long completed tasks stay in ZooKeeper |
max_tasks_in_queue |
1000 | Hard cap on pending tasks |
pool_size |
1 | Concurrent ON CLUSTER queries per node |
The default pool_size of 1 means DDL tasks are strictly serialized per node. A slow ALTER blocks every subsequent task.
Common Failure Modes
Hostname or IP mismatch
The most frequent cause of stuck ON CLUSTER queries. If the <remote_servers> definition lists a hostname that the node itself does not resolve to a local interface, the DDLWorker never claims tasks intended for it. Check with:
SELECT cluster, shard_num, replica_num, host_name, host_address, is_local
FROM system.clusters;
is_local should be 1 for the node's own row. On Debian and Ubuntu, /etc/hosts often maps the hostname to 127.0.1.1, which trips this check. Fix the host entry or set <interserver_http_host> explicitly.
DDLWorker thread crashed
If ps -eL | grep DDL returns nothing on a node, that node will not execute any cluster DDL. Restart clickhouse-server and capture the relevant log lines:
grep -C 40 "ddl_entry" /var/log/clickhouse-server/clickhouse-server*.log
Queue saturated by ETL pipelines
Issuing thousands of CREATE, DROP, or ALTER ... ON CLUSTER statements from an ETL job overloads the serial DDL pool. Tasks pile up in system.distributed_ddl_queue and downstream queries time out waiting for distributed_ddl_task_timeout.
The remedy is structural, not configurational. Replace per-batch cluster DDL with partition swaps:
CREATE TEMPORARY TABLE staging AS final;
INSERT INTO staging SELECT ...;
ALTER TABLE final ATTACH PARTITION ID '202401' FROM staging;
Inspecting the Queue
The system.distributed_ddl_queue table exposes each task's current state:
SELECT entry, host, status, exception_code, query_create_time
FROM system.distributed_ddl_queue
WHERE status != 'Finished'
ORDER BY query_create_time DESC;
To snapshot the queue before intervention:
CREATE TABLE default.system_distributed_ddl_queue AS
SELECT * FROM system.distributed_ddl_queue;
Drill into raw ZooKeeper state for a specific task:
SELECT name, ctime, mtime
FROM system.zookeeper
WHERE path = '/clickhouse/task_queue/ddl/query-0000001000/finished/'
SETTINGS allow_unrestricted_reads_from_keeper = 1;
Common Pitfalls
- Changing
<distributed_ddl><path>inconfig.xmlwithout restartingclickhouse-server. The server keeps watching the old path and ignores new tasks. - Setting
task_max_lifetimetoo low. Lagging replicas miss tasks that have already been cleaned up and never catch up without manual intervention. - Increasing
pool_sizeabove 1 without understanding the ordering implications. Concurrent DDL on the same table can produce inconsistent metadata across replicas. - Running
ON CLUSTERagainst clusters that include dead replicas. UseSYSTEM DROP REPLICA 'replica_name'to remove tombstones before issuing new DDL. - Treating ClickHouse Keeper as a drop-in fix. Keeper replaces ZooKeeper as the coordination backend, but DDLWorker logic and the queue semantics are identical, so the same diagnostic queries apply.
Frequently Asked Questions
Q: How do I cancel a stuck DDL task?
A: There is no KILL DDL statement. Identify the task in system.distributed_ddl_queue, then either wait for distributed_ddl_task_timeout (default 180s) to expire on the initiator or remove the corresponding znode from ZooKeeper using zkCli.sh after stopping affected nodes.
Q: Should DDLWorker work with ClickHouse Keeper?
A: Yes. Keeper is API-compatible with ZooKeeper, so DDLWorker uses it transparently. Configure <keeper_server> and point <zookeeper> at the Keeper endpoints.
Q: Why does my DDL succeed on the initiator but show as failed on other hosts?
A: The initiator only checks that the task was queued. Use SELECT * FROM system.distributed_ddl_queue WHERE entry = '...' to see per-host status. A common cause is one replica being unreachable or having a hostname mismatch.
Q: Can I run multiple ON CLUSTER queries in parallel?
A: By default no, since pool_size = 1. Raising it works for independent tables but can cause race conditions on the same table. Most production setups leave it at 1 and avoid concurrent DDL at the application layer.
Q: How do I prevent ETL jobs from flooding the DDL queue?
A: Avoid cluster DDL in hot paths. Pre-create partitioned tables once, then use INSERT and ALTER TABLE ... ATTACH PARTITION for data movement. These operations do not touch the DDL queue.