ClickHouse does not enforce schema consistency across cluster nodes automatically. ON CLUSTER DDL is best-effort, manual CREATE TABLE runs are common, and replicas can fall out of sync after a failed node returns. The result is drift: a missing table here, a String column where it should be Int64 there, a stale dictionary on the third node. This guide gives you the queries to find drift before it causes a production incident.
Missing Tables
This query lists every (database, table) pair and the hosts where it is present, flagging any combination that is missing from at least one host:
SELECT
database,
name AS table,
groupArray(hostName()) AS hosts,
length(hosts) AS host_count
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')
GROUP BY database, name
HAVING host_count < (
SELECT count() FROM clusterAllReplicas('{cluster}', system.one)
)
SETTINGS skip_unavailable_shards = 1;
Any row in the result is a table that exists on some nodes but not others. Tables that only need to exist on one shard (rare) will show up too, so cross-check with your shard layout.
DDL Inconsistencies
A table can exist everywhere but with different definitions. The next query groups tables by (database, table, engine, create_table_query) and flags pairs that resolve to more than one CREATE statement:
SELECT
database,
name AS table,
engine,
count(DISTINCT create_table_query) AS distinct_ddls
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')
GROUP BY database, name, engine
HAVING distinct_ddls > 1
SETTINGS skip_unavailable_shards = 1;
To inspect the actual DDL differences:
SELECT hostName(), create_table_query
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database = 'analytics' AND name = 'events'
SETTINGS skip_unavailable_shards = 1;
Column Type Differences
Whole-DDL comparison misses cases where two definitions happen to be byte-identical but a column was later altered. Compare per-column types directly:
SELECT
database,
table,
name AS column,
groupArray(DISTINCT type) AS types
FROM clusterAllReplicas('{cluster}', system.columns)
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')
GROUP BY database, table, name
HAVING length(types) > 1
SETTINGS skip_unavailable_shards = 1;
A typical output looks like ('Int64', 'String'), which means column A is Int64 on at least one host and String on at least one other. These are the most dangerous drifts because INSERTs and Distributed queries may succeed but return garbage.
Dictionary Consistency
Dictionaries can drift in two ways: missing on a node, or loaded with a different element count due to staleness or load failures.
SELECT
database,
name AS dictionary,
groupArray(tuple(hostName(), status, element_count) AS h) AS per_host,
arrayDistinct(per_host.3) AS distinct_element_counts
FROM clusterAllReplicas('{cluster}', system.dictionaries)
GROUP BY database, name
HAVING length(distinct_element_counts) > 1
OR length(per_host) < (
SELECT count() FROM clusterAllReplicas('{cluster}', system.one)
)
SETTINGS skip_unavailable_shards = 1;
Variations in element_count of more than a small percent usually mean one node's source query returned different data, or the dictionary failed to reload on some nodes.
Common Pitfalls
- Running these queries on a single node and assuming consistency. They must go through
clusterAllReplicas. - Treating
count(DISTINCT create_table_query)differences as real when they are caused by formatting (whitespace, comment differences). Compare actual columns and types as well. - Forgetting
skip_unavailable_shards = 1. A single down node makes the entire query fail and hides the rest of the drift. - Ignoring the
systemdatabase. While usually safe to skip, custom additions tosystemcan also drift.
Frequently Asked Questions
Q: Why do tables drift between cluster nodes?
A: ON CLUSTER DDL can partially succeed if a node is down or busy. Manual deploys, schema migrations applied to one shard, and replicas that lag behind also cause drift.
Q: How do I fix DDL drift?
A: For Replicated* tables, recreate the divergent replica from a healthy one. For non-replicated tables, decide which definition is correct and rerun CREATE TABLE (or ATTACH TABLE after dropping) on the wrong nodes.
Q: Should I automate these checks? A: Yes. Run them on a schedule and alert on any non-empty result. Drift is much easier to fix when caught early.
Q: Are dictionaries replicated automatically?
A: No. Dictionary definitions can be cluster-aware via ON CLUSTER, but the loaded data depends on each node's source query. Network or permission issues on one node lead to drift.
Q: Can I use system.replicas for this?
A: system.replicas covers Replicated* tables only and detects intra-table replica lag, not cross-cluster object drift. The two checks are complementary.