The "DB::Exception: Inconsistent metadata for backup" error in ClickHouse occurs when the backup process detects that the metadata it has collected is internally inconsistent. The error code is INCONSISTENT_METADATA_FOR_BACKUP (code 663). This typically means that the table structures, dependencies, or part references changed while the backup metadata was being assembled, resulting in a state that cannot be safely backed up.
Impact
The backup operation fails and no backup is produced. This can leave your backup schedule with a gap, which is particularly concerning for disaster recovery. If backups fail repeatedly, you risk not having a recent restore point. The original data on the server is not affected by this error -- only the backup process is interrupted.
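To confirm that a backup actually failed with this error, the backup history can be inspected. The query below is a minimal sketch: it assumes the failed operation is still listed in system.backups on the server that ran it, and the LIMIT is an arbitrary choice.

```sql
-- List the most recent failed backups together with the recorded error text
SELECT id, name, status, error, start_time, end_time
FROM system.backups
WHERE status = 'BACKUP_FAILED'
ORDER BY start_time DESC
LIMIT 10;
```

If the error column mentions inconsistent metadata, the steps in the rest of this article apply. When the system.backup_log table is enabled on the server, it can serve as a persistent source for the same information.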
Common Causes
- Concurrent DDL operations during backup -- Running ALTER TABLE, DROP TABLE, RENAME, or other schema-changing operations while a backup is in progress can cause metadata to shift mid-backup.
- Active mutations -- Ongoing ALTER TABLE ... UPDATE or ALTER TABLE ... DELETE mutations can change the set of parts and their metadata during backup collection.
- Replication lag or conflicts -- On ReplicatedMergeTree tables, replication activity can modify parts and metadata concurrently with the backup.
- Rapid merge activity -- Aggressive background merges can replace parts that the backup process has already catalogued, causing references to become stale.
- Dependent objects out of sync -- Views, dictionaries, or other objects that reference tables being backed up may have metadata that doesn't match the current table state.
- Partial previous backup state -- An incremental backup references a base backup whose metadata is inconsistent or corrupted.
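To see whether any of the causes above is in play at the moment, one option is to look for schema-changing statements that are currently executing. This is a rough sketch: the ILIKE prefixes are a simplification and will miss statements that begin with whitespace or a comment.

```sql
-- Show DDL statements that are running right now
SELECT query_id, user, elapsed, query
FROM system.processes
WHERE query ILIKE 'ALTER%'
   OR query ILIKE 'DROP%'
   OR query ILIKE 'RENAME%'
   OR query ILIKE 'CREATE%'
ORDER BY elapsed DESC;
```

An empty result only means that no matching DDL is executing at that instant; mutations and merges run in the background and have to be checked separately.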
Troubleshooting and Resolution Steps
Retry the backup. Transient metadata inconsistencies caused by concurrent operations may not recur:

    BACKUP TABLE your_database.your_table TO Disk('backups', 'retry_backup');

Pause DDL and mutations before backing up. Ensure no schema changes are running:

    -- Check for active mutations
    SELECT * FROM system.mutations WHERE is_done = 0;
    -- Check for active merges
    SELECT * FROM system.merges;

Wait for mutations to complete before starting the backup. If waiting is not an option, long-running mutations can be killed first:

    -- Kill long-running mutations only if acceptable: parts that were already rewritten are not rolled back
    KILL MUTATION WHERE database = 'your_database' AND table = 'your_table';
    -- Then retry the backup
    BACKUP DATABASE your_database TO Disk('backups', 'clean_backup');

Back up individual tables instead of the entire database. This reduces the window for metadata changes:

    BACKUP TABLE your_database.table1, TABLE your_database.table2 TO Disk('backups', 'selective_backup');

Let ClickHouse retry metadata collection. When a backup involves Keeper-coordinated tables, ClickHouse automatically retries the metadata snapshot when it detects a change. The default retry count (backup_restore_keeper_max_retries) is 1000; if transient inconsistencies still persist, you can raise it along with the related backoff settings (backup_restore_keeper_retry_initial_backoff_ms and backup_restore_keeper_retry_max_backoff_ms):

    BACKUP DATABASE your_database TO Disk('backups', 'backup_name')
    SETTINGS backup_restore_keeper_max_retries = 5000;

Check system tables for anomalies. Tables whose metadata changed around the time of the failed backup are the most likely culprits:

    -- List tables by most recent metadata change
    SELECT database, table, engine, metadata_modification_time
    FROM system.tables
    WHERE database = 'your_database'
    ORDER BY metadata_modification_time DESC;

For incremental backups, verify the base backup integrity. Ensure the base backup is valid before attempting a new incremental backup on top of it.
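For the incremental case, the sketch below first checks how the base backup finished and then creates an incremental backup on top of it. The backup names 'base_backup' and 'incremental_backup' are placeholders, and the first query assumes the base backup is still listed in system.backups on this server.

```sql
-- Check the recorded status of the base backup (expected: BACKUP_CREATED)
SELECT name, status, error, end_time
FROM system.backups
WHERE name LIKE '%base_backup%'
ORDER BY end_time DESC;

-- Create an incremental backup that stores only the difference from the base
BACKUP TABLE your_database.your_table
TO Disk('backups', 'incremental_backup')
SETTINGS base_backup = Disk('backups', 'base_backup');
```

If the base backup shows a failed status or is missing altogether, taking a fresh full backup is the safer starting point.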
Best Practices
- Schedule backups during low-activity periods to minimize the chance of concurrent metadata changes.
- Avoid running DDL operations (ALTER, DROP, RENAME) while backups are in progress.
- Wait for all active mutations to complete before starting a backup.
- Monitor backup success/failure and set up alerts for failed backups so gaps are caught quickly.
- Test backup restoration regularly to catch metadata issues before they become critical.
- For large databases, consider backing up tables individually or in small groups to reduce the backup window.
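A restore test can be as simple as restoring one table under a different name and comparing row counts. This is a sketch under assumptions: 'clean_backup' is the backup being tested, your_table_restore_check is a hypothetical scratch table name, and for ReplicatedMergeTree tables the restored copy must not collide with the Keeper path of the original.

```sql
-- Restore a single table from the backup under a scratch name
RESTORE TABLE your_database.your_table AS your_database.your_table_restore_check
FROM Disk('backups', 'clean_backup');

-- Compare row counts between the live table and the restored copy
SELECT
    (SELECT count() FROM your_database.your_table) AS live_rows,
    (SELECT count() FROM your_database.your_table_restore_check) AS restored_rows;

-- Remove the scratch table once the check is done
DROP TABLE your_database.your_table_restore_check;
```

The live table keeps receiving writes after the backup was taken, so a small difference in counts is expected; a failed restore or an empty restored table is the signal to investigate.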
Frequently Asked Questions
Q: Is my data at risk if this error occurs?
A: No. This error only affects the backup operation. Your source data remains intact and unmodified. However, you should retry the backup promptly so you have a recent restore point.
Q: Can I use BACKUP ... ASYNC to avoid this issue?
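A: No. ASYNC only changes how the client waits: the statement returns a backup id immediately and the backup continues in the background, with its status reported in system.backups. The backup itself faces the same metadata consistency requirements and does not run any faster, so the same precautions about concurrent DDL apply.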
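As an illustration, an asynchronous backup is started and then tracked by its id. The backup name 'async_backup' and the id placeholder are assumptions; substitute the id that the BACKUP statement returns.

```sql
-- Start the backup in the background; the statement returns an id and a status
BACKUP TABLE your_database.your_table TO Disk('backups', 'async_backup') ASYNC;

-- Poll the status of that backup using the returned id
SELECT id, status, error, start_time, end_time
FROM system.backups
WHERE id = '<id returned by the BACKUP statement>';
```

A metadata inconsistency surfaces here as status BACKUP_FAILED with the message in the error column, rather than as an exception in the session that started the backup.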
Q: Does this error occur more often with ReplicatedMergeTree?
A: Yes. ReplicatedMergeTree tables have additional background activity (replication queue processing, merges coordinated through Keeper) that can change metadata during backup collection, making this error somewhat more likely compared to non-replicated tables.
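Before backing up replicated tables, it may help to check how busy replication is. The queries below are a sketch; what counts as "quiet enough" is a judgment call and depends on the workload.

```sql
-- Per-table replica health: pending queue entries, delay, and read-only state
SELECT database, table, queue_size, absolute_delay, is_readonly
FROM system.replicas
WHERE database = 'your_database'
ORDER BY queue_size DESC;

-- Breakdown of pending replication queue entries by operation type
SELECT database, table, type, count() AS entries, max(num_tries) AS max_tries
FROM system.replication_queue
WHERE database = 'your_database'
GROUP BY database, table, type
ORDER BY entries DESC;
```

A large queue, or entries with many retries, suggests that parts are still changing, which makes it a poor moment to start a backup.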