ClickHouse DB::Exception: Inconsistent metadata for backup (Code: 663)

The "DB::Exception: Inconsistent metadata for backup" error in ClickHouse occurs when the backup process detects that the metadata it has collected is internally inconsistent. The error code is INCONSISTENT_METADATA_FOR_BACKUP (code 663). This typically means that the table structures, dependencies, or part references changed while the backup metadata was being assembled, resulting in a state that cannot be safely backed up.

Impact

The backup operation fails and no backup is produced. This can leave your backup schedule with a gap, which is particularly concerning for disaster recovery. If backups fail repeatedly, you risk not having a recent restore point. The original data on the server is not affected by this error -- only the backup process is interrupted.

Common Causes

  1. Concurrent DDL operations during backup -- Running ALTER TABLE, DROP TABLE, RENAME, or other schema-changing operations while a backup is in progress can cause metadata to shift mid-backup.
  2. Active mutations -- Ongoing ALTER TABLE ... UPDATE or ALTER TABLE ... DELETE mutations can change the set of parts and their metadata during backup collection.
  3. Replication lag or conflicts -- On ReplicatedMergeTree tables, replication activity can modify parts and metadata concurrently with the backup.
  4. Rapid merge activity -- Aggressive background merges can replace parts that the backup process has already catalogued, causing references to become stale.
  5. Dependent objects out of sync -- Views, dictionaries, or other objects that reference tables being backed up may have metadata that doesn't match the current table state.
  6. Partial previous backup state -- An incremental backup references a base backup whose metadata is inconsistent or corrupted.
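
To narrow down which of these causes applies, it can help to list the schema-changing statements that ran close to the failed backup. The query below is a sketch, not a definitive diagnostic: it assumes query logging is enabled so that system.query_log is populated, and the one-hour window is an arbitrary placeholder to adjust around your backup's start time. The query_kind values shown are those used by recent releases and may differ on older versions.

    -- DDL and mutation statements that finished in the last hour
    SELECT event_time, query_kind, user, query
    FROM system.query_log
    WHERE type = 'QueryFinish'
      AND query_kind IN ('Alter', 'Drop', 'Rename', 'Create')
      AND event_time >= now() - INTERVAL 1 HOUR
    ORDER BY event_time DESC;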

Troubleshooting and Resolution Steps

  1. Retry the backup. Transient metadata inconsistencies caused by concurrent operations may not recur:

    BACKUP TABLE your_database.your_table TO Disk('backups', 'retry_backup');
    
  2. Pause DDL and mutations before backing up. Hold off on schema changes until the backup finishes, and check that no mutations or merges are still in flight:

    -- Check for active mutations
    SELECT * FROM system.mutations WHERE is_done = 0;
    
    -- Check for active merges
    SELECT * FROM system.merges;
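
     The checks above return every column for every table. A narrower variant, sketched here under the assumption that the database is named your_database as in the other examples, shows how far each unfinished mutation has progressed and how much replication work is still queued:

    -- Unfinished mutations for one database, with remaining parts and last failure
    SELECT database, table, mutation_id, command, create_time,
           parts_to_do, latest_fail_reason
    FROM system.mutations
    WHERE is_done = 0 AND database = 'your_database';

    -- Pending replication queue entries per table (ReplicatedMergeTree only)
    SELECT database, table, count() AS queued_entries
    FROM system.replication_queue
    WHERE database = 'your_database'
    GROUP BY database, table
    ORDER BY queued_entries DESC;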
    
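  3. Wait for mutations to complete before starting the backup. If a mutation is stuck and cancelling it is acceptable, kill it instead of waiting:

    -- Cancels the matching mutations on this table.
    -- Parts that were already rewritten are not rolled back.
    KILL MUTATION WHERE database = 'your_database' AND table = 'your_table';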
    
    -- Then retry the backup
    BACKUP DATABASE your_database TO Disk('backups', 'clean_backup');
    
  4. Back up individual tables instead of the entire database. This reduces the window for metadata changes:

    BACKUP TABLE your_database.table1, TABLE your_database.table2
    TO Disk('backups', 'selective_backup');
    
  5. Let ClickHouse retry. ClickHouse already retries on its own: it re-collects the metadata snapshot when it detects a change, and when a backup involves Keeper-coordinated tables it retries Keeper requests that fail mid-backup. The Keeper retry limit is backup_restore_keeper_max_retries (1000 by default in recent releases); if transient failures still persist, you can raise it along with the related backoff settings:

    BACKUP DATABASE your_database TO Disk('backups', 'backup_name')
    SETTINGS backup_restore_keeper_max_retries = 5000;
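
     Before raising the limit, it may be worth confirming what your server currently uses, since defaults have differed between releases. This is a sketch using system.settings; the LIKE pattern assumes the related retry and backoff settings share the backup_restore_keeper prefix.

    -- Current values of the Keeper retry settings used by BACKUP and RESTORE
    SELECT name, value, changed, description
    FROM system.settings
    WHERE name LIKE 'backup_restore_keeper%';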
    
  6. Check system tables for anomalies:

    -- List tables by most recent metadata change to spot DDL that ran around the backup
    SELECT database, name, engine, metadata_modification_time
    FROM system.tables
    WHERE database = 'your_database'
    ORDER BY metadata_modification_time DESC;
    
  7. For incremental backups, verify the base backup integrity. If using incremental backups, ensure the base backup is valid before attempting a new incremental.
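
For the incremental case in step 7, one possible check is sketched below. The backup names base_backup and incremental_backup are hypothetical placeholders. Note that system.backups only lists operations known to the running server, so a base backup taken before the last restart may not appear there.

    -- Confirm the base backup finished successfully (status BACKUP_CREATED, empty error)
    SELECT id, name, status, error, start_time, end_time
    FROM system.backups
    WHERE name LIKE '%base_backup%';

    -- Take the incremental backup against that base
    BACKUP TABLE your_database.your_table
    TO Disk('backups', 'incremental_backup')
    SETTINGS base_backup = Disk('backups', 'base_backup');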

Best Practices

  • Schedule backups during low-activity periods to minimize the chance of concurrent metadata changes.
  • Avoid running DDL operations (ALTER, DROP, RENAME) while backups are in progress.
  • Wait for all active mutations to complete before starting a backup.
  • Monitor backup success/failure and set up alerts for failed backups so gaps are caught quickly.
  • Test backup restoration regularly to catch metadata issues before they become critical.
  • For large databases, consider backing up tables individually or in small groups to reduce the backup window.
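
As a starting point for the monitoring practice above, a scheduled check could poll system.backups for failures. This is a minimal sketch: the LIMIT is arbitrary, and because system.backups reflects only what the running server knows about, a production alert would likely need a more durable source.

    -- Most recent failed backups, including the error text (look for code 663)
    SELECT name, status, error, start_time, end_time
    FROM system.backups
    WHERE status = 'BACKUP_FAILED'
    ORDER BY start_time DESC
    LIMIT 20;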

Frequently Asked Questions

Q: Is my data at risk if this error occurs?
A: No. This error only affects the backup operation. Your source data remains intact and unmodified. However, you should retry the backup promptly so you have a recent restore point.

Q: Can I use BACKUP ... ASYNC to avoid this issue?
A: No. Asynchronous backups face the same metadata consistency requirements. ASYNC only returns control to the calling session immediately; the backup does the same work in the background and can fail with the same error. The same precautions about concurrent DDL apply.
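
A sketch of the asynchronous flow, for illustration: the backup name async_backup is hypothetical, and the id in the second query is a placeholder for the value the BACKUP statement returns.

    -- Returns immediately with the backup's id and initial status
    BACKUP TABLE your_database.your_table
    TO Disk('backups', 'async_backup') ASYNC;

    -- Poll for the outcome; a failure shows up in status and error
    SELECT status, error
    FROM system.backups
    WHERE id = '<id-from-backup-statement>';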

Q: Does this error occur more often with ReplicatedMergeTree?
A: Yes. ReplicatedMergeTree tables have additional background activity (replication queue processing, merges coordinated through Keeper) that can change metadata during backup collection, making this error somewhat more likely compared to non-replicated tables.
