clickhouse-backup: Tool Guide for ClickHouse Backups

clickhouse-backup is an open-source CLI for taking consistent, file-level backups of ClickHouse and pushing them to object storage. It uses ClickHouse's ALTER TABLE ... FREEZE mechanism to create hardlink snapshots of parts in the shadow/ directory, then packages and uploads them to S3, GCS, Azure Blob, or other remote backends. Compared to running BACKUP/RESTORE SQL by hand, the tool adds incremental uploads, retention, a REST API, and structured config. This guide covers installation, configuration, common operations, and remote storage.

Installing clickhouse-backup

Pick a release from the project's GitHub releases page and install either the static binary or a distribution package. Replace <version> with the version you want, for example 2.5.20.

# Static binary
wget https://github.com/<org>/clickhouse-backup/releases/download/v<version>/clickhouse-backup.tar.gz
tar zxf clickhouse-backup.tar.gz

# Debian/Ubuntu
wget https://github.com/<org>/clickhouse-backup/releases/download/v<version>/clickhouse-backup_<version>_amd64.deb
sudo dpkg -i clickhouse-backup_<version>_amd64.deb

# RHEL/CentOS/Fedora
sudo yum install https://github.com/<org>/clickhouse-backup/releases/download/v<version>/clickhouse-backup-<version>-1.x86_64.rpm

After install, generate a config template:

sudo mkdir -p /etc/clickhouse-backup
clickhouse-backup default-config | sudo tee /etc/clickhouse-backup/config.yml > /dev/null

The binary must run on the same host as ClickHouse, because it needs local filesystem access to /var/lib/clickhouse/.

Configuration

Edit /etc/clickhouse-backup/config.yml. A minimal S3 configuration looks like this:

general:
  remote_storage: s3
  backups_to_keep_local: 3
  backups_to_keep_remote: 7
  upload_concurrency: 4
  download_concurrency: 4
  log_level: info

clickhouse:
  host: localhost
  port: 9000
  username: default
  password: ""
  timeout: 5m
  skip_tables:
    - system.*
    - INFORMATION_SCHEMA.*
    - information_schema.*

s3:
  access_key: "AKIA..."
  secret_key: "..."
  bucket: "ch-backups"
  region: us-east-1
  endpoint: ""
  acl: private
  compression_format: tar
  compression_level: 1
  part_size: 0

For non-AWS S3 compatible storage (Backblaze B2, MinIO, R2), set endpoint and, in some cases, acl: "" to disable ACL headers that those providers reject.

Google Cloud Storage

general:
  remote_storage: gcs

gcs:
  credentials_file: "/etc/clickhouse-backup/gcs-sa.json"
  bucket: "ch-backups"
  path: "prod/{cluster}/{shard}"
  storage_class: STANDARD
  compression_format: tar

Azure Blob Storage

general:
  remote_storage: azblob

azblob:
  account_name: "myaccount"
  account_key: "..."
  container: "ch-backups"
  endpoint_suffix: "core.windows.net"

Sensitive values can also be passed through environment variables, for example S3_ACCESS_KEY, S3_SECRET_KEY, CLICKHOUSE_PASSWORD, REMOTE_STORAGE. This is preferred for systemd units and Kubernetes deployments.
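As a rough sketch of the environment-variable approach, the helper below sources a KEY=VALUE file and exports its contents so that clickhouse-backup can read them. The function name load_backup_env and the file path in the usage line are illustrative assumptions, not something the tool provides; only the variable names come from the list above.

```shell
# load_backup_env: export every KEY=VALUE pair found in a file so that
# clickhouse-backup (or any other child process) can read them.
# Assumptions: the file holds plain shell assignments, one per line, is
# passed as an absolute path, and is readable only by the backup user.
load_backup_env() {
  env_file="${1:-}"
  if [ ! -r "$env_file" ]; then
    echo "cannot read env file: $env_file" >&2
    return 1
  fi
  set -a            # auto-export variables assigned while sourcing
  . "$env_file"
  set +a
}
```

Usage, with a hypothetical path: load_backup_env /etc/clickhouse-backup/secrets.env && clickhouse-backup list remote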

Core CLI commands

Command         Purpose
create          Freeze tables and create a local backup under /var/lib/clickhouse/backup/<name>
upload          Push a local backup to remote storage
create_remote   Combined create + upload in one step
download        Pull a remote backup to local disk
restore         Recreate schema and attach parts from a local backup
restore_remote  Combined download + restore
list            Show available local and remote backups
delete          Remove a backup (local or remote scope)
tables          Print databases and tables visible to the tool
watch           Run a recurring full + incremental backup loop
server          Run the REST API on :7171
clean           Remove leftover shadow/ folders

Create and upload

sudo clickhouse-backup create bkp_$(date +%Y%m%d_%H%M)
sudo clickhouse-backup list local
sudo clickhouse-backup upload bkp_20260527_0100
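In a script, the backup name is best generated once and reused, so that create and upload cannot refer to different names. A minimal sketch, assuming the two commands shown above; the function name and the CH_BACKUP_BIN variable are hypothetical and exist only so the function can be dry-run:

```shell
# backup_and_upload: create a local backup and upload it under one name.
# CH_BACKUP_BIN is a hypothetical override for dry runs
# (e.g. CH_BACKUP_BIN=echo); by default the real binary is called.
backup_and_upload() {
  bin="${CH_BACKUP_BIN:-clickhouse-backup}"
  # Generate the name once so both steps use exactly the same value.
  name="${1:-bkp_$(date +%Y%m%d_%H%M)}"
  "$bin" create "$name" || { echo "create failed: $name" >&2; return 1; }
  "$bin" upload "$name" || { echo "upload failed: $name" >&2; return 1; }
  echo "uploaded $name"
}
```

Calling backup_and_upload with no argument uses a timestamped name; passing a name overrides it.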

To back up only specific tables, pass --tables:

sudo clickhouse-backup create --tables='analytics.*,billing.invoices' bkp_partial

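create_remote is the typical production call because it creates and uploads in one step. The local copy is kept according to backups_to_keep_local; set that option to -1 to delete the local backup after a successful upload:

sudo clickhouse-backup create_remote bkp_$(date +%Y%m%d_%H%M)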

Restore

sudo clickhouse-backup download bkp_20260527_0100
sudo clickhouse-backup restore bkp_20260527_0100

Use --schema to restore only DDL, --data to restore only parts onto an existing schema, and --rm to drop existing tables first. For a one-shot remote restore:

sudo clickhouse-backup restore_remote --rm bkp_20260527_0100

Incremental backups

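clickhouse-backup supports differential uploads: parts that already exist in a base backup are recorded as references to that backup instead of being uploaded again. The base backup must therefore stay in remote storage for as long as its differentials are needed.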

# Weekly full
sudo clickhouse-backup create_remote full_2026_w21

# Hourly differential against the most recent full in remote storage
NAME=inc_$(date +%Y%m%d_%H)
sudo clickhouse-backup create "$NAME"
sudo clickhouse-backup upload --diff-from-remote full_2026_w21 "$NAME"

This dramatically reduces transfer cost on append-only workloads. See the dedicated guide on incremental backups for retention patterns and restoration.
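Hard-coding the base name gets stale once a new full backup is taken. The sketch below picks the newest remote backup whose name starts with full_ and uses it as the base. It assumes that list remote prints one backup per line with the name in the first column, and that full backups follow the full_<year>_w<week> naming used in this guide so that a plain sort orders them; both are assumptions to verify against your version. CH_BACKUP_BIN and the function names are hypothetical.

```shell
# latest_remote_full: print the name of the newest remote full backup.
# Assumption: "list remote" prints the backup name as the first column.
latest_remote_full() {
  "${CH_BACKUP_BIN:-clickhouse-backup}" list remote \
    | awk '$1 ~ /^full_/ { print $1 }' | sort | tail -n 1
}

# incremental_against_latest_full: create and upload a differential
# backup against whatever latest_remote_full returns.
incremental_against_latest_full() {
  base=$(latest_remote_full)
  if [ -z "$base" ]; then
    echo "no remote full backup found" >&2
    return 1
  fi
  "${CH_BACKUP_BIN:-clickhouse-backup}" create_remote \
    --diff-from-remote "$base" "inc_$(date +%Y%m%d_%H)"
}
```

Failing when no full backup exists is deliberate: silently falling back to a full upload every hour would hide a broken weekly job.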

Scheduling

The simplest scheduler is cron. A typical pattern: full backup weekly, differentials hourly.

0 2 * * 0  clickhouse-backup create_remote full_$(date +\%Y_w\%V)
0 * * * 1-6 clickhouse-backup create_remote --diff-from-remote full_$(date -d 'last sunday' +\%Y_w\%V) inc_$(date +\%Y\%m\%d_\%H)

Alternatively, use clickhouse-backup watch which runs a built-in loop, or the server subcommand to expose a REST API for an orchestrator (Argo, Airflow, Kubernetes CronJob) to trigger backups on a schedule.
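For an orchestrator that talks to the REST API, a trigger can be sketched as follows. This assumes the server subcommand is running on localhost:7171 without authentication and that POST /backup/create accepts the backup name as a name query argument; check the API documentation for your version before relying on it. CH_BACKUP_API and CURL_BIN are hypothetical variables introduced for this example.

```shell
# trigger_backup_via_api: ask a running "clickhouse-backup server" to
# start a backup.
# Assumptions: API on http://localhost:7171, no auth, and a "name" query
# argument on POST /backup/create.
# CURL_BIN is a hypothetical override for dry runs (e.g. CURL_BIN=echo).
trigger_backup_via_api() {
  name="${1:-bkp_$(date +%Y%m%d_%H%M)}"
  api="${CH_BACKUP_API:-http://localhost:7171}"
  "${CURL_BIN:-curl}" --silent --show-error --fail \
    --request POST "${api}/backup/create?name=${name}"
}
```

The request is expected to return once the backup has been accepted rather than once it has finished, so an orchestrator would normally poll the API for completion before starting the upload step.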

Verifying backups

A backup that has never been restored is not a backup. At minimum:

  1. Run clickhouse-backup list remote weekly and check sizes are stable.
  2. Restore the latest backup to a staging node on a recurring schedule.
  3. Compare count() and sum(...) on a few tables against production.
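The comparison in step 3 can be scripted. The sketch below assumes clickhouse-client is installed and can reach both nodes with default credentials; the host names, table name, and filter in the usage line are placeholders, and CH_CLIENT_BIN is a hypothetical override so the check can be stubbed.

```shell
# compare_table_counts: compare count() for one table on two hosts.
# Arguments: <prod_host> <staging_host> <db.table> [where_condition]
# Returns 0 on match, 1 on mismatch, 2 if a query fails.
compare_table_counts() {
  prod_host="$1"; staging_host="$2"; table="$3"; where="${4:-}"
  client="${CH_CLIENT_BIN:-clickhouse-client}"
  # Append the WHERE clause only when a condition was given.
  query="SELECT count() FROM ${table}${where:+ WHERE ${where}}"
  prod=$("$client" --host "$prod_host" --query "$query") || return 2
  staging=$("$client" --host "$staging_host" --query "$query") || return 2
  if [ "$prod" = "$staging" ]; then
    echo "OK $table rows=$prod"
  else
    echo "MISMATCH $table prod=$prod staging=$staging"
    return 1
  fi
}
```

Production keeps receiving inserts after the backup is taken, so the optional condition should restrict the comparison to data older than the backup, for example: compare_table_counts ch-prod ch-staging analytics.events "event_date < today()"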

Common Pitfalls

  • Backing up to a disk that holds ClickHouse data. A failed disk wipes both. Always upload to remote storage.
  • Forgetting skip_tables. Leaving system.* in scope inflates backups and can fail on system.text_log rotation.
  • Permissions. The tool needs read access to /var/lib/clickhouse and write access to backup/ and shadow/. Run as the clickhouse user or via sudo.
  • Replicated tables. clickhouse-backup only backs up the local replica. Run it on one replica per shard, or coordinate via the API so you do not duplicate work.
  • use_embedded_backup_restore: true. This switches to ClickHouse's native BACKUP/RESTORE engine and changes on-disk layout. Pick one mode and stick with it.
  • ATTACH PART errors after restore. Usually caused by missing force_restore_data flag or by restoring into a node that already has a different schema for the table.

Frequently Asked Questions

Q: Does clickhouse-backup stop the database? A: No. It uses ALTER TABLE ... FREEZE, which creates hardlinks without blocking writes. Reads and inserts continue while the backup runs.

Q: Is the backup consistent across tables? A: Within a single create call, all tables are frozen sequentially. There is no cross-table transaction in ClickHouse, so point-in-time consistency across tables is approximate, on the order of seconds.

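Q: Where are local backups stored? A: Under /var/lib/clickhouse/backup/<backup_name>/. The directory contains metadata/ with DDL and shadow/ with hardlinked parts. Because the parts are hardlinks, a fresh backup takes almost no extra space; usage grows as merges replace the live parts and the backup becomes the only reference to the old ones.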

Q: Can I restore to a different ClickHouse version? A: Restoring to the same major version is supported. Restoring across major versions usually works for the data parts but may require schema adjustments. Test in staging first.

Q: Can I back up only one database? A: Yes. Use --tables='mydb.*' on create, or --tables='mydb.t1,mydb.t2' for specific tables. The same flag works on restore.

Q: How do I back up a ClickHouse cluster? A: Run clickhouse-backup on one replica per shard. Give each shard its own prefix in the remote path by including the shard identifier, for example path: "prod/{shard}", so restores can target the right node.
