The "DB::Exception: Filesystem metadata error" in ClickHouse signals that the server detected an inconsistency in filesystem-level metadata. This can involve mismatched file sizes, unexpected file states, or corruption in the metadata that ClickHouse maintains alongside data parts on disk. The error code is FS_METADATA_ERROR.
Impact
Affected tables or parts may become inaccessible. In severe cases, the server may refuse to load certain tables on startup. Write operations to the affected table will also be blocked until the metadata inconsistency is resolved.
Common Causes
- Unexpected server shutdown (power loss, OOM kill) leaving partially written metadata files.
- Disk hardware failures or filesystem corruption.
- Manual modification or deletion of files in the ClickHouse data directory.
- Running out of disk space during a write operation, resulting in truncated metadata files.
- Network filesystem (NFS, EFS) issues causing partial writes or stale caches.
- Bugs in the operating system's filesystem layer or storage driver.
Troubleshooting and Resolution Steps
Check the ClickHouse error log: The log will indicate which specific table, part, or file has the metadata inconsistency:
grep "FS_METADATA_ERROR" /var/log/clickhouse-server/clickhouse-server.err.log
Verify filesystem integrity: Run a filesystem check if possible (requires unmounting or using a read-only check):
sudo fsck -n /dev/sdX
Check disk health: Look for hardware issues using SMART data:
sudo smartctl -a /dev/sdX
Detach and reattach the affected table: This forces ClickHouse to re-read metadata from disk:
DETACH TABLE my_table;
ATTACH TABLE my_table;
Drop and recreate broken parts: If specific parts are corrupted, you can remove them:
ALTER TABLE my_table DROP PART 'part_name';
For replicated tables, the part will be fetched from another replica automatically.
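Before dropping anything, it can help to confirm which parts ClickHouse has already set aside as broken, and to nudge a replicated table to re-fetch. A minimal sketch, assuming a table named my_table and the conventional broken_ prefix for detached parts (check the system tables in your version):

```sql
-- List detached parts flagged as broken
SELECT database, table, name, reason
FROM system.detached_parts
WHERE name LIKE 'broken%';

-- On a replicated table, wait for the replica to catch up
-- after the corrupted part has been dropped
SYSTEM SYNC REPLICA my_table;
```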
Check disk space: Ensure the data volume has sufficient free space:
df -h /var/lib/clickhouse/
Restore from backup: If the metadata corruption is widespread, restoring from a recent backup may be the fastest resolution.
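On recent ClickHouse releases that include the built-in BACKUP and RESTORE statements, a minimal sketch looks like this; the disk name 'backups' and the archive name are assumptions and must match a backup destination configured in your server:

```sql
-- Assumes a disk named 'backups' is configured in the server's storage settings
BACKUP TABLE my_table TO Disk('backups', 'my_table_backup.zip');

-- Later, restore the table from that backup
RESTORE TABLE my_table FROM Disk('backups', 'my_table_backup.zip');
```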
Best Practices
- Use reliable storage with redundancy (RAID, replicated volumes) for ClickHouse data directories.
- Monitor disk health and free space with automated alerting.
- Avoid manually modifying files in the ClickHouse data directory.
- Use ReplicatedMergeTree tables so that corrupted parts can be recovered from other replicas.
- Enable filesystem journaling (ext4, xfs) to reduce the risk of metadata corruption after unexpected shutdowns.
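The free-space monitoring mentioned above can be driven directly from SQL via the system.disks table; the query below is a sketch, and any alerting threshold on free_ratio is up to you:

```sql
-- Free space per ClickHouse disk; alert when free_ratio drops too low
SELECT
    name,
    formatReadableSize(free_space)  AS free,
    formatReadableSize(total_space) AS total,
    round(free_space / total_space, 3) AS free_ratio
FROM system.disks;
```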
Frequently Asked Questions
Q: Can ClickHouse automatically recover from filesystem metadata errors?
A: For ReplicatedMergeTree tables, ClickHouse can fetch missing or corrupted parts from other replicas. For non-replicated tables, manual intervention or backup restoration is typically required.
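To see whether a replica is still fetching replacement parts, you can inspect the replication queue. This query is a sketch; column availability may vary by ClickHouse version:

```sql
-- Pending fetches after a part was dropped or lost on this replica
SELECT database, table, type, num_tries, last_exception
FROM system.replication_queue
WHERE type = 'GET_PART';
```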
Q: Should I run fsck on a live ClickHouse server?
A: Running fsck on a mounted filesystem is not recommended. Use fsck -n for a read-only check, or stop ClickHouse and unmount the volume before running a full check.
Q: Could this error be caused by using a network filesystem?
A: Yes. Network filesystems like NFS can introduce metadata inconsistencies due to caching, partial writes, or connection drops. ClickHouse generally recommends local storage for best reliability and performance.