The DB::Exception: Not found expected data part error (code NOT_FOUND_EXPECTED_DATA_PART) appears when ClickHouse discovers that a data part recorded in ZooKeeper as belonging to this replica is absent from the local filesystem. The coordination layer says the part should be here, but the disk disagrees.
Impact
The affected replica cannot serve queries that depend on the missing part, which may lead to incomplete results. Replication integrity is compromised for this table on this node. Merges that depend on the missing part will also fail, creating a backlog in the replication queue that grows over time.
Common Causes
- Disk failure or filesystem corruption that silently removed or damaged the part directory.
- Manual deletion of data files from the ClickHouse data directory.
- Out-of-space condition that caused a partial write, followed by ClickHouse cleaning up the incomplete part.
- Filesystem snapshot restore that rolled back the disk to a state older than ZooKeeper's metadata.
- Bug during a merge or mutation that deleted the source part before recording the new part.
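For the out-of-space cause in particular, free space can also be checked from inside ClickHouse. The query below is a minimal sketch against the system.disks table. It reports only the current state of the disks ClickHouse is configured with, so a disk that filled up earlier and was later cleaned will look healthy here.

```sql
-- Free and total space for every disk configured in ClickHouse,
-- with the fullest disks listed first.
SELECT
    name,
    path,
    formatReadableSize(free_space)  AS free,
    formatReadableSize(total_space) AS total,
    round(100 * free_space / total_space, 1) AS free_pct
FROM system.disks
ORDER BY free_pct ASC;
```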
Troubleshooting and Resolution Steps
1. Identify which parts are missing
Check the ClickHouse server log for the specific part names. You can also query the replication queue:

    SELECT new_part_name, last_exception
    FROM system.replication_queue
    WHERE last_exception LIKE '%Not found expected%';

2. Check if the part was detached

    SELECT * FROM system.detached_parts WHERE table = 'my_table' AND name = 'expected_part_name';

If found, re-attach it:

    ALTER TABLE db.my_table ATTACH PART 'expected_part_name';

3. Verify disk health

    dmesg | grep -i error
    smartctl -a /dev/sda
    df -h /var/lib/clickhouse

Address any hardware or filesystem issues before proceeding.

4. Fetch the part from another replica
Restart replication to trigger a fresh fetch:

    SYSTEM RESTART REPLICA db.my_table;
    SYSTEM SYNC REPLICA db.my_table;

ClickHouse will detect that the part is missing locally and attempt to download it from a peer.

5. Restore from backup
If no other replica has the part either, restore the affected partition from a backup, replacing <backup_name> with the name of the backup to restore from:

    clickhouse-backup restore --table db.my_table --partitions 'YYYYMM' <backup_name>

6. Drop and recreate the table as a last resort
If the table is beyond repair on this node:

    DROP TABLE db.my_table SYNC;
    -- Recreate with the same engine definition
    CREATE TABLE db.my_table (...) ENGINE = ReplicatedMergeTree(...) ...;

The new replica will clone all parts from healthy peers.
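Once a fetch, restore, or re-creation has been started, the replica's progress can be followed in system.replicas. The query below is a sketch that assumes the placeholder names db and my_table used in the steps; what counts as a healthy value depends on the workload.

```sql
-- Replication state of the affected table on this node.
-- 'db' and 'my_table' are the placeholder names from the steps.
SELECT
    database,
    table,
    is_readonly,
    queue_size,
    parts_to_check,
    absolute_delay,
    active_replicas,
    total_replicas
FROM system.replicas
WHERE database = 'db' AND table = 'my_table';
```

A queue_size that shrinks toward zero together with is_readonly = 0 suggests the replica is catching up. A parts_to_check value that stays above zero means ClickHouse still has parts queued for verification.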
Best Practices
- Use RAID or replicated storage to protect against single-disk failures.
- Monitor disk health with SMART tools and filesystem integrity checks.
- Never manually delete files from the ClickHouse data directory; use SQL commands instead.
- Maintain regular backups and validate them by performing test restores.
- Set up monitoring on system.replication_queue to catch missing-part errors quickly.
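One possible shape for such a monitor is a scheduled query that summarizes failing queue entries per table. This is a sketch only: it reuses the LIKE pattern from the troubleshooting steps, and the alert threshold and scheduling mechanism are left open.

```sql
-- Replication queue entries currently failing with the missing-part error,
-- grouped per table so an alert can name the affected table.
SELECT
    database,
    table,
    count()          AS failing_entries,
    max(num_tries)   AS max_tries,
    min(create_time) AS oldest_entry
FROM system.replication_queue
WHERE last_exception LIKE '%Not found expected%'
GROUP BY database, table
ORDER BY failing_entries DESC;
```

An empty result is the healthy case. Any returned row, especially one with a growing max_tries, is a reasonable trigger for investigation.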
Frequently Asked Questions
Q: Can ClickHouse recover the missing part automatically?
A: In many cases, yes. If another replica has the part, ClickHouse will fetch it once replication detects the gap. Running SYSTEM RESTART REPLICA can accelerate this process.
Q: Does this error indicate data loss?
A: On the affected replica, yes -- the local copy is gone. However, if other replicas have the part, the data can be recovered through replication.
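To see whether other replicas still hold the part, system.parts can be queried across the cluster. The sketch below assumes a cluster named my_cluster (a hypothetical name; use the one defined in your remote_servers configuration) and the placeholder table and part names used earlier.

```sql
-- Which hosts have the part on disk?
-- 'my_cluster' is a hypothetical cluster name; 'db', 'my_table' and
-- 'expected_part_name' are the placeholders used in the steps.
SELECT
    hostName() AS host,
    name,
    active,
    rows
FROM clusterAllReplicas('my_cluster', system.parts)
WHERE database = 'db'
  AND table = 'my_table'
  AND name = 'expected_part_name';
```

An empty result does not prove the data is gone: on a healthy replica the part may already have been merged into a larger part with a different name, so checking the same partition for active parts is a reasonable fallback.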
Q: Should I be concerned about the disk if this error appears once?
A: A single occurrence could be a fluke (e.g., power loss during a write). Repeated occurrences strongly suggest a hardware or filesystem problem that needs investigation.
Q: How is this different from NO_SUCH_DATA_PART?
A: NOT_FOUND_EXPECTED_DATA_PART specifically means the part is expected according to ZooKeeper but missing from disk. NO_SUCH_DATA_PART is a more general error for referencing a part that does not exist in the table's metadata.