The "DB::Exception: Cannot fstat file" error in ClickHouse indicates that the server was unable to retrieve metadata (size, permissions, modification time) for a file using the fstat system call. This CANNOT_FSTAT error typically means the file descriptor is invalid or the underlying storage has become unreachable. ClickHouse uses fstat to verify file sizes and properties during reads and writes, so a failure here disrupts normal data operations.
Impact
When fstat fails, expect the following consequences:
- The operation referencing the file will be aborted
- Queries or merges that depend on knowing the file's size will fail
- Data integrity checks cannot verify file properties
- In rare cases, this may indicate broader storage problems affecting multiple files
Common Causes
- The file was deleted while ClickHouse held an open file descriptor to it
- Storage device failure making the file's metadata inaccessible
- The file descriptor became invalid due to an internal error or race condition
- Network filesystem (NFS, CIFS) disconnected, making remote files unreachable
- Corrupted filesystem inode that cannot return file attributes
- Running out of kernel memory for inode caching in extreme load scenarios
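Several of these causes surface as distinct errno values from the fstat call. The sketch below assumes the logged message carries an "errno: N" suffix (an assumption about the log format, worth checking against your own logs) and uses Linux errno numbers. Both function names are hypothetical helpers, not part of ClickHouse.

```shell
# Pull the numeric errno out of log lines read from stdin.
# Assumes the message contains "errno: N"; prints nothing if it does not.
extract_errno() {
    sed -n 's/.*errno: \([0-9][0-9]*\).*/\1/p'
}

# Map a Linux errno number to the most likely cause from the list above.
explain_fstat_errno() {
    case "$1" in
        5)   echo "EIO: I/O error, check the storage device and dmesg" ;;
        9)   echo "EBADF: invalid or already closed file descriptor" ;;
        12)  echo "ENOMEM: kernel out of memory" ;;
        116) echo "ESTALE: stale file handle, typical of NFS" ;;
        *)   echo "errno $1: not covered here, see errno(3)" ;;
    esac
}

# Usage (log path as used elsewhere in this article):
# grep "Cannot fstat" /var/log/clickhouse-server/clickhouse-server.err.log \
#     | tail -1 | extract_errno
```

The mapping is deliberately small; anything outside these four values is better looked up directly than guessed at.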
Troubleshooting and Resolution Steps
Check the error log for the affected file:
grep "Cannot fstat" /var/log/clickhouse-server/clickhouse-server.err.log | tail -5
Verify the file still exists on disk:
ls -la /path/to/affected/file
stat /path/to/affected/file
If the file is missing, an external process may have deleted it.
Check filesystem and storage health:
dmesg | grep -i "error\|i/o\|fault"
df -h /var/lib/clickhouse
For network storage, verify the mount is active:
mount | grep /var/lib/clickhouse
stat /var/lib/clickhouse/data
A stale NFS mount may hang on stat calls. Consider unmounting and remounting.
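Because a stat against a stale mount can block the terminal, it may help to bound the probe with a time limit. The following is a minimal sketch that assumes the timeout utility from GNU coreutils is installed; probe_path is a hypothetical helper name and the 5-second default is an arbitrary choice.

```shell
# Run stat against a path with a time limit and report the outcome.
# Prints "ok", "timeout", or "error".
probe_path() {
    local path=$1 limit=${2:-5} rc
    # -s KILL sends SIGKILL when the limit expires; timeout then exits with
    # 137 (128+9), or 124 if a catchable signal was enough.
    timeout -s KILL "$limit" stat "$path" >/dev/null 2>&1 && rc=0 || rc=$?
    case "$rc" in
        0)       echo ok ;;
        124|137) echo timeout ;;
        *)       echo error ;;
    esac
}

# Usage:
# probe_path /var/lib/clickhouse/data 10
```

On a hard-mounted NFS share the stat process can sit in uninterruptible sleep, in which case even SIGKILL may not end it until the server responds, so treat a "timeout" result as a hint rather than a guarantee that the probe returned promptly.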
Check open file descriptors:
ls -la /proc/$(pidof clickhouse-server)/fd/ | head -20
cat /proc/$(pidof clickhouse-server)/limits | grep "open files"
Look for broken symlinks in the fd directory (entries whose target is marked "(deleted)"), which indicate deleted files still held open.
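Since head -20 only shows the first few descriptors, a loop over the whole fd directory can be more thorough. This sketch relies on the Linux /proc convention of appending " (deleted)" to the link target of a removed file; list_deleted_fds is a hypothetical helper name.

```shell
# Print every descriptor of a process whose target file has been deleted.
# Output format: "<fd number> -> <original path> (deleted)"
list_deleted_fds() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # Descriptors can close while the loop runs; skip unreadable entries.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") printf '%s -> %s\n' "${fd##*/}" "$target" ;;
        esac
    done
}

# Usage (reading another user's process normally requires root):
# list_deleted_fds "$(pidof clickhouse-server)"
```

If pidof returns more than one PID, call the function once per PID.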
Restart ClickHouse to release any stale file descriptors and allow the server to reopen files cleanly.
Best Practices
- Do not delete files from the ClickHouse data directory while the server is running
- Use reliable local storage to avoid filesystem metadata issues common with network mounts
- Monitor storage device health and inode availability
- Keep ClickHouse updated to benefit from bug fixes in file handling code
- Use replicated tables for automatic recovery when parts become inaccessible
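For the inode monitoring point above, one simple starting place is the inode usage column of df. This is a minimal sketch assuming a df that supports the -P and -i options together (as GNU coreutils does); inode_usage_percent is a hypothetical helper name.

```shell
# Print the inode usage percentage (without the % sign) for the filesystem
# that holds the given path. Some filesystems report "-" instead of a number.
inode_usage_percent() {
    df -Pi "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage:
# inode_usage_percent /var/lib/clickhouse
```

The value can be fed into whatever alerting is already in place; the threshold to alert on is a local decision.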
Frequently Asked Questions
Q: What information does fstat provide?
A: The fstat system call returns file metadata including file size, ownership, permissions, timestamps, and device information. ClickHouse uses this data to verify file sizes during reads, check part integrity, and manage data files.
Q: Can a deleted file cause this error even if ClickHouse has it open?
A: On Linux, a deleted file remains accessible through an open file descriptor until the descriptor is closed. However, some operations on the descriptor can still fail depending on the filesystem; on NFS, for example, a file removed by another client can leave the descriptor with a stale file handle. In practice, deleting a file while ClickHouse uses it is risky and can trigger this error.
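The local-filesystem behaviour is easy to observe with a throwaway file. The sketch below assumes Linux with /proc mounted and a stat that accepts -L and -c (GNU coreutils or BusyBox); deleted_fd_demo is a hypothetical name, and descriptor 9 is an arbitrary choice.

```shell
# Open a temporary file, delete it, then show that both its metadata and
# its contents are still reachable through the open descriptor.
deleted_fd_demo() {
    local tmp size content
    tmp=$(mktemp) || return 1
    printf 'hello\n' > "$tmp"
    exec 9< "$tmp"          # keep the file open on descriptor 9
    rm -f "$tmp"            # remove the directory entry
    # stat through the /proc link of the inherited descriptor
    size=$(stat -L -c %s /proc/self/fd/9 2>/dev/null)
    content=$(cat <&9)
    exec 9<&-               # closing the descriptor finally frees the file
    printf 'size=%s content=%s\n' "$size" "$content"
}

# Usage:
# deleted_fd_demo
```

This only demonstrates a local filesystem; it says nothing about how a network filesystem behaves when the file is removed on the server side.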
Q: Is this error related to CANNOT_OPEN_FILE?
A: They are different errors. CANNOT_OPEN_FILE occurs when a file cannot be opened at all, while CANNOT_FSTAT occurs after the file is already open but its metadata cannot be read. The underlying causes may overlap (e.g., storage failure), but they represent different points of failure.
Q: How urgent is this error?
A: It warrants prompt investigation. A single occurrence might be a transient issue, but repeated fstat failures suggest a storage problem that could escalate to data loss if not addressed.
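To tell a one-off from a recurring problem, it can help to count occurrences per hour. This sketch assumes each log line begins with a timestamp of the form YYYY.MM.DD HH:MM:SS (an assumption about the log layout, so adjust the substring length if yours differs); fstat_errors_per_hour is a hypothetical helper name.

```shell
# Read log lines on stdin and print "<date> <hour> <count>" for every hour
# that contains at least one "Cannot fstat" line.
fstat_errors_per_hour() {
    awk 'index($0, "Cannot fstat") { count[substr($0, 1, 13)]++ }
         END { for (hour in count) print hour, count[hour] }' | sort
}

# Usage:
# fstat_errors_per_hour < /var/log/clickhouse-server/clickhouse-server.err.log
```

If a single failure is logged across several lines that each repeat the message, the counts will be inflated, so compare hours against each other rather than reading the numbers as exact totals.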