The "DB::Exception: Cannot write to file" error in ClickHouse is a general-purpose write failure that surfaces when the server is unable to write data to a file on disk. The CANNOT_WRITE_TO_FILE error code encompasses a broad range of scenarios, from full disks to hardware failures. Since nearly every ClickHouse operation involves writing files at some point -- inserts write data parts, merges produce new parts, and queries write temporary results -- this error can appear in many different contexts.
Impact
A file write failure has wide-reaching consequences:
- Active inserts are aborted, stopping data ingestion
- Background merges cannot produce output parts, leading to part accumulation
- Queries that spill to disk for sorting or joining will fail
- Mutations cannot write their results, leaving the ALTER operation incomplete
- If the issue is system-wide (e.g., full disk), all write operations across all tables are affected
Common Causes
- Disk space exhaustion -- the most frequent cause
- Inode exhaustion on the filesystem
- The ClickHouse process lacks write permissions on the target file or directory
- Filesystem remounted as read-only after detecting errors
- Underlying storage device failure or disconnection
- Disk quota exceeded for the ClickHouse user
- File descriptor limit reached
- SELinux or AppArmor denying write access
Troubleshooting and Resolution Steps
Check disk space first (most common cause):
```bash
df -h /var/lib/clickhouse
df -i /var/lib/clickhouse
```

If the disk is full, free up space immediately:
```sql
-- Drop old partitions
ALTER TABLE your_db.your_table DROP PARTITION 'old_partition';

-- Or truncate tables you no longer need
TRUNCATE TABLE your_db.temp_table;
```

Verify filesystem state:
```bash
mount | grep $(df /var/lib/clickhouse --output=source | tail -1)
```

Look for `ro` in the mount options, which indicates the filesystem was remounted read-only. Fix the underlying filesystem error and remount read-write.

Check permissions:
```bash
sudo -u clickhouse touch /var/lib/clickhouse/write_test && rm /var/lib/clickhouse/write_test
```

Fix ownership if the test fails:
```bash
sudo chown -R clickhouse:clickhouse /var/lib/clickhouse
```

Look at system-level I/O errors:
```bash
dmesg | grep -i "error\|i/o\|scsi"
```

Review file descriptor usage:
```bash
cat /proc/$(pidof clickhouse-server)/limits | grep "open files"
ls /proc/$(pidof clickhouse-server)/fd | wc -l
```

Check for disk quotas:
```bash
repquota /var/lib/clickhouse 2>/dev/null
quota -u clickhouse 2>/dev/null
```

Inspect security policies:
```bash
sudo ausearch -m avc -ts recent
```

Once the root cause is resolved, restart ClickHouse to resume normal operation. Pending merges and mutations will be rescheduled automatically.
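The individual checks above can be combined into a quick triage script. This is a minimal sketch: the data path defaults to the standard ClickHouse location, and the fallback to the current directory is only there so the script runs anywhere.

```shell
#!/bin/sh
# Quick triage for CANNOT_WRITE_TO_FILE. DATA_DIR defaults to the standard
# ClickHouse data path; adjust it for your deployment.
DATA_DIR="${DATA_DIR:-/var/lib/clickhouse}"
[ -d "$DATA_DIR" ] || DATA_DIR=.  # fall back so the script runs anywhere

# Disk space and inode usage on the data volume
disk_pct=$(df -P "$DATA_DIR" | awk 'NR==2 {gsub("%","",$5); print $5}')
inode_pct=$(df -Pi "$DATA_DIR" | awk 'NR==2 {gsub("%","",$5); print $5}')
echo "disk used: ${disk_pct}%  inodes used: ${inode_pct:-?}%"

# Can we actually create a file on the volume?
if touch "$DATA_DIR/.write_test" 2>/dev/null; then
    rm -f "$DATA_DIR/.write_test"
    echo "write test: OK"
else
    echo "write test: FAILED (check permissions, read-only mount, quota)"
fi
```

If the write test fails while disk and inode usage look healthy, focus on permissions, mount state, and quotas rather than capacity.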
Best Practices
- Set up disk space alerting at 75-80% capacity with a critical alert at 90%
- Use ClickHouse's `min_free_disk_space` setting to halt writes before the disk is completely full
- Separate the ClickHouse data volume from the OS volume to prevent system services from competing for space
- Implement data retention policies using TTL to automatically remove old data
- Monitor inode usage alongside disk space, especially for tables with many small parts
- Run ClickHouse on ext4 or xfs with default mount options for reliable write behavior
- Keep multiple replicas so that a write failure on one node does not block data ingestion entirely
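As a sketch of the alerting thresholds suggested above (warning at 75-80%, critical at 90%), a minimal cron-friendly check might look like this. The data path is an assumption, and the final `echo` is a placeholder for your actual alerting hook.

```shell
#!/bin/sh
# Minimal disk-capacity alert for the ClickHouse data volume. Thresholds
# mirror the guidance above; replace the echo with your alerting hook.
DATA_DIR="${DATA_DIR:-/var/lib/clickhouse}"
[ -d "$DATA_DIR" ] || DATA_DIR=.  # fall back so the script runs anywhere

pct=$(df -P "$DATA_DIR" | awk 'NR==2 {gsub("%","",$5); print $5}')

if [ "$pct" -ge 90 ]; then
    level=CRITICAL
elif [ "$pct" -ge 75 ]; then
    level=WARNING
else
    level=OK
fi
echo "$level: $DATA_DIR at ${pct}% used"
```

Run it from cron every few minutes; paging on CRITICAL while only ticketing on WARNING keeps the 90% threshold actionable.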
Frequently Asked Questions
Q: My disk was full but I freed space. Do I need to restart ClickHouse?
A: Not always. ClickHouse will retry background operations like merges automatically. However, if queries or inserts continue to fail, a restart ensures a clean state. Check that inserts succeed before relying on automatic recovery alone.
Q: How can I prevent the disk from filling up in the first place?
A: Use TTL rules on tables to expire old data, implement partition-based retention, monitor disk usage proactively, and configure `min_free_disk_space` in ClickHouse storage policies to reserve space for merges.
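One way to script the partition-based retention mentioned above is to compute the partition that falls outside the window and drop it on a schedule. This is an illustrative sketch: the database and table names, the monthly `toYYYYMM` partitioning scheme, and the 6-month window are all assumptions, not taken from this article.

```shell
#!/bin/sh
# Sketch: drop the monthly partition that ages out of a 6-month retention
# window, assuming the table is partitioned by toYYYYMM(date).
# Table and database names are illustrative.
RETAIN_MONTHS=6
cutoff=$(date -d "${RETAIN_MONTHS} months ago" +%Y%m)  # GNU date syntax

sql="ALTER TABLE your_db.your_table DROP PARTITION '${cutoff}'"
echo "$sql"
# To actually run it against a live server:
# clickhouse-client --query "$sql"
```

Dropping whole partitions is far cheaper than row-level deletes, which is why partition-based retention pairs well with a monthly partition key.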
Q: Is this error the same as CANNOT_WRITE_TO_FILE_DESCRIPTOR?
A: They are related but distinct. CANNOT_WRITE_TO_FILE is a higher-level error tied to ClickHouse's file I/O abstraction, while CANNOT_WRITE_TO_FILE_DESCRIPTOR is a lower-level error from the raw write() system call. The troubleshooting steps overlap significantly.
Q: Can network filesystems cause intermittent write failures?
A: Absolutely. NFS and other network filesystems are susceptible to temporary connectivity issues that manifest as write errors. For production ClickHouse deployments, local storage is strongly recommended.