
ClickHouse DB::Exception: Cannot write to file

The "DB::Exception: Cannot write to file" error in ClickHouse is a general-purpose write failure that surfaces when the server is unable to write data to a file on disk. The CANNOT_WRITE_TO_FILE error code encompasses a broad range of scenarios, from full disks to hardware failures. Since nearly every ClickHouse operation involves writing files at some point -- inserts write data parts, merges produce new parts, and queries write temporary results -- this error can appear in many different contexts.
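To confirm the server is actually hitting this error, and how often, you can query the system.errors table (column names as in recent ClickHouse versions; the value counter resets on server restart):

    SELECT name, code, value, last_error_time, last_error_message
    FROM system.errors
    WHERE name = 'CANNOT_WRITE_TO_FILE';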

Impact

A file write failure has wide-reaching consequences:

  • Active inserts are aborted, stopping data ingestion
  • Background merges cannot produce output parts, leading to part accumulation
  • Queries that spill to disk for sorting or joining will fail
  • Mutations cannot write their results, leaving the ALTER operation incomplete
  • If the issue is system-wide (e.g., full disk), all write operations across all tables are affected

Common Causes

  1. Disk space exhaustion -- the most frequent cause
  2. Inode exhaustion on the filesystem
  3. The ClickHouse process lacks write permissions on the target file or directory
  4. Filesystem remounted as read-only after detecting errors
  5. Underlying storage device failure or disconnection
  6. Disk quota exceeded for the ClickHouse user
  7. File descriptor limit reached
  8. SELinux or AppArmor denying write access

Troubleshooting and Resolution Steps

  1. Check disk space first (most common cause):

    df -h /var/lib/clickhouse
    df -i /var/lib/clickhouse
    

    If the disk is full, free up space immediately:

    -- Drop old partitions
    ALTER TABLE your_db.your_table DROP PARTITION 'old_partition';
    -- Or truncate tables you no longer need
    TRUNCATE TABLE your_db.temp_table;
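    To decide what is safe to drop, it helps to see which tables consume the most disk. One way is to aggregate the system.parts table:

    -- Largest tables by on-disk size (active parts only)
    SELECT database, table,
           formatReadableSize(sum(bytes_on_disk)) AS size
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY sum(bytes_on_disk) DESC
    LIMIT 10;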
    
  2. Verify filesystem state:

    mount | grep $(df /var/lib/clickhouse --output=source | tail -1)
    

    Look for ro in the mount options, which indicates the kernel remounted the filesystem read-only after detecting an error. Fix the underlying issue and remount read-write.

  3. Check permissions:

    sudo -u clickhouse touch /var/lib/clickhouse/write_test && rm /var/lib/clickhouse/write_test
    

    Fix ownership if the test fails:

    sudo chown -R clickhouse:clickhouse /var/lib/clickhouse
    
  4. Look at system-level I/O errors:

    dmesg | grep -i "error\|i/o\|scsi"
    
  5. Review file descriptor usage:

    cat /proc/$(pidof clickhouse-server)/limits | grep "open files"
    ls /proc/$(pidof clickhouse-server)/fd | wc -l
    
  6. Check for disk quotas:

    repquota /var/lib/clickhouse 2>/dev/null
    quota -u clickhouse 2>/dev/null
    
  7. Inspect security policies:

    sudo ausearch -m avc -ts recent
    
  8. Once the root cause is resolved, restart ClickHouse to resume normal operations. Pending merges and mutations will be rescheduled automatically.
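    After the restart, you can verify that background operations are being scheduled again by checking the relevant system tables:

    -- Active merges and any mutations still waiting to complete
    SELECT count() FROM system.merges;
    SELECT count() FROM system.mutations WHERE NOT is_done;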

Best Practices

  • Set up disk space alerting at 75-80% capacity with a critical alert at 90%
  • Use ClickHouse's min_free_disk_space setting to halt writes before the disk is completely full
  • Separate the ClickHouse data volume from the OS volume to prevent system services from competing for space
  • Implement data retention policies using TTL to automatically remove old data
  • Monitor inode usage alongside disk space, especially for tables with many small parts
  • Run ClickHouse on ext4 or xfs with default mount options for reliable write behavior
  • Keep multiple replicas so that a write failure on one node does not block data ingestion entirely
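Several of these practices can be monitored from inside ClickHouse itself. For example, disk capacity as the server sees it is exposed in the system.disks table, which is a convenient source for alerting:

    SELECT name, path,
           formatReadableSize(free_space)  AS free,
           formatReadableSize(total_space) AS total,
           round(100.0 * (1 - free_space / total_space), 1) AS pct_used
    FROM system.disks;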

Frequently Asked Questions

Q: My disk was full but I freed space. Do I need to restart ClickHouse?
A: Not always. ClickHouse will retry background operations like merges automatically. However, if queries or inserts continue to fail, a restart ensures a clean state. Check that inserts succeed before relying on automatic recovery alone.

Q: How can I prevent the disk from filling up in the first place?
A: Use TTL rules on tables to expire old data, implement partition-based retention, monitor disk usage proactively, and configure min_free_disk_space in ClickHouse storage policies to reserve space for merges.
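As a sketch of a TTL-based retention rule -- the table and column names here (your_db.events, event_date) are placeholders to adapt to your schema:

    -- Expire rows 90 days after event_date
    ALTER TABLE your_db.events
        MODIFY TTL event_date + INTERVAL 90 DAY;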

Q: Is this error the same as CANNOT_WRITE_TO_FILE_DESCRIPTOR?
A: They are related but distinct. CANNOT_WRITE_TO_FILE is a higher-level error tied to ClickHouse's file I/O abstraction, while CANNOT_WRITE_TO_FILE_DESCRIPTOR is a lower-level error from the raw write() system call. The troubleshooting steps overlap significantly.
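If you are unsure which of the two your server is hitting, both counters can be compared side by side in system.errors:

    SELECT name, code, value, last_error_time
    FROM system.errors
    WHERE name IN ('CANNOT_WRITE_TO_FILE', 'CANNOT_WRITE_TO_FILE_DESCRIPTOR');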

Q: Can network filesystems cause intermittent write failures?
A: Absolutely. NFS and other network filesystems are susceptible to temporary connectivity issues that manifest as write errors. For production ClickHouse deployments, local storage is strongly recommended.
