ClickHouse DB::Exception: Too many open files

The "Too many open files" error in ClickHouse indicates that the server process has exhausted its available file descriptors. This system-level error (EMFILE) is raised when ClickHouse attempts to open a new file, socket, or pipe but has already reached the maximum number of open file descriptors allowed by the operating system.

Impact

When file descriptors are exhausted, ClickHouse cannot open new files for data parts, create new network connections, or perform any I/O operation requiring a new file descriptor. This can cause queries to fail, inserts to be rejected, replication to stall, and in severe cases, the server to become unresponsive. The issue affects all operations on the server, not just the query that triggered the error.

Common Causes

  1. The system file descriptor limit (ulimit -n) is set too low for the workload
  2. A large number of tables or data parts, each requiring open file descriptors for their files
  3. Many concurrent connections consuming file descriptors for sockets
  4. MergeTree tables with many small parts that have not been merged
  5. Excessive number of concurrent queries, each opening multiple data files
  6. File descriptor leak due to a bug (rare, but possible in older versions)
  7. System-wide file descriptor limit (/proc/sys/fs/file-max) is too low

Troubleshooting and Resolution Steps

  1. Check the current file descriptor limits for the ClickHouse process:

    cat /proc/$(pidof clickhouse-server)/limits | grep 'open files'
    
  2. Check how many file descriptors ClickHouse is currently using:

    ls /proc/$(pidof clickhouse-server)/fd | wc -l
    
  3. Check the system-wide file descriptor limit:

    cat /proc/sys/fs/file-max
    
  4. Increase the file descriptor limit for the ClickHouse user. Edit /etc/security/limits.d/clickhouse.conf:

    clickhouse soft nofile 1048576
    clickhouse hard nofile 1048576
    
  5. If using systemd, set the limit in the ClickHouse service file:

    sudo systemctl edit clickhouse-server
    

    Add:

    [Service]
    LimitNOFILE=1048576
    

    Then restart:

    sudo systemctl daemon-reload
    sudo systemctl restart clickhouse-server
    
  6. Increase the system-wide limit if needed:

    sudo sysctl -w fs.file-max=2097152
    # Make permanent in /etc/sysctl.conf:
    echo 'fs.file-max = 2097152' | sudo tee -a /etc/sysctl.conf
    
  7. Reduce the number of open files by merging small parts:

    -- Check tables with many parts
    SELECT database, table, count() AS part_count
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY part_count DESC
    LIMIT 20;
    
    -- Force merge to reduce part count
    OPTIMIZE TABLE your_database.your_table FINAL;
    
  8. Check for excessive connections:

    SELECT count() FROM system.processes;
    SELECT metric, value FROM system.metrics WHERE metric LIKE '%Connection%';
    

Best Practices

  • Set the file descriptor limit to at least 1,000,000 for production ClickHouse servers.
  • Monitor the number of open file descriptors relative to the limit and alert when usage exceeds 80%.
  • Regularly merge small parts by tuning MergeTree settings (parts_to_delay_insert, parts_to_throw_insert) to prevent part accumulation.
  • Use connection pooling in client applications to limit the number of simultaneous connections.
  • Drop unused tables and databases to reduce the baseline file descriptor usage.
  • Monitor system.metrics for the OpenFileForRead and OpenFileForWrite metrics.
  • Set file descriptor limits in the systemd service file rather than relying on /etc/security/limits.conf, as systemd may not read PAM limits.

Frequently Asked Questions

Q: What is a good file descriptor limit for ClickHouse?
A: For production servers, set it to at least 1,000,000 (1M). The default Linux limit of 1024 is far too low for ClickHouse. Large installations with many tables may need even higher limits. File descriptors are cheap; setting a high limit does not consume memory.

Q: How do I check the current usage without root access?
A: From within ClickHouse, you can check metrics:

SELECT metric, value FROM system.metrics WHERE metric LIKE '%File%' OR metric LIKE '%Connection%';
SELECT value FROM system.asynchronous_metrics WHERE metric = 'MaxPartCountForPartition';

Q: Will increasing the limit fix the problem permanently?
A: It addresses the immediate issue, but if your file descriptor usage is growing unboundedly (due to part accumulation or connection leaks), you should also address the root cause. Monitor usage over time to ensure it stabilizes after increasing the limit.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.