The "DB::Exception: TCP connection limit reached" error in ClickHouse occurs when the server has reached its configured maximum number of simultaneous TCP connections on the native protocol port. The error code is TCP_CONNECTION_LIMIT_REACHED. Once this limit is hit, new connection attempts on the native port are rejected until existing connections are closed.
Impact
New clients attempting to connect via the native protocol will be refused, causing query failures for applications, monitoring tools, and interactive clients. Existing connections that are already established continue to function. This can lead to partial outages where some application instances work while others cannot connect.
Common Causes
- The
max_connectionssetting in the server configuration is too low for the workload. - Connection leaks in applications that open connections but fail to close them properly.
- A spike in concurrent client connections, such as during peak traffic or batch processing windows.
- Idle connections accumulating from applications that maintain persistent connections without a timeout.
- Distributed queries in a cluster causing nodes to open many inter-node connections simultaneously.
Troubleshooting and Resolution Steps
Check the current number of active TCP connections:
SELECT * FROM system.metrics WHERE metric = 'TCPConnection';Review the configured connection limit in
config.xml:<max_connections>4096</max_connections>If the limit is too low, increase it to match your workload requirements:
<max_connections>8192</max_connections>Identify which clients are holding the most connections:
SELECT client_name, count() AS connection_count FROM system.processes GROUP BY client_name ORDER BY connection_count DESC;Check for connection leaks in your applications. Ensure that every connection is properly closed after use, and that connection pools have appropriate maximum sizes and idle timeouts.
Set a server-side idle connection timeout to reclaim connections from unresponsive clients:
<keep_alive_timeout>60</keep_alive_timeout>Verify that the operating system's file descriptor limit is high enough to support the configured
max_connectionsvalue:ulimit -n
Best Practices
- Size the
max_connectionssetting based on your expected peak concurrency plus a reasonable buffer. - Use connection pooling in all applications that connect to ClickHouse, with sensible pool size limits.
- Set idle connection timeouts both on the server and in client connection pools to prevent stale connections from consuming slots.
- Monitor the
TCPConnectionmetric and set alerts that trigger before the limit is reached. - Ensure the OS-level file descriptor limits are high enough to accommodate the configured connection maximum.
Frequently Asked Questions
Q: What is the default value for max_connections?
A: The default value is 4096. Depending on your workload and hardware, you may need to increase this.
Q: Does this limit affect HTTP connections as well?
A: No. TCP and HTTP connections have separate limits. The max_connections setting controls the native TCP protocol. For HTTP connections, see the HTTP_CONNECTION_LIMIT_REACHED error and the corresponding max_concurrent_queries or HTTP-specific settings.
Q: How can I monitor connection usage over time?
A: Query the system.metrics table for TCPConnection or use the system.asynchronous_metrics and system.events tables. Feeding these into a monitoring system like Grafana provides historical visibility.
Q: Will increasing max_connections affect memory usage?
A: Each connection consumes a small amount of memory for its state. Thousands of idle connections have a modest overhead, but if many connections are actively executing queries, memory usage can grow substantially. Balance the connection limit with available system resources.