Code: 202. DB::Exception: Too many simultaneous queries. Maximum: 100. This ClickHouse error is raised when the count of concurrently executing queries on the server reaches max_concurrent_queries (default 100) and a new query arrives. The cluster is not out of resources - this is a deliberate admission-control limit. The result is that new query attempts are rejected with code 202 (TOO_MANY_SIMULTANEOUS_QUERIES) until existing queries finish.
What This Error Means
ClickHouse enforces two separate concurrency limits and they are easy to confuse:
max_concurrent_queries(default 100) caps the number of queries actively executing at any moment. When exceeded, the server returns code 202.max_connections(default 4096) caps the number of open TCP connections to the server. Excess connections are dropped at accept time with a separate error - the well-known "Too many simultaneous connections" phrasing maps to this limit.
Most "too many simultaneous" errors in modern ClickHouse versions are the query variant (code 202), even when users colloquially call them "connection" errors. The signal-distinguishing line is the Maximum: 100 (queries) versus Maximum: 4096 (connections) in the error text. ClickHouse can hold thousands of idle connections while only running 100 queries at once - clients sit in a queue, not in an error state, until a query slot frees up.
Common Causes
- Burst traffic from a service without connection pooling, where each request opens a fresh session and runs a query. Confirm with
SELECT count() FROM system.processesduring a burst - it will sit at 100. - Long-running queries holding slots open. Confirm with
SELECT query_id, elapsed, query FROM system.processes ORDER BY elapsed DESC- the long tail (multi-minute queries) is blocking shorter ones. - A misbehaving client that opens connections faster than it closes them, eventually saturating
max_connectionsand producing the literal "Too many simultaneous connections" wording. Confirm withnetstat -an | grep :9000 | grep ESTABLISHED | wc -l. - Sub-queries on a
Distributedtable fanning out one query per shard, multiplying the per-node query count. Each shard counts independently. - A BI tool with auto-refresh dashboards firing many parallel queries from many users. Confirm by grouping
system.query_logbyinitial_userover a 60-second window.
How to Fix Too Many Simultaneous Queries
Check current concurrency:
-- Active queries SELECT count() FROM system.processes; -- Connection-level metrics SELECT metric, value FROM system.metrics WHERE metric IN ('Query', 'TCPConnection', 'HTTPConnection', 'InterserverConnection');Identify long runners holding slots:
SELECT query_id, user, elapsed, memory_usage, query FROM system.processes ORDER BY elapsed DESC LIMIT 10;Kill blocking queries if they are misbehaving:
KILL QUERY WHERE query_id = '<id>';Raise
max_concurrent_queriesif the hardware can support more parallel work. Editconfig.xml:<max_concurrent_queries>200</max_concurrent_queries> <max_concurrent_select_queries>150</max_concurrent_select_queries> <max_concurrent_insert_queries>100</max_concurrent_insert_queries>Apply with
SYSTEM RELOAD CONFIG. Each running query consumesmax_threadscores worth of CPU, so raising this without raising core count just causes context switching.Set per-user concurrency limits via a settings profile so one tenant cannot saturate the slot pool:
CREATE SETTINGS PROFILE tenant_a SETTINGS max_concurrent_queries_for_user = 20 TO ROLE tenant_a_role;Implement client-side connection pooling and retries. Tools like
clickhouse-driver(Python),ClickHouse-go, and JDBC drivers all support pooling; clients should retry code 202 with exponential backoff.Reduce long-running queries: enable
max_execution_timeper user, run heavy analytics on a dedicated replica, or precompute via AggregatingMergeTree.
Root-Cause Analysis
The signal pattern over time reveals whether this is a hardware capacity issue or a client pathology:
-- Concurrency over the last hour, per minute
SELECT toStartOfMinute(event_time) AS m,
max(CurrentMetric_Query) AS max_concurrent_queries,
max(CurrentMetric_TCPConnection) AS max_connections
FROM system.metric_log
WHERE event_date = today()
GROUP BY m ORDER BY m DESC LIMIT 60;
-- Users with the highest concurrent query counts
SELECT initial_user, max(toStartOfMinute(event_time)) AS m,
count() AS queries_per_minute
FROM system.query_log
WHERE event_date >= today() - 1 AND type = 'QueryStart'
GROUP BY initial_user, toStartOfMinute(event_time)
ORDER BY queries_per_minute DESC LIMIT 20;
Preventive Measures
- Set
max_concurrent_queries_for_userper tenant via a settings profile. The default global cap is too coarse for multi-tenant clusters. - Set
max_execution_time(30-60 seconds for interactive workloads) so runaway queries do not pin slots indefinitely. - Enforce client-side connection pools with capped pool sizes. A pool of 10 connections from a service replica with 100-pod horizontal scaling can saturate a 100-slot ClickHouse cluster.
- Place a load balancer that round-robins across multiple replicas of each shard, distributing query slots evenly.
- For internal dashboards, use materialized views and AggregatingMergeTree to reduce per-query work, freeing slots for ad-hoc analysis.
Resolve TOO_MANY_SIMULTANEOUS_QUERIES (Code 202) Automatically with Pulse
Pulse is an AI DBA for ClickHouse (and Kafka and Elasticsearch). When Code: 202. DB::Exception: Too many simultaneous queries. Maximum: 100 fires and clients start getting rejected in your environment, Pulse:
- Continuously tracks
CurrentMetric_QueryandCurrentMetric_TCPConnectionfromsystem.metric_logagainstmax_concurrent_queries(default 100) andmax_connections(default 4096) - the two distinct limits the error name conflates - Correlates the slot saturation with long-tail queries in
system.processes, per-user query rates fromsystem.query_log, distributed sub-query fanout multiplying per-shard slot usage, and dashboard auto-refresh storms grouped byinitial_user - Identifies which of the five causes above applies - missing client pooling, multi-minute queries pinning slots, runaway distributed fanout, dashboard bursts, or genuine under-provisioning
- Recommends the precise fix - apply
max_concurrent_queries_for_userin a settings profile, enforcemax_execution_time = 30,KILL QUERYthe long-tail blockers, raisemax_concurrent_queriesto2 * cpu_cores, or splitmax_concurrent_select_queriesfrommax_concurrent_insert_queries - Applies low-risk fixes automatically with your approval (capping a runaway tenant via a settings profile during a burst, rolling back when traffic normalises) or generates a one-click config PR
Pulse turns the manual system.processes triage above into an agentic SRE workflow. Start a free trial.
Frequently Asked Questions
Q: What tool can automatically detect ClickHouse slot saturation before Code 202 starts rejecting queries?
A: Pulse is an AI DBA for ClickHouse that watches CurrentMetric_Query against max_concurrent_queries, projects sustained utilization above 70% (the leading indicator for Code 202 outages), and distinguishes between the four root causes - client connection storms, long-tail slow queries, distributed fanout amplification, and dashboard-driven bursts - then recommends or applies a tighter settings profile or max_execution_time cap.
Q: What is the difference between max_concurrent_queries and max_connections in ClickHouse?
A: max_concurrent_queries (default 100) limits queries actively executing at the same time; max_connections (default 4096) limits open TCP connections. Idle connections do not consume query slots, so a cluster can hold thousands of pooled connections while only executing 100 queries concurrently. Code 202 corresponds to the query limit; the literal "Too many simultaneous connections" message corresponds to max_connections.
Q: What is a good value for max_concurrent_queries?
A: A reasonable starting point is 2 * cpu_cores, since each query uses around max_threads worth of cores. Higher values cause CPU thrashing without proportional throughput. Adjust based on observed CPU saturation and the 95th percentile of max_threads per query.
Q: How do I see how many queries are running right now on ClickHouse?
A: Run SELECT count() FROM system.processes for active queries, or SELECT value FROM system.metrics WHERE metric = 'Query' for the same number via the metrics interface. SELECT * FROM system.processes ORDER BY elapsed DESC shows the actual queries.
Q: Does connection pooling fix this error?
A: Pooling reduces churn but does not directly increase the number of concurrent queries the server allows. If your pooled clients submit 200 queries in parallel against a max_concurrent_queries = 100 server, the 101st query still gets code 202. Combine pooling with client-side retry and either raise the limit or smooth the request rate.
Q: How do I limit one user from monopolizing all query slots?
A: Apply max_concurrent_queries_for_user through a settings profile targeted at the user or role. This caps that user's share of the global slot pool without lowering capacity for others.
Q: Are inserts and selects counted against the same limit?
A: By default, both count toward max_concurrent_queries. You can split them with max_concurrent_select_queries and max_concurrent_insert_queries, which independently limit each kind. Distributed sub-queries against this server also count.
Related Reading
- ClickHouse Settings Profile - per-user concurrency limits
- ClickHouse Memory Management - the other major resource cap
- Memory Limit Exceeded - co-occurs when many concurrent queries also stress RAM
- Read Timeout Error - what clients see when waiting for slots
- max_execution_time Setting - prevents long-running queries from pinning slots
- AggregatingMergeTree - precompute to reduce live query load
- ClickHouse Documentation Hub - index of all ClickHouse KB pages