NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse DB::Exception: Memory limit exceeded (Code: 241)

Code: 241. DB::Exception: Memory limit (for query) exceeded: would use X bytes (attempt to allocate chunk of Y bytes), maximum: Z bytes. This error is raised when a query, user, or server-wide memory tracker hits its configured limit, and ClickHouse kills the query to protect the rest of the process. The most common cause is a query whose intermediate state (hash JOIN, GROUP BY, ORDER BY) exceeds max_memory_usage (default 10 GB per query).

What This Error Means

ClickHouse tracks every allocation through a hierarchy of MemoryTracker objects - one per query, one per user, one for the whole server. Whichever tracker hits its limit first kills the offending query with error code 241 (MEMORY_LIMIT_EXCEEDED). The error message names the tracker that fired: (for query), (for user), or (total). The error is intentional - it prevents a single query from triggering the Linux OOM killer and taking down the whole server.

Memory cost in ClickHouse is dominated by blocking operators that buffer state proportional to input cardinality: hash JOIN builds the right-hand side in memory, GROUP BY materializes a hash table per distinct key, ORDER BY without LIMIT buffers the full result, and DISTINCT behaves like a GROUP BY. Streaming operators (filters, projections, simple aggregations like count()) use bounded memory.

Common Causes

  1. A query's GROUP BY on a high-cardinality key materializes a hash table larger than max_memory_usage. Confirm with EXPLAIN PIPELINE showing an AggregatingTransform and a system.query_log row with memory_usage near the limit.
  2. A hash JOIN builds a hash table over the larger table because the join order puts the big table on the right. Confirm by checking the join order in the query and the row counts of both sides.
  3. ORDER BY without LIMIT on a wide result. Confirm by reading the query text - any SELECT ... ORDER BY returning all rows is suspect.
  4. Per-user limit (max_memory_usage_for_user) is set lower than the per-query limit, and the user has multiple concurrent queries. The error message will say (for user) instead of (for query).
  5. The server tracker hit max_server_memory_usage because too many queries ran concurrently. Error message says (total). Often visible at the same time as a too many simultaneous connections backlog.
  6. A runaway materialized view feeds a heavy GROUP BY per insert. Confirm with system.query_log filtering on query_kind='Select' from the MV.

How to Fix Memory Limit Exceeded

  1. Identify which tracker fired. Look at the message: (for query), (for user), or (total). The fix depends on which one.

    SELECT query, memory_usage, peak_memory_usage, user, exception
    FROM system.query_log
    WHERE event_date >= today() - 1 AND exception_code = 241
    ORDER BY event_time DESC LIMIT 20;
    
  2. Raise the limit if the workload is legitimate. For a per-query limit, run SET max_memory_usage = 30000000000; (30 GB) before the query, or apply it via a settings profile. For the server, raise max_server_memory_usage in config.xml only if there is headroom in RAM.

  3. Enable disk spill for GROUP BY and ORDER BY. Set these below the memory limit (commonly half):

    SET max_bytes_before_external_group_by = 5000000000;  -- 5 GB
    SET max_bytes_before_external_sort     = 5000000000;
    

    The query will spill intermediate state to tmp_path instead of failing.

  4. Fix the join order. ClickHouse builds the hash table from the right side. Put the smaller table on the right:

    -- Bad: builds a hash on a 1B-row table
    SELECT ... FROM small s LEFT JOIN big b ON ...;
    -- Good: builds a hash on the small table
    SELECT ... FROM big b LEFT JOIN small s ON ...;
    
  5. Add a LIMIT or use a streaming format. If you only need the top-N or you're exporting all rows, LIMIT lets ORDER BY use a heap; FORMAT Native streams without buffering.

  6. Use partial aggregations or pre-aggregated rollups. Move heavy aggregations into an AggregatingMergeTree fed by a materialized view so the runtime query reads pre-computed states.

  7. Switch JOIN algorithm. Set join_algorithm = 'partial_merge' or 'grace_hash' for large joins; these trade CPU for memory and can spill to disk.

Root-Cause Analysis

Investigating which query, user, or operator is responsible:

-- Top memory consumers in the last day
SELECT user, query_kind, max(peak_memory_usage) AS peak,
       count() AS runs, any(query) AS sample
FROM system.query_log
WHERE event_date >= today() - 1 AND type = 'QueryFinish'
GROUP BY user, query_kind
ORDER BY peak DESC LIMIT 10;

-- Which operator dominated memory in a specific query
SELECT * FROM system.processes WHERE query_id = '<id>';

-- Server-wide memory trend (1-minute samples)
SELECT event_time, CurrentMetric_MemoryTracking
FROM system.metric_log
WHERE event_date = today()
ORDER BY event_time DESC LIMIT 60;

Preventive Measures

  • Set max_memory_usage per query and max_memory_usage_for_user per user via a settings profile. Never leave them at unlimited in multi-tenant clusters.
  • Enable max_bytes_before_external_group_by and max_bytes_before_external_sort cluster-wide at half of max_memory_usage. Even if you rarely spill, having the option avoids hard failures.
  • Reserve at least 2 GB of RAM headroom above max_server_memory_usage for the page cache and the kernel.
  • Watch system.metric_log for the MemoryTracking metric trend - sustained values above 70% of the limit predict 241 errors.
  • Reduce per-query memory pressure with LowCardinality types on string columns, CODEC(ZSTD) on rarely-read columns, and tighter PRIMARY KEY prefixes on MergeTree tables.

Resolve MEMORY_LIMIT_EXCEEDED Automatically with Pulse

Pulse is an AI DBA for ClickHouse (and Kafka and Elasticsearch). When Code: 241. DB::Exception: Memory limit (for query|for user|total) exceeded fires in your environment, Pulse:

  • Continuously tracks CurrentMetric_MemoryTracking from system.metric_log, per-query peak_memory_usage, and per-user concurrent memory across the three tracker levels (max_memory_usage, max_memory_usage_for_user, max_server_memory_usage)
  • Correlates the trip with the offending query_id in system.query_log, the operator that dominated allocation (hash JOIN build, GROUP BY hash table, ORDER BY buffer), concurrent query count, and a possible co-occurring too many simultaneous queries backlog
  • Identifies which of the six causes above applies - high-cardinality GROUP BY, inverted join order with the big table on the right, ORDER BY without LIMIT, per-user cap saturation, server tracker exhaustion, or a runaway materialized view
  • Recommends the precise fix - enable max_bytes_before_external_group_by and max_bytes_before_external_sort at half the limit, swap join order, switch join_algorithm to partial_merge or grace_hash, tighten a settings profile, or route heavy aggregations into an AggregatingMergeTree
  • Applies low-risk fixes automatically with your approval (tightening max_memory_usage_for_user on a tenant approaching saturation, enabling external spill) or generates a one-click settings PR

Pulse turns the manual system.query_log triage above into an agentic SRE workflow. Start a free trial.

Frequently Asked Questions

Q: What is the fastest way to diagnose MEMORY_LIMIT_EXCEEDED (Code 241) in production ClickHouse?
A: Identify which tracker fired from the message - (for query), (for user), or (total) - then query system.query_log for the offending query_id and check peak_memory_usage. For continuous coverage, Pulse is an AI DBA for ClickHouse that detects Code 241 events in real time, correlates them with the query, operator, and concurrent-query count that caused the trip, and recommends the right fix - external GROUP BY spill, join-order swap, or a tighter settings profile.

Q: What does "Code: 241 DB::Exception: Memory limit exceeded" mean in ClickHouse?
A: Code 241 (MEMORY_LIMIT_EXCEEDED) is raised when a ClickHouse memory tracker - query, user, or server-wide - exceeds its configured limit. The query is killed before it can trigger an OS-level OOM. The error message indicates which tracker fired with (for query), (for user), or (total).

Q: How do I increase max_memory_usage in ClickHouse for a single query?
A: Run SET max_memory_usage = 20000000000; (20 GB) in the session before the query, or use the inline form SELECT ... SETTINGS max_memory_usage = 20000000000. The new value must still fit inside max_memory_usage_for_user and max_server_memory_usage.

Q: Can I make ClickHouse spill GROUP BY to disk instead of failing?
A: Yes. Set max_bytes_before_external_group_by to a value smaller than max_memory_usage (commonly half). When the in-memory hash table reaches that size, ClickHouse writes partial state to tmp_path and merges from disk. The same pattern works for sorting via max_bytes_before_external_sort.

Q: Why does the error say "for user" instead of "for query"?
A: ClickHouse tracks memory at three levels: per query (max_memory_usage, default 10 GB), per user (max_memory_usage_for_user, default unlimited), and per server (max_server_memory_usage, default ~90% of RAM). When (for user) fires, the user's concurrent queries together exceeded the per-user cap, even though no single query did. Raise max_memory_usage_for_user or limit concurrency for that user.

Q: Does max_memory_usage = 0 mean unlimited?
A: Yes, 0 means no per-query limit, and the query is then only bounded by the user and server trackers. Setting it to 0 is rarely recommended in production - a single query without a limit can starve other workloads.

Q: How can I prevent this error from happening again?
A: Set per-query and per-user limits via a settings profile, enable external GROUP BY and ORDER BY spill, fix join order so the smaller table is on the right, and route heavy aggregations through AggregatingMergeTree. Monitoring trends in MemoryTracking is the most reliable way to catch problems before they fire.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.