ClickHouse Memory Overcommitter: How It Works

The Memory Overcommitter, available since ClickHouse 22.2, replaces hard per-query memory ceilings with soft limits and a runtime arbitrator. Queries can claim more memory than their soft limit when the server has spare RAM. If the server actually hits its global cap, the overcommit tracker identifies the most overcommitted query and terminates it, letting the rest finish. This article explains the model, the settings that control it, and when fixed limits are still the better choice.

How the overcommitter works

The flow is straightforward:

Queries declare a soft memory limit through memory_overcommit_ratio_denominator (per query) or memory_overcommit_ratio_denominator_for_user (per user).
While total server memory is below max_server_memory_usage, any query can allocate freely, even past its soft limit.
When the server reaches the global cap, the overcommit tracker selects the query whose actual usage exceeds its soft limit by the highest ratio, sends it a stop signal, and waits up to memory_usage_overcommit_max_wait_microseconds.
Other queries pause briefly while the offender releases memory, then resume.
If the chosen query does not free memory within the wait window, the tracker throws to the caller.

The result is that small queries are protected and a single oversized query rarely takes down the whole server.

Required configuration

Enable it by zeroing the hard per-query and per-user limits and relying on the server-wide cap:

<clickhouse>
    <max_server_memory_usage>0</max_server_memory_usage>
    <max_server_memory_usage_to_ram_ratio>0.8</max_server_memory_usage_to_ram_ratio>
    <profiles>
        <default>
            <max_memory_usage>0</max_memory_usage>
            <max_memory_usage_for_user>0</max_memory_usage_for_user>
            <memory_overcommit_ratio_denominator>1073741824</memory_overcommit_ratio_denominator>
            <memory_overcommit_ratio_denominator_for_user>1073741824</memory_overcommit_ratio_denominator_for_user>
            <memory_usage_overcommit_max_wait_microseconds>200000</memory_usage_overcommit_max_wait_microseconds>
        </default>
    </profiles>
</clickhouse>

max_memory_usage = 0 disables the hard cap so the overcommitter can govern. max_server_memory_usage_to_ram_ratio = 0.8 keeps the kernel page cache from being crushed.

Settings reference

Setting	Scope	Meaning
`memory_overcommit_ratio_denominator`	Per query	Soft memory ceiling used to compute overcommit ratio (`actual / denominator`)
`memory_overcommit_ratio_denominator_for_user`	Per user	Soft ceiling for a user's combined queries
`memory_usage_overcommit_max_wait_microseconds`	Per query	How long the tracker waits for the killed query to release memory before throwing

The ratio is the key: a query using 4 GB with a 1 GB denominator has ratio 4. A query using 2 GB with a 4 GB denominator has ratio 0.5. The 4-ratio query is killed first.

Verifying it is active

Check that hard limits are zeroed and the overcommit settings are visible:

SELECT name, value
FROM system.settings
WHERE name IN (
    'max_memory_usage',
    'max_memory_usage_for_user',
    'memory_overcommit_ratio_denominator',
    'memory_overcommit_ratio_denominator_for_user',
    'memory_usage_overcommit_max_wait_microseconds'
);

SELECT name, value
FROM system.server_settings
WHERE name LIKE '%server_memory%';

Look for kills in the query log:

SELECT
    query_id,
    user,
    formatReadableSize(memory_usage) AS used,
    exception
FROM system.query_log
WHERE event_date = today()
  AND exception ILIKE '%overcommit%'
ORDER BY event_time DESC
LIMIT 20;

When it helps

Good fit:

Mixed workloads with many small interactive queries plus occasional heavy aggregations.
Ad-hoc analytics clusters where users sometimes need more memory than they normally consume.
Single-tenant clusters where the overall budget matters more than per-query fairness.

When fixed limits are better

The overcommitter is a poor fit for:

ETL pipelines like dbt that fire hundreds of small to medium queries. The tracker may kill an essential pipeline step when an unrelated heavy job runs.
Multi-tenant clusters where one user's heavy query must not starve others. Hard max_memory_usage_for_user per profile gives clearer isolation.
Workloads with strict latency SLAs. The brief pause while the tracker kills a query adds tail latency.

For these, set per-query and per-user caps explicitly and leave the overcommitter disabled.

Common Pitfalls

Leaving max_memory_usage non-zero. The hard cap still applies and the overcommitter never gets a chance to arbitrate.
Using a very small memory_usage_overcommit_max_wait_microseconds. Killed queries do not release memory instantly, and a short wait leads to throws on the surviving queries.
Setting denominators identical to expected usage. The ratio then sits near 1 for all queries and the tracker cannot pick a clear offender.
Ignoring max_server_memory_usage_to_ram_ratio. Without it, the OOM killer may strike before the overcommitter triggers.
Combining overcommit with cgroup memory limits set too tight. The kernel kills the server before the tracker reacts.

Frequently Asked Questions

Q: Does the overcommitter replace max_server_memory_usage? A: No. The server-wide cap still defines when the tracker engages. Without it, allocations continue until the OS kills the process.

Q: Can I enable it for specific users only? A: Yes. Set the overcommit denominators in their settings profile and leave max_memory_usage non-zero for other profiles.

Q: How is the "most overcommitted" query chosen? A: By the largest ratio of current memory usage to that query's memory_overcommit_ratio_denominator. Higher ratio means it consumed far more than it declared.

Q: Will the killed query be retried automatically? A: No. The client receives an exception. Application code or workflow tools must retry.

Q: Is the overcommitter compatible with the Memory Tracker? A: Yes; it is built on top of it. The tracker continues to account memory; the overcommitter changes which limit triggers kill versus reject.