The `DB::Exception: Cannot allocate memory` error in ClickHouse signals an OS-level memory allocation failure. Unlike ClickHouse's own memory limit errors, the `CANNOT_ALLOCATE_MEMORY` error means the operating system itself refused to provide more memory to the ClickHouse process. This typically happens when the system is genuinely out of RAM, or when a cgroup memory limit has been reached in containerized environments.
Impact
When this error occurs, queries will fail immediately and ClickHouse may become unstable. In severe cases, the Linux OOM killer may terminate the ClickHouse process entirely, leading to downtime for all connected clients. Background operations such as merges and mutations can also be affected, potentially leaving the server in a degraded state until memory pressure is relieved.
Common Causes
- The host machine has exhausted all available physical RAM and swap space
- A cgroup or container memory limit (e.g., the Docker `--memory` flag, Kubernetes resource limits) is too restrictive for the workload
- Multiple memory-intensive queries running concurrently without proper `max_memory_usage` limits
- ClickHouse's `max_server_memory_usage` is set higher than what the OS or container can actually provide
- Other processes on the same host are consuming significant memory, leaving insufficient resources for ClickHouse
- Memory fragmentation preventing large contiguous allocations even when total free memory appears sufficient
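For the Kubernetes case, the memory limit lives in the pod spec. An illustrative fragment (the sizes are placeholders, not recommendations; size them with headroom over ClickHouse's own limits, as discussed under Best Practices):

```yaml
# Illustrative pod-spec fragment; memory sizes are placeholders.
resources:
  requests:
    memory: "16Gi"
  limits:
    memory: "16Gi"   # this, not the node's total RAM, is what ClickHouse can use
```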
Troubleshooting and Resolution Steps
Check system memory status:
```bash
free -h
cat /proc/meminfo
```
Look at available memory and swap usage. If both are nearly exhausted, the cause is clear.
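The `/proc/meminfo` check can also be scripted as a quick pressure test. A minimal sketch for Linux (the 10% threshold is an illustrative choice, not a ClickHouse recommendation):

```shell
#!/bin/sh
# Warn when available memory drops below 10% of total (Linux only).
# MemAvailable is the kernel's estimate of memory obtainable without swapping.
total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
pct=$(( avail * 100 / total ))
echo "available: ${pct}% of RAM"
if [ "$pct" -lt 10 ]; then
    echo "WARNING: low memory; large allocations may start to fail" >&2
fi
```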
Check cgroup limits (for containerized deployments):
On cgroup v1:
```bash
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/memory.usage_in_bytes
```
On cgroup v2:
```bash
cat /sys/fs/cgroup/memory.max
cat /sys/fs/cgroup/memory.current
```
Review ClickHouse memory settings:
```sql
SELECT name, value FROM system.settings WHERE name LIKE '%memory%';
```
Ensure `max_memory_usage` (per-query limit) and `max_server_memory_usage` are set to reasonable values that fit within your actual available memory.
Identify memory-hungry queries:
```sql
SELECT query_id, memory_usage, query
FROM system.processes
ORDER BY memory_usage DESC
LIMIT 10;
```
Check for OOM killer activity:
```bash
dmesg | grep -i "out of memory"
journalctl -k | grep -i oom
```
Increase available memory: Either add more RAM to the host, increase the container memory limit, or reduce the workload. If running in Kubernetes, adjust the resource limits in your pod specification.
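If a single runaway query surfaced by `system.processes` is the immediate culprit, it can also be terminated directly with `KILL QUERY` (the `query_id` value here is illustrative):

```sql
-- Terminate a specific query by its query_id (value is illustrative)
KILL QUERY WHERE query_id = 'abc-123';
```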
Configure memory overcommit settings on Linux:
```bash
# Check current setting
cat /proc/sys/vm/overcommit_memory
# Consider setting to 1 (always overcommit) or adjusting overcommit_ratio
```
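To make an overcommit change persistent across reboots, the setting can go in a sysctl drop-in; this is a sketch, and the file name is an assumption:

```
# /etc/sysctl.d/99-overcommit.conf -- apply with `sysctl --system`
vm.overcommit_memory = 1
```

Overcommit mode 1 lets allocations succeed optimistically; weigh this against an increased chance of OOM-killer intervention later.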
Best Practices
- Set `max_server_memory_usage` to roughly 80-90% of available RAM, leaving headroom for the OS and other processes.
- Always configure per-query limits with `max_memory_usage` to prevent a single query from starving the entire server.
- In containerized environments, ensure the container memory limit is at least 20% higher than `max_server_memory_usage`.
- Monitor memory usage proactively using system tables like `system.asynchronous_metrics` and `system.metrics`.
- Avoid running ClickHouse alongside other memory-intensive services on the same host.
- Enable swap as a safety net, though relying on swap for normal operations will severely degrade performance.
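As a sketch of the first practice, ClickHouse can also derive the server-wide limit from physical RAM via the `max_server_memory_usage_to_ram_ratio` server setting (the 0.8 value here is illustrative):

```xml
<!-- config.xml fragment: cap the server at roughly 80% of physical RAM.
     With max_server_memory_usage left at its default of 0, the ratio applies. -->
<clickhouse>
    <max_server_memory_usage_to_ram_ratio>0.8</max_server_memory_usage_to_ram_ratio>
</clickhouse>
```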
Frequently Asked Questions
Q: How is `CANNOT_ALLOCATE_MEMORY` different from the "Memory limit exceeded" error?
A: The "Memory limit exceeded" error is triggered by ClickHouse's own internal memory tracking when a configured limit like `max_memory_usage` is reached. `CANNOT_ALLOCATE_MEMORY`, on the other hand, means the operating system itself denied the memory allocation request -- it is a lower-level failure that ClickHouse cannot prevent through its own settings alone.
Q: Can this error crash the ClickHouse server?
A: Yes. If the OS runs critically low on memory, the Linux OOM killer may terminate the ClickHouse process. Even if that does not happen, failed allocations during critical operations can leave the server in an unstable state requiring a restart.
Q: I have plenty of RAM according to `free`, but still get this error. Why?
A: This can happen due to memory fragmentation, cgroup limits that are lower than total system memory, or because the memory is allocated but not yet reflected in standard monitoring tools. Check cgroup limits specifically if running in containers.
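The cgroup checks mentioned above can be wrapped in one snippet that handles both hierarchies; a sketch:

```shell
#!/bin/sh
# Report the effective cgroup memory limit, whichever hierarchy is mounted.
if [ -f /sys/fs/cgroup/memory.max ]; then
    # cgroup v2: the literal string "max" means no limit
    limit=$(cat /sys/fs/cgroup/memory.max)
elif [ -f /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then
    # cgroup v1: a very large number effectively means no limit
    limit=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
else
    limit="unknown (no cgroup memory controller found)"
fi
echo "cgroup memory limit: $limit"
```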
Q: Should I disable swap to improve ClickHouse performance?
A: While swap can cause performance degradation, having some swap available can prevent OOM kills during temporary memory spikes. A better approach is to size your memory correctly and use ClickHouse's built-in limits to control usage, while keeping a small amount of swap as a safety buffer.
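If you do keep a small swap as a safety buffer, lowering `vm.swappiness` tells the kernel to use it only under real pressure. A value of 1 is a common conservative choice, not a ClickHouse-specific recommendation, and the file name below is an assumption:

```
# /etc/sysctl.d/99-swappiness.conf -- apply with `sysctl --system`
vm.swappiness = 1
```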