NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

Configure ClickHouse for Low Memory Environments

ClickHouse default settings target servers with plenty of RAM. On hosts with 4 to 8 GB of memory, a Raspberry Pi, or a small VM, the default mark cache alone is capped at 5 GB and background pools spawn dozens of threads. This guide shows the specific server and user settings to lower so ClickHouse stays inside a small memory budget without crashing under load.

What dominates memory on small hosts

Three categories drive memory pressure:

  1. Caches loaded into RAM (mark cache, uncompressed cache, index mark cache).
  2. Background workers (merges, fetches, schedule pools, mutations).
  3. System log tables (query_thread_log, processors_profile_log, opentelemetry_span_log) that write frequently and pull data through the merge and insert path.

Cutting all three brings idle memory consumption down well under a gigabyte.

Server-level configuration

Drop the snippets below into /etc/clickhouse-server/config.d/low-memory.xml.

Cap server memory

<clickhouse>
    <max_server_memory_usage_to_ram_ratio>0.75</max_server_memory_usage_to_ram_ratio>
    <max_concurrent_queries>8</max_concurrent_queries>
    <mlock_executable>false</mlock_executable>
</clickhouse>

max_server_memory_usage_to_ram_ratio reserves 25 percent of physical RAM for the kernel page cache and other processes. Setting mlock_executable to false lets the OS page out parts of the binary it is not actively running.

Shrink caches

<clickhouse>
    <mark_cache_size>268435456</mark_cache_size>          <!-- 256 MB -->
    <index_mark_cache_size>67108864</index_mark_cache_size> <!-- 64 MB -->
    <uncompressed_cache_size>16777216</uncompressed_cache_size> <!-- 16 MB -->
</clickhouse>

Defaults of 5 GB mark cache and 8 GB uncompressed cache make sense on large servers. On constrained hosts they evict useful pages and trigger swap.

Disable noisy system logs

<clickhouse>
    <query_thread_log remove="1"/>
    <opentelemetry_span_log remove="1"/>
    <processors_profile_log remove="1"/>
</clickhouse>

These logs write very frequently. Removing them eliminates a background insert stream and the merges that follow.

Tame background pools

<clickhouse>
    <background_pool_size>2</background_pool_size>
    <background_merges_mutations_concurrency_ratio>2</background_merges_mutations_concurrency_ratio>
    <background_schedule_pool_size>2</background_schedule_pool_size>
    <background_fetches_pool_size>2</background_fetches_pool_size>
    <background_move_pool_size>2</background_move_pool_size>
</clickhouse>

Each background thread holds buffers. Defaults of 16 merge workers and 16 schedule workers are excessive when only a few cores exist. Stay at or below CPU core count.

Limit merge memory

<clickhouse>
    <merge_tree>
        <merge_max_block_size>1024</merge_max_block_size>
        <max_bytes_to_merge_at_max_space_in_pool>1073741824</max_bytes_to_merge_at_max_space_in_pool>
        <min_bytes_for_wide_part>134217728</min_bytes_for_wide_part>
    </merge_tree>
</clickhouse>

Lowering merge_max_block_size from 8192 to 1024 reduces the per-merge buffer footprint. Capping max_bytes_to_merge_at_max_space_in_pool to 1 GB prevents massive merges that need gigabytes of working memory. Raising min_bytes_for_wide_part keeps small parts compact, which reduces the merge memory needed to combine them.

User-level configuration

User profile settings go in /etc/clickhouse-server/users.d/low-memory.xml under the default profile.

<clickhouse>
    <profiles>
        <default>
            <max_threads>2</max_threads>
            <max_block_size>8192</max_block_size>
            <input_format_parallel_parsing>0</input_format_parallel_parsing>
            <max_bytes_before_external_group_by>3221225472</max_bytes_before_external_group_by>
            <queue_max_wait_ms>1000</queue_max_wait_ms>
        </default>
    </profiles>
</clickhouse>

Key points:

  • max_threads=2 keeps query parallelism aligned with available cores.
  • input_format_parallel_parsing=0 removes the parser thread pool from the insert path.
  • max_bytes_before_external_group_by=3G makes GROUP BY spill to disk instead of failing with a memory limit error.
  • queue_max_wait_ms=1000 queues new queries for up to a second instead of failing immediately when max_concurrent_queries is reached.

Verifying the change

After restart, check that limits are in place:

SELECT name, value
FROM system.server_settings
WHERE name IN (
    'max_concurrent_queries',
    'max_server_memory_usage_to_ram_ratio',
    'mark_cache_size',
    'background_pool_size'
);

SELECT formatReadableSize(value) AS mem
FROM system.asynchronous_metrics
WHERE metric = 'MemoryResident';

Resident memory should sit well below the configured ratio of total RAM at idle.

Common Pitfalls

  • Lowering caches but not background pool sizes. Background threads each hold buffers; sixteen of them still consume hundreds of megabytes.
  • Leaving query_thread_log enabled. On a Raspberry Pi it can write tens of thousands of rows per minute, generating merges that compete for RAM.
  • Setting max_bytes_before_external_group_by higher than available RAM. The setting only enables spilling once memory exceeds the threshold, so it must be lower than max_memory_usage.
  • Disabling caches entirely (setting them to 0). A 16 MB uncompressed cache is still useful and avoids re-decompressing the same blocks repeatedly.
  • Forgetting that mlock_executable defaults to true. Without disabling it, the entire binary stays resident even when idle.

Frequently Asked Questions

Q: Will ClickHouse run on 2 GB of RAM? A: It can boot, but production workloads are not realistic. Even with all the trimming above, the merge and replication subsystems need headroom. 4 GB is a practical floor.

Q: Should I disable swap? A: No. Leave a few GB of swap available. With max_server_memory_usage_to_ram_ratio=0.75, the kernel rarely touches swap, but it acts as a safety valve during merges.

Q: Do these settings hurt query performance? A: Yes, somewhat. Smaller caches mean more disk reads; lower max_threads means slower per-query latency. The tradeoff is stability over peak throughput.

Q: Can I keep query_log enabled? A: Yes. query_log is far less verbose than query_thread_log. Keep it for observability and remove the per-thread variants.

Q: How do I keep external GROUP BY from filling disk? A: Set max_temporary_files and max_temporary_data_on_disk_size_for_user to bound the spill. Combined with queue_max_wait_ms, this prevents runaway disk usage on small hosts.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.