NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse Settings to Adjust from Defaults in Production

ClickHouse defaults aim for safety on small installations rather than throughput on production clusters. Several settings should be adjusted on any non-trivial deployment. This guide lists the ones that matter most, grouped by area, with the values commonly used in production.

System Logging

Log tables grow forever by default. Add TTL to bound their size:

<query_log>
    <database>system</database>
    <table>query_log</table>
    <ttl>event_date + INTERVAL 90 DAY DELETE</ttl>
</query_log>

Recommended retentions:

Table TTL
query_log 30 to 90 days
query_thread_log 7 to 14 days, or disable
trace_log 14 days
metric_log 14 days, or disable if external monitoring exists
asynchronous_metric_log 14 days, or remove via <asynchronous_metric_log remove="1"/>
part_log 90 days, useful for merge analysis

Keep query_log enabled. It has very useful information for debugging.

query_thread_log can be disabled per user without restart:

<profiles>
    <default>
        <log_query_threads>0</log_query_threads>
    </default>
</profiles>

Query Logging Defaults

Enable query logging globally for traceability:

<profiles>
    <default>
        <log_queries>1</log_queries>
    </default>
</profiles>

External Aggregation and Sort

By default, large GROUP BY and ORDER BY queries fail when they exceed max_memory_usage. Enable spill-to-disk so they complete instead:

<profiles>
    <default>
        <max_bytes_before_external_group_by>2000000000</max_bytes_before_external_group_by>
        <max_bytes_before_external_sort>2000000000</max_bytes_before_external_sort>
    </default>
</profiles>

2 GB is a starting point. Set to roughly half of max_memory_usage so the spill kicks in before the hard limit.

Prevent Accidental Full Scans

force_index_by_date and force_primary_key make ClickHouse reject queries that would scan an entire table because they cannot use the partition or primary key:

<profiles>
    <default>
        <force_index_by_date>1</force_index_by_date>
        <force_primary_key>1</force_primary_key>
        <max_execution_time>300</max_execution_time>
    </default>
</profiles>

Combined with max_execution_time, these settings catch the queries that take a server down.

Concurrency Limits

The defaults are conservative for modern hardware. For workloads with many async inserts and parallel selects, raise them:

<max_concurrent_queries>500</max_concurrent_queries>
<max_concurrent_insert_queries>400</max_concurrent_insert_queries>
<max_concurrent_select_queries>100</max_concurrent_select_queries>

These live at the top level of config.xml, not under <profiles>. They are server-wide caps.

FINAL Query Optimization

If you use SELECT ... FINAL against ReplacingMergeTree or similar engines, enabling this setting allows the final merge to run per partition in parallel:

<profiles>
    <default>
        <do_not_merge_across_partitions_select_final>1</do_not_merge_across_partitions_select_final>
    </default>
</profiles>

It is safe only when the table's logical merge happens entirely within a partition, which is the normal case for time-partitioned data.

Memory Limits

Configure memory caps at three levels:

<profiles>
    <default>
        <max_memory_usage>10000000000</max_memory_usage>
        <max_memory_usage_for_user>20000000000</max_memory_usage_for_user>
    </default>
</profiles>

<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>

A typical breakdown: per-query 25% of RAM, per-user 50%, server 90%. Adjust to your workload.

RBAC

Enable SQL-driven user management instead of editing users.xml files:

<users>
    <default>
        <access_management>1</access_management>
    </default>
</users>

Now you can use CREATE USER, CREATE ROLE, and GRANT statements at runtime.

Durability

If inserts cannot be re-driven, raise the durability bar with quorum writes:

<profiles>
    <default>
        <insert_quorum>2</insert_quorum>
        <insert_quorum_timeout>60000</insert_quorum_timeout>
        <select_sequential_consistency>1</select_sequential_consistency>
    </default>
</profiles>

With insert_quorum = 2, an insert is only acknowledged once two replicas have it. Use 0 (default) on workloads that can tolerate replay.

Common Pitfalls

  • Setting max_memory_usage too high so a single query can OOM the box.
  • Enabling force_primary_key on a cluster where many existing dashboards use full scans. Roll out per-user first.
  • Forgetting that max_concurrent_queries is server-wide. Per-user limits go under <profiles>.
  • Setting insert_quorum without insert_quorum_timeout. A failing replica stalls every insert.
  • Adding <engine> blocks to log tables in Kubernetes. Stick to <ttl> to avoid clashes with chart upgrades.

Frequently Asked Questions

Q: Which setting catches the most production issues with the least friction? A: Adding TTL to query_log and friends. It silently prevents the disk-fills-up incident that affects almost every long-running cluster.

Q: Should I enable force_primary_key everywhere? A: Eventually yes, but roll it out to one profile at a time. Existing dashboards may rely on key-less scans.

Q: How do I pick max_bytes_before_external_group_by? A: Roughly half of max_memory_usage. The threshold should fire before the hard memory cap, otherwise queries fail instead of spilling.

Q: Do I need to restart for these changes? A: User profile settings are picked up on the next connection. Server-level settings (max_concurrent_queries, log table TTL via <ttl>) usually need a config reload, which is a SIGHUP, not a restart.

Q: Are these settings safe to apply via Kubernetes ConfigMaps? A: Yes, with one caveat: avoid <engine> overrides for log tables, as the documentation calls out possible configuration clashes during upgrades.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.