The system database in ClickHouse mixes two very different things. Virtual tables like system.parts and system.columns are computed on demand and consume no disk. The _log tables (query_log, query_thread_log, trace_log, part_log, metric_log, asynchronous_metric_log, and so on) are real MergeTree tables that grow forever by default. On a busy server, they easily reach hundreds of gigabytes. This guide shows how to control them without losing the operational data you need.
Option 1: Add TTL to Existing Log Tables
The least invasive fix is to set a TTL on each log table so old data drops automatically. This works on a running server with no restart:
ALTER TABLE system.query_log MODIFY TTL event_date + INTERVAL 14 DAY;
ALTER TABLE system.query_thread_log MODIFY TTL event_date + INTERVAL 7 DAY;
ALTER TABLE system.trace_log MODIFY TTL event_date + INTERVAL 14 DAY;
ALTER TABLE system.part_log MODIFY TTL event_date + INTERVAL 90 DAY;
ALTER TABLE system.metric_log MODIFY TTL event_date + INTERVAL 14 DAY;
ALTER TABLE system.asynchronous_metric_log MODIFY TTL event_date + INTERVAL 14 DAY;
Because the log tables are already partitioned by month or day, TTL drops whole partitions cheaply rather than rewriting parts.
Option 2: Configure TTL at Server Startup
If you want new servers to have the right retention from day one, configure it in the log table sections of config.xml or a drop-in:
<clickhouse>
<query_log>
<database>system</database>
<table>query_log</table>
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</query_log>
</clickhouse>
You can also override the engine to use explicit daily partitioning. Daily partitions let TTL drop whole days:
<query_log>
<database>system</database>
<table>query_log</table>
<engine>ENGINE = MergeTree
PARTITION BY event_date
ORDER BY (event_time)
TTL event_date + INTERVAL 14 DAY DELETE</engine>
</query_log>
Note: in Kubernetes deployments, avoid the <engine> tag because of possible configuration clashes during chart upgrades. Use the <ttl> form instead.
Option 3: Disable Specific Log Tables
If a log table provides no value for your workload, remove it. Create /etc/clickhouse-server/config.d/z_log_disable.xml (the z_ prefix ensures it loads last, after vendor configs):
<?xml version="1.0"?>
<clickhouse>
<asynchronous_metric_log remove="1"/>
<metric_log remove="1"/>
<trace_log remove="1"/>
<part_log remove="1"/>
<text_log remove="1"/>
<crash_log remove="1"/>
<session_log remove="1"/>
</clickhouse>
Keep query_log enabled. It is by far the most useful log for debugging slow queries, memory issues, and unexpected workload patterns.
Option 4: Throttle Query Logging Per User
Rather than disable query_log, reduce its volume per user profile:
| Setting | Effect |
|---|---|
log_queries_min_duration_ms |
Skip queries below the threshold |
log_queries_cut_to_length |
Truncate long query texts |
log_queries_probability |
Sample only a fraction of queries |
log_query_threads |
Disable the per-thread log |
log_query_views |
Disable view-access logging |
log_processors_profiles |
Disable processor-level profiling logs |
Set these in users.xml for the relevant profile:
<profiles>
<default>
<log_queries>1</log_queries>
<log_queries_min_duration_ms>100</log_queries_min_duration_ms>
<log_query_threads>0</log_query_threads>
</default>
</profiles>
Per-user settings take effect immediately, no restart needed.
Drop Renamed Tables from Upgrades
When ClickHouse upgrades introduce a schema change to a log table, the old table is renamed with a numeric suffix (query_log_1, query_log_2, ...) and a new empty one is created. These accumulate over time. If you do not need the historical data, drop them:
SELECT name, formatReadableSize(total_bytes) AS size
FROM system.tables
WHERE database = 'system' AND match(name, '_[0-9]+$')
ORDER BY total_bytes DESC;
DROP TABLE system.query_log_1 SYNC;
Common Pitfalls
- Removing
query_logentirely. You lose your most useful debugging tool. Use TTL or sampling instead. - Setting TTL to a very short interval like 1 day on a system with deep historical investigations needed.
- Forgetting to drop renamed
_log_Ntables after upgrades. They keep consuming disk forever. - Disabling logs on one node but not the others. Cluster-wide visibility breaks.
Frequently Asked Questions
Q: Which system log table grows the fastest?
A: Usually trace_log followed by query_thread_log and asynchronous_metric_log. query_log itself is moderate but dominates if queries are long.
Q: Can I add TTL without restarting ClickHouse?
A: Yes, ALTER TABLE ... MODIFY TTL is online. New parts get the TTL; existing parts are evaluated on the next merge.
Q: How long should I keep query_log?
A: 14 to 30 days is typical. Long enough to investigate complaints from last week, short enough to bound disk usage.
Q: Will dropping a log table break ClickHouse? A: No. ClickHouse will recreate it on the next flush if the corresponding section in config is still enabled.
Q: Should I move log tables to a slower disk? A: You can, via a storage policy on each log table. It saves fast-disk space at the cost of slower queries against the logs.