Most production ClickHouse problems trace back to skipping early validation. Schema decisions that look fine on 100MB of synthetic data fall apart at 1TB, and missing monitoring or backup procedures only become urgent the day they are needed. A staged plan keeps assumptions cheap to fix. The four stages below cover proof of concept, production planning, pre-production setup, and the production cutover itself.
Stage 0: Proof of Concept
The goal of a POC is to validate that ClickHouse fits the workload and that your schema design is defensible. Run on a single node, install from official packages, and avoid clusters at this stage. A cluster adds operational variables that obscure schema mistakes.
Pick the largest table you expect to have in production and build the first version with MergeTree. Load at least 5GB of representative data, preferably an export of real production data rather than synthetic samples. Cardinality, value distributions, and compression behavior all depend on real shapes.
Iterate on the schema with these checks:
- Analyze column cardinality and pick types that match (e.g.
LowCardinality(String)for low-cardinality strings, smaller integer types where possible). - Re-evaluate the
ORDER BYandPARTITION BYchoices against the actual query patterns. - Test codec variations (
CODEC(ZSTD(3)),Delta,DoubleDelta,Gorilla) on time-series and counter columns. - Avoid joins between two large tables. Prefer denormalization for immutable dimensions and dictionaries for mutable ones.
When query performance is short of target, before adding hardware, try PREWHERE, projections, and skip indexes. Build a small representative query set and time it consistently across schema changes.
For the ingest path, pick a mature client library (Python clickhouse-connect, Go clickhouse-go, JDBC, C++). Design pipelines around low queries-per-second with large inserts (tens of thousands to millions of rows per batch), proper acknowledgment, and retry of the failed block on transient errors.
Stage 1: Production Planning
With a working POC, collect the numbers that will drive the architecture.
| Metric | Why it matters |
|---|---|
| Insert rows/sec and bytes/sec | Sizes ingest pipeline and disk write throughput |
| Compressed bytes per column per day | Drives storage sizing and retention math |
| p50/p95/p99 query latency on POC | Baseline for cluster sizing and SLOs |
| Concurrent query count | Affects max_concurrent_queries and thread pool tuning |
| Retention | Drives TTL design and total storage requirement |
Define non-functional requirements explicitly: high availability target (single-node, single-shard replicated, multi-shard replicated), required durability and consistency model, recovery objectives (RTO/RPO), and security posture (TLS, RBAC, network isolation).
Capacity planning starts from compressed bytes per day multiplied by retention, plus headroom (50% is a reasonable starting point for compaction and temporary parts). Once total bytes are estimated, decide whether one shard with replicas is enough or whether sharding is required to fit storage and concurrent query throughput.
If sharding is required, pick a sharding key that distributes evenly and aligns with the most common filter columns. Skewed shard keys lead to hotspots that cluster scaling cannot fix.
Stage 2: Pre-Production Setup
This stage produces an environment that mirrors production. Use the same OS, same ClickHouse version, same Keeper/ZooKeeper topology, same storage class, and the same configuration management.
Items to put in place:
- Cluster install: multiple ClickHouse nodes with ClickHouse Keeper (or ZooKeeper). Three Keeper nodes is the standard production quorum.
- Configuration management: Ansible, Puppet, Salt, or the ClickHouse Operator for Kubernetes. No hand-edited XML in production.
- Monitoring: scrape
system.metrics,system.events,system.asynchronous_metrics, and host-level metrics. Set alerts on replication lag, Keeper session loss, disk free, memory usage, and failed queries. - Log aggregation: forward
clickhouse-server.logandclickhouse-server.err.logto your central log system. - RBAC: define roles, grants, and quotas via SQL with
access_management = 1enabled. - Schema migration tooling: a workflow that applies DDL
ON CLUSTERwith review and rollback. - Backups:
clickhouse-backupor equivalent, with restore tested on a separate environment, not just backup creation. Define retention and offsite copy policy. - Test plan: functional tests for queries, performance regression tests, restart drills, kill-a-replica drills, failover drills, and upgrade drills.
- Security review: TLS on HTTP and native ports for non-trusted networks, query quotas, network ACLs, credential rotation, secret storage.
Treat the result of Stage 2 as the production runbook source. Operators who have not seen this cluster before should be able to perform routine operations from documentation written here.
Stage 3: Production Deployment
Production cutover is execution, not design. Designs should be locked before this point.
For traffic migration off an existing system, prefer a canary or blue/green pattern: route a small percentage of read traffic to ClickHouse, compare results against the source, then increase. Writes should dual-write or replicate from the source system until the cutover is confirmed.
After cutover, plan a recurring upgrade cadence. Major version upgrades every 6 to 12 months balance access to fixes and features with the cost of validation. Skip multiple major versions only when the source notes a supported upgrade path. Upgrade non-production environments first and run the regression suite from Stage 2.
Common Pitfalls
- Building a cluster before validating the schema. Cluster scaling does not fix a bad
ORDER BY. - Loading synthetic data in the POC. Cardinality and compression behave very differently from real data.
- Skipping the failover drill. A failover procedure that has never been executed is not a procedure.
- Treating backups as the recovery plan. The recovery plan is the documented and tested restore path.
- Underestimating Keeper. Three dedicated nodes with fast disks are not optional for production replication.
- No schema migration discipline.
ALTERon replicated tables across a cluster needs review, version tracking, and a rollback story.
Frequently Asked Questions
Q: Can the POC and pre-production use the same hardware? A: They can share a budget, but the topology should differ. POC validates correctness on one node. Pre-production validates the operational model of the production topology.
Q: How much data is enough in a POC? A: A minimum of 5GB of representative data, but enough to make at least one full partition realistic. Time-based partitioning means a few days of production data is usually a better target than a fixed byte count.
Q: When do you actually need sharding? A: When a single replica set cannot hold the data under retention, or when concurrent query throughput exceeds what a single shard can serve. Replication alone scales reads and provides HA but not storage.
Q: ClickHouse Keeper or ZooKeeper for new deployments? A: ClickHouse Keeper is the current recommendation for new deployments. It is protocol-compatible with ZooKeeper for ClickHouse coordination and is maintained as part of the ClickHouse project.
Q: How often should you upgrade ClickHouse in production? A: A 6 to 12 month cadence on a stable or LTS-style channel is a workable default. The right cadence is the one your test suite and operational team can absorb without skipping validation.