This is the index page for Pulse's ClickHouse knowledge base. It groups every ClickHouse article on this site - SQL DDL and access control, the MergeTree engine family, integration engines, functions, settings, tools, and common production errors - into navigable sections. Start with the MergeTree engine and the common ClickHouse errors page if you're new; jump to a section below if you have a specific question.
SQL DDL: Databases, Tables, and Data
The everyday SQL statements you run against a ClickHouse cluster, with the engine choices and gotchas that matter in production.
- CREATE DATABASE - Atomic, Replicated, MySQL, PostgreSQL, and other database engines
- CREATE TABLE - Columns, engines, ORDER BY, PARTITION BY, codecs, TTL, ON CLUSTER
- MergeTree Engine - The core engine and its full family
- CREATE MATERIALIZED VIEW - Streaming aggregates, refreshable MVs, Kafka pipelines
- ALTER TABLE - Column changes, mutations, partition operations, MODIFY ORDER BY
- INSERT INTO - VALUES, FORMAT, SELECT, async inserts, deduplication, batch sizing
- DROP TABLE - SYNC, UNDROP, Atomic vs Ordinary semantics
Access Control and Security
Roles, users, grants, and row-level security. The right place to start when setting up multi-tenant access on a shared cluster.
- CREATE USER - Authentication methods, host clauses, default roles, settings limits
- CREATE ROLE - RBAC, default roles, layered roles, settings profiles
- GRANT and REVOKE - Privilege hierarchy, column-level grants, WITH GRANT/ADMIN OPTION
- CREATE ROW POLICY - Row-level security, permissive vs restrictive policies
- Settings Profile - Per-user setting bundles and constraints
Table Engines (MergeTree Family and Specials)
ClickHouse's table engines, each designed for a specific data pattern.
- MergeTree - Core columnar engine, foundation of every variant
- ReplicatedMergeTree - Keeper-coordinated replication
- AggregatingMergeTree - Pre-aggregated rollups via -State/-Merge
- ReplacingMergeTree - Upsert / CDC patterns
- SummingMergeTree - Automatic numeric summation
- CollapsingMergeTree - Sign-based incremental updates
- VersionedCollapsingMergeTree - Version-aware collapsing
- GraphiteMergeTree - Time-series rollup with retention rules
- Buffer Engine - In-memory insert buffering
- Log Engine - Simple write-once storage
- Materialized View - Pre-computed views
- Null Engine - Discard writes (useful for MV triggers)
- StripeLog - Log-structured storage
- TinyLog - Lightweight engine for small tables
- View - Virtual tables for query simplification
- Replicated variants: Aggregating, Collapsing, Replacing, Summing, VersionedCollapsing
Integration Engines
Connect ClickHouse to external data sources and systems.
- File Engine - Read data from files on the filesystem
- HDFS Engine - Hadoop Distributed File System
- JDBC Engine - External SQL databases via JDBC
- Kafka Engine - Stream processing with Apache Kafka
- MySQL Engine - MySQL integration
- ODBC Engine - Any ODBC-compatible database
- RabbitMQ Engine - Message queue integration
- S3 Engine - Store and query data in Amazon S3
- URL Engine - HTTP/HTTPS endpoints
Functions
ClickHouse's built-in functions for data manipulation and analysis.
Aggregation Functions
- anyheavy - Most frequent value (heavy hitter)
- avg - Average
- countif - Count rows matching a condition
- groupUniqArray - Arrays of unique values
- max - Maximum value
- median - Median (quantile 0.5)
- min - Minimum value
- quantile - Approximate quantiles
- stddevSamp - Sample standard deviation
- sum - Sum
- topK - Top-K most frequent values
- uniq - Approximate distinct count
- uniqCombined - Approximate distinct count, tuned
Date and Time Functions
- formatDateTime - Format date and time
- parseDateTimeBestEffort - Flexible date parsing
- toStartOfInterval - Round to interval start
- toUnixTimestamp - Convert to Unix timestamp
String Functions
- concat - Concatenate strings
- lower - Lowercase
- replaceAll - Replace all occurrences
- substring - Extract substring
Window Functions
- denseRank - Dense rank
- lagInFrame - Previous-row access
- leadInFrame - Following-row access
- rank - Rank with gaps
- rowNumberInAllBlocks - Row numbers across blocks
- runningDifference - Differences between consecutive rows
- windowFunnel - Event sequence analysis
Settings and Configuration
Tuning knobs for query execution, memory, and concurrency.
- allow_experimental_analyzer - Enable the new query analyzer
- allow_suspicious_low_cardinality_types - LowCardinality type warnings
- enable_memory_tracker - Memory usage tracking
- insert_quorum - Replicated write consistency
- join_use_nulls - NULL handling in joins
- log_queries - Query logging
- max_execution_time - Query timeout
- max_memory_usage - Per-query memory limit
- max_threads - Thread pool size per query
- optimize_skip_unused_shards - Shard pruning
- preferred_block_size_bytes - Block size tuning
Tools and Utilities
ClickHouse's CLIs and operational helpers.
- ClickHouse Client - Native TCP command-line client
- ClickHouse Copier - Obsolete migration tool (use INSERT SELECT remote() or BACKUP/RESTORE instead)
- ClickHouse Dictionary - External dictionary management
- Dictionary Lazy Load - Optimize dictionary loading
- Executable Dictionary - Dynamic dictionary updates
- Executable Pool - Resource management for executable sources
- Join Engine - Pre-built JOIN tables
- Live View - Real-time data views
- Memory Management - Memory tracking and tuning
- Settings Profile - User configuration bundles
- Trace Log - Detailed query profiling
Error Handling
Curated list of common ClickHouse errors with root cause and fix.
- Cannot Create New ZooKeeper Session
- Cannot Load Table
- Cannot Read from Socket
- Cannot Write to Table
- DB::Exception: Read timeout
- Directory Already Exists
- Distributed Query Execution Failed
- Division by Zero
- Memory Limit Exceeded
- No Active Replicas
- No Query for ID
- Table Readonly Mode
- Table Structure Does Not Match
- Too Many Parts
- Too Many Simultaneous Connections
- Unknown Function
- Wrong Password
- ZooKeeper Node Exists
Production Monitoring
For real-world ClickHouse operations - replication lag, memory pressure, merge backlogs, slow queries - Pulse provides proactive monitoring and AI-driven root-cause analysis specifically for ClickHouse, with the same agentic SRE approach it brings to Elasticsearch and OpenSearch.