ClickHouse Documentation Hub: Engines, Errors, Functions, Settings

This is the index page for Pulse's ClickHouse knowledge base. It groups every ClickHouse article on this site - SQL DDL and access control, the MergeTree engine family, integration engines, functions, settings, tools, and common production errors - into navigable sections. Start with the MergeTree engine and the common ClickHouse errors page if you're new; jump to a section below if you have a specific question.

SQL DDL: Databases, Tables, and Data

The everyday SQL statements you run against a ClickHouse cluster, with the engine choices and gotchas that matter in production.

CREATE DATABASE - Atomic, Replicated, MySQL, PostgreSQL, and other database engines
CREATE TABLE - Columns, engines, ORDER BY, PARTITION BY, codecs, TTL, ON CLUSTER
MergeTree Engine - The core engine and its full family
CREATE MATERIALIZED VIEW - Streaming aggregates, refreshable MVs, Kafka pipelines
ALTER TABLE - Column changes, mutations, partition operations, MODIFY ORDER BY
INSERT INTO - VALUES, FORMAT, SELECT, async inserts, deduplication, batch sizing
DROP TABLE - SYNC, UNDROP, Atomic vs Ordinary semantics

Access Control and Security

Roles, users, grants, and row-level security. The right place to start when setting up multi-tenant access on a shared cluster.

CREATE USER - Authentication methods, host clauses, default roles, settings limits
CREATE ROLE - RBAC, default roles, layered roles, settings profiles
GRANT and REVOKE - Privilege hierarchy, column-level grants, WITH GRANT/ADMIN OPTION
CREATE ROW POLICY - Row-level security, permissive vs restrictive policies
Settings Profile - Per-user setting bundles and constraints

Table Engines (MergeTree Family and Specials)

ClickHouse's table engines, each designed for a specific data pattern.

MergeTree - Core columnar engine, foundation of every variant
ReplicatedMergeTree - Keeper-coordinated replication
AggregatingMergeTree - Pre-aggregated rollups via -State/-Merge
ReplacingMergeTree - Upsert / CDC patterns
SummingMergeTree - Automatic numeric summation
CollapsingMergeTree - Sign-based incremental updates
VersionedCollapsingMergeTree - Version-aware collapsing
GraphiteMergeTree - Time-series rollup with retention rules
Buffer Engine - In-memory insert buffering
Log Engine - Simple write-once storage
Materialized View - Pre-computed views
Null Engine - Discard writes (useful for MV triggers)
StripeLog - Log-structured storage
TinyLog - Lightweight engine for small tables
View - Virtual tables for query simplification
Replicated variants: Aggregating, Collapsing, Replacing, Summing, VersionedCollapsing

Integration Engines

Connect ClickHouse to external data sources and systems.

File Engine - Read data from files on the filesystem
HDFS Engine - Hadoop Distributed File System
JDBC Engine - External SQL databases via JDBC
Kafka Engine - Stream processing with Apache Kafka
MySQL Engine - MySQL integration
ODBC Engine - Any ODBC-compatible database
RabbitMQ Engine - Message queue integration
S3 Engine - Store and query data in Amazon S3
URL Engine - HTTP/HTTPS endpoints

Functions

ClickHouse's built-in functions for data manipulation and analysis.

Aggregation Functions

anyheavy - Most frequent value (heavy hitter)
avg - Average
countif - Count rows matching a condition
groupUniqArray - Arrays of unique values
max - Maximum value
median - Median (quantile 0.5)
min - Minimum value
quantile - Approximate quantiles
stddevSamp - Sample standard deviation
sum - Sum
topK - Top-K most frequent values
uniq - Approximate distinct count
uniqCombined - Approximate distinct count, tuned

Date and Time Functions

formatDateTime - Format date and time
parseDateTimeBestEffort - Flexible date parsing
toStartOfInterval - Round to interval start
toUnixTimestamp - Convert to Unix timestamp

String Functions

concat - Concatenate strings
lower - Lowercase
replaceAll - Replace all occurrences
substring - Extract substring

Window Functions

denseRank - Dense rank
lagInFrame - Previous-row access
leadInFrame - Following-row access
rank - Rank with gaps
rowNumberInAllBlocks - Row numbers across blocks
runningDifference - Differences between consecutive rows
windowFunnel - Event sequence analysis

Settings and Configuration

Tuning knobs for query execution, memory, and concurrency.

allow_experimental_analyzer - Enable the new query analyzer
allow_suspicious_low_cardinality_types - LowCardinality type warnings
enable_memory_tracker - Memory usage tracking
insert_quorum - Replicated write consistency
join_use_nulls - NULL handling in joins
log_queries - Query logging
max_execution_time - Query timeout
max_memory_usage - Per-query memory limit
max_threads - Thread pool size per query
optimize_skip_unused_shards - Shard pruning
preferred_block_size_bytes - Block size tuning

Tools and Utilities

ClickHouse's CLIs and operational helpers.

ClickHouse Client - Native TCP command-line client
ClickHouse Copier - Obsolete migration tool (use INSERT SELECT remote() or BACKUP/RESTORE instead)
ClickHouse Dictionary - External dictionary management
Dictionary Lazy Load - Optimize dictionary loading
Executable Dictionary - Dynamic dictionary updates
Executable Pool - Resource management for executable sources
Join Engine - Pre-built JOIN tables
Live View - Real-time data views
Memory Management - Memory tracking and tuning
Settings Profile - User configuration bundles
Trace Log - Detailed query profiling

Error Handling

Curated list of common ClickHouse errors with root cause and fix.

Production Monitoring

For real-world ClickHouse operations - replication lag, memory pressure, merge backlogs, slow queries - Pulse provides proactive monitoring and AI-driven root-cause analysis specifically for ClickHouse, with the same agentic SRE approach it brings to Elasticsearch and OpenSearch.