Schema Design for High-Volume Metrics from Single Source

Q: Should I use `Map` or `Nested` arrays for metrics?

They are functionally close — Map(K, V) is stored as Array(Tuple(K, V)) under the hood. Map gives cleaner metrics['name'] access and supports bloom-filter skip indexes on mapKeys() / mapValues() ; the Nested(name, value) form uses indexOf() for lookups. Use whichever fits your access pattern; wrap the key/name in LowCardinality(String) either way.

Q: How should I partition a metrics table?

Partition by time — toYYYYMM(ts) (monthly) is a good default, daily only for very high-volume tables needing fine-grained TTL or DROP PARTITION . Never partition by source_id or metric_name ; high-cardinality partition keys create too many small parts. Keep those columns in ORDER BY instead.

Q: How do I make a metric from a Map queryable as fast as a real column?

Add a MATERIALIZED column that extracts it, e.g. ADD COLUMN cpu_user Float64 MATERIALIZED metrics['cpu_user'] . It is computed on insert, so new rows gain full scalar-column performance. Existing rows are not backfilled automatically — run ALTER TABLE ... MATERIALIZE COLUMN cpu_user to populate historical data via a background mutation. Promote a metric this way once it becomes query-hot.

When a single source — a host, a device, a sensor, an application instance — emits dozens or hundreds of distinct metrics, the layout you choose for those metrics dominates both storage cost and query speed. ClickHouse can store the same metric stream as one column per metric, one row per metric, or a key-value structure, and each choice trades write flexibility against read performance.

This guide covers the three canonical layouts, how to combine them, and the partitioning and write patterns that keep a high-volume metrics table fast. For the underlying CREATE TABLE mechanics referenced throughout, see the CREATE TABLE reference and the MergeTree guide.

The Three Core Layouts

There is no single "best" schema — the right one depends on whether your metric set is fixed or dynamic, and whether the data is dense or sparse.

Layout	Shape	Best when	Weakness
Wide (column per metric)	`ts, source_id, metric1, metric2, …, metricN`	Metric set is known and stable; data is dense	Schema change to add a metric; sparse data wastes space on nulls/defaults
Narrow (row per metric)	`ts, source_id, metric_name, value`	Highly dynamic or sparse; new metrics appear often	Mixed metric values share one column (worse compression); reading all metrics for one timepoint scans many rows
Key-value (Map / arrays)	`ts, source_id, metrics Map(...)`	Dynamic metrics, moderate sparsity, want one row per timepoint	Heavy aggregation across the structure is slower than scalar columns

Wide: one column per metric

CREATE TABLE metrics_wide
(
    ts            DateTime,
    source_id     UInt32,
    cpu_user      Float64,
    cpu_system    Float64,
    mem_used      Float64,
    disk_io_read  Float64
    -- ... one column per metric
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (source_id, ts);

This is the most ClickHouse-native layout: each metric gets a specialized, well-compressed data type, and a query that reads three metrics only touches three columns regardless of how many exist. The Altinity benchmark on a 10,000-column table found that single-metric aggregations stayed in the tens-of-milliseconds range — columnar storage means unused columns cost nothing to read.

The cost is schema rigidity (adding a metric means an ALTER TABLE ... ADD COLUMN) and storage waste when the data is sparse — every row stores a default for every metric the source did not emit at that timepoint.

Narrow: one row per metric

CREATE TABLE metrics_narrow
(
    ts           DateTime,
    source_id    UInt32,
    metric_name  LowCardinality(String),
    value        Float64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (source_id, metric_name, ts);

The narrow form trivially supports new metrics with no schema change and is compact for sparse data — absent metrics simply have no row. The downsides: all metric values share one Float64 column, so values with very different distributions compress worse than they would in dedicated columns, and reconstructing a full snapshot for one source/timepoint requires gathering rows that are spread across the part rather than sitting in one row.

Putting metric_name in the ORDER BY (and using LowCardinality(String) for it) is what makes this layout viable — see LowCardinality for metric names below.

Key-value: Map or parallel arrays

The key-value layout keeps one row per source/timepoint but stores the metrics as a Map or as parallel arrays, so the set of metrics can vary per row without schema changes:

CREATE TABLE metrics_map
(
    ts         DateTime,
    source_id  UInt32,
    metrics    Map(LowCardinality(String), Float64)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (source_id, ts);

Read a single metric by key:

SELECT ts, metrics['cpu_user'] AS cpu_user
FROM metrics_map
WHERE source_id = 42;

Map(K, V) is internally stored as Array(Tuple(K, V)); the keys and values are accessible via mapKeys()/mapValues() (or metrics.keys/metrics.values). Accessing a missing key returns the value type's default rather than an error. The equivalent Nested/array form is functionally similar:

CREATE TABLE metrics_arrays
(
    ts         DateTime,
    source_id  UInt32,
    metrics    Nested(
        name   LowCardinality(String),
        value  Float64
    )
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (source_id, ts);

-- pull one metric out of the parallel arrays
SELECT max(metrics.value[indexOf(metrics.name, 'cpu_user')])
FROM metrics_arrays
WHERE source_id = 42;

The trade-off is read performance on heavy queries: ClickHouse aggregates scalar columns extremely fast, but aggregating a value pulled out of a map or array per row is meaningfully slower, and a query touching the map must read the whole map column rather than a single scalar. The key-value form is also far more compact than a wide table for sparse data, because it stores nothing for absent metrics instead of a null/default per column.

LowCardinality for Metric Names

Whenever a metric name is stored as data (narrow or key-value layouts), wrap it in LowCardinality(String). ClickHouse then stores a small integer key per row plus a separate dictionary of the distinct names, which shrinks storage and speeds up filtering and grouping on the name. This is the same representation the built-in TimeSeries table engine uses for its metric_name column. Note that LowCardinality is a win precisely because the number of distinct metric names is small relative to the row count — do not apply it to genuinely high-cardinality fields like raw IDs.

Combining Layouts: the Pareto Pattern

In practice the strongest design for "many metrics from one source" is a hybrid. Store everything in a flexible key-value column for completeness, then promote the small set of frequently queried, high-density metrics into dedicated scalar columns using MATERIALIZED:

ALTER TABLE metrics_map
    ADD COLUMN cpu_user Float64 MATERIALIZED metrics['cpu_user'];

For the array layout the extraction expression uses indexOf:

ALTER TABLE metrics_arrays
    ADD COLUMN cpu_user Float64
    MATERIALIZED metrics.value[indexOf(metrics.name, 'cpu_user')];

MATERIALIZED columns are computed on insert (and during merges), so the hot metrics get the full speed of dedicated scalar columns and your dashboards query cpu_user directly, while rarely touched metrics stay in the map "catch-all." Note that existing rows are not backfilled automatically when you add the column — run ALTER TABLE metrics_map MATERIALIZE COLUMN cpu_user to rewrite historical data with the expression (this runs as a mutation). Adding a column in ClickHouse is cheap and a common ongoing practice — promote a metric to its own column once it becomes query-hot. A materialized view is the right tool when you instead need to reshape the data into a separate destination table or pre-aggregate it.

Partitioning Strategy

Metrics tables are append-heavy and time-ordered, which makes partitioning by time the standard choice. PARTITION BY toYYYYMM(ts) (monthly) is a safe default; use toYYYYMMDD(ts) (daily) only for very high-volume tables where you need partition-level TTL or DROP PARTITION at daily granularity.

Keep the partition count modest. Each partition is a separate set of parts, and too many partitions hurts merges, increases the part count, and slows queries — a few hundred active partitions is a reasonable ceiling, not tens of thousands. Do not partition by source_id or metric_name: high-cardinality partition keys create an explosion of tiny parts. Put those columns in the ORDER BY instead, which is what actually accelerates per-source and per-metric reads. For the relationship between PARTITION BY, ORDER BY, and the primary index, see the CREATE TABLE reference and MergeTree guide.

A typical ordering for single-source metrics is ORDER BY (source_id, ts) (or (source_id, metric_name, ts) for the narrow layout), which clusters each source's history together so range scans for one source read contiguous data.

Pre-Aggregation for Rollups

Raw per-second metrics are rarely what dashboards read; they read minute or hour rollups. Rather than re-aggregating raw data on every query, pre-aggregate with a SummingMergeTree (for additive counters) or an AggregatingMergeTree (for avg/min/max/quantile state), populated by a materialized view that rolls timestamps up with toStartOfMinute() or toStartOfHour():

CREATE TABLE metrics_1m
(
    minute       DateTime,
    source_id    UInt32,
    metric_name  LowCardinality(String),
    value_sum    Float64,
    value_count  UInt64
)
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (source_id, metric_name, minute);

CREATE MATERIALIZED VIEW metrics_1m_mv TO metrics_1m AS
SELECT
    toStartOfMinute(ts) AS minute,
    source_id,
    metric_name,
    sum(value)   AS value_sum,
    count()      AS value_count
FROM metrics_narrow
GROUP BY minute, source_id, metric_name;

For metrics that are emitted on a fixed schedule and need time-based rollup with retention rules, the purpose-built GraphiteMergeTree engine handles rollup configuration natively.

Best Practices

Match the layout to the data shape. Dense + fixed metric set → wide table. Sparse or dynamic → key-value (Map) or narrow. When unsure, start key-value and promote hot metrics to columns.
Always use LowCardinality(String) for stored metric names. It saves space and accelerates filtering and grouping.
Partition by time, not by source or metric. Monthly is a good default; high-cardinality partition keys produce too many parts.
Put source_id (and metric_name for narrow tables) in ORDER BY, not in PARTITION BY.
Promote hot metrics with MATERIALIZED columns instead of forcing every metric into the wide schema up front. Adding columns later is cheap.
Pre-aggregate rollups with SummingMergeTree/AggregatingMergeTree + a materialized view so dashboards read summaries, not raw points.
Batch inserts. Insert in large blocks (tens of thousands of rows or more) rather than row-by-row; many tiny inserts create many small parts and overload background merges.

Common Issues

Too many columns. Around 1,000 columns works with default settings; pushing toward 10,000 requires tuning max_query_size and max_ast_elements (and significant RAM for inserts and merges). If you are near these limits, you almost certainly want a key-value layout instead of an ever-wider table.
Slow heavy aggregations on Map/array columns. Aggregating a value extracted from a map per row is much slower than aggregating a scalar column. If a map-stored metric becomes query-hot, promote it to a MATERIALIZED scalar column.
Merge pressure from sparse wide tables. Sparse wide tables store a default for every absent metric, inflating storage and making merges the bottleneck. Switch sparse data to a key-value layout.
High-cardinality partition keys. Partitioning by source_id/metric_name creates a flood of tiny parts and degrades everything. Keep those in ORDER BY.
Wrong rollup engine. Use SummingMergeTree only for additive metrics; for averages, min/max, or quantiles use AggregatingMergeTree with the matching -State/-Merge aggregate functions.

How Pulse Helps

Pulse (pulse.support) monitors ClickHouse clusters and surfaces exactly the symptoms a poorly chosen metrics schema produces: runaway part counts from over-partitioning, merge backlogs from sparse wide tables or too-small inserts, and tables whose ORDER BY does not match the query patterns hitting them. It flags partition-key cardinality problems and unmerged-part growth before they turn into query latency, so you can correct a metrics schema while the table is still small rather than after it has grown to billions of rows. Pulse is run by ClickHouse and search infrastructure experts who can advise on layout choices for high-volume metric workloads.

Frequently Asked Questions

Q: Wide or narrow table for many metrics from one source?

If the metric set is fixed and most metrics are present at each timepoint (dense), a wide table — one column per metric — gives the best query performance because ClickHouse only reads the columns a query touches. If metrics are sparse or appear dynamically, a narrow (row-per-metric) or key-value (Map) layout avoids schema churn and storage waste. A hybrid — key-value storage with hot metrics promoted to MATERIALIZED columns — is often the best of both.

Q: How many columns can a ClickHouse table have?

Roughly 1,000 columns work with default settings. Tables approaching 10,000 columns are possible but need increased max_query_size and max_ast_elements plus substantial RAM for inserts and merges. At that scale a key-value layout is usually the better design.

Q: Should I use Map or Nested arrays for metrics?

They are functionally close — Map(K, V) is stored as Array(Tuple(K, V)) under the hood. Map gives cleaner metrics['name'] access and supports bloom-filter skip indexes on mapKeys()/mapValues(); the Nested(name, value) form uses indexOf() for lookups. Use whichever fits your access pattern; wrap the key/name in LowCardinality(String) either way.

Q: Why use LowCardinality(String) for metric names?

Because the number of distinct metric names is small relative to the number of rows. LowCardinality stores an integer key per row plus a shared dictionary, cutting storage and speeding up filters and GROUP BY on the name. It is the same encoding ClickHouse's TimeSeries engine uses for metric_name.

Q: How should I partition a metrics table?

Partition by time — toYYYYMM(ts) (monthly) is a good default, daily only for very high-volume tables needing fine-grained TTL or DROP PARTITION. Never partition by source_id or metric_name; high-cardinality partition keys create too many small parts. Keep those columns in ORDER BY instead.

Q: How do I make a metric from a Map queryable as fast as a real column?

Add a MATERIALIZED column that extracts it, e.g. ADD COLUMN cpu_user Float64 MATERIALIZED metrics['cpu_user']. It is computed on insert, so new rows gain full scalar-column performance. Existing rows are not backfilled automatically — run ALTER TABLE ... MATERIALIZE COLUMN cpu_user to populate historical data via a background mutation. Promote a metric this way once it becomes query-hot.