The argMax function in ClickHouse is an aggregate that returns the value of one column (arg) corresponding to the maximum value of another column (val) within a group. It is the canonical way to fetch "the latest record per user", "the highest-priced product per category", or any "value at the maximum" pattern, without subqueries or window functions. argMax skips NULL values in both arguments. When val ties across multiple rows, the returned arg is non-deterministic.
Syntax
argMax(arg, val)
val can be a tuple to break ties across multiple columns lexicographically. See the official ClickHouse argMax reference.
Parameters
| Name | Type | Description | Required |
|---|---|---|---|
arg |
Any type | The value to return. Most commonly an ID column, a label, or a tuple of fields. | Yes |
val |
Comparable: Int*, Float*, Date, DateTime, String, or Tuple(...) |
The value to maximize. The arg corresponding to the maximum is returned. |
Yes |
Return type: matches the type of arg. Available since v1.1.0.
Examples
Latest event per user
SELECT
user_id,
argMax(event_type, event_time) AS last_event,
max(event_time) AS last_seen
FROM events
GROUP BY user_id;
For each user, returns the event_type of the most recent event. Pairing with max(event_time) is a common pattern.
Highest-value row across multiple columns
SELECT
category,
argMax((product_id, name, price), price) AS top_product
FROM products
GROUP BY category;
Wrap multiple arg columns in a tuple to return them together. The result is Tuple(UUID, String, Decimal); access fields with top_product.1, top_product.2, etc.
Breaking ties with a tuple val
SELECT
user_id,
argMax(session_id, (event_time, session_id)) AS latest_session
FROM events
GROUP BY user_id;
When two rows share the same event_time, ClickHouse compares the tuple lexicographically and picks the row with the larger session_id, making the result deterministic.
Using argMax in a deduplication query
SELECT
id,
argMax(name, updated_at) AS name,
argMax(email, updated_at) AS email,
argMax(version, updated_at) AS version
FROM customers
GROUP BY id;
Collapses multiple versioned rows per id into the newest version. This is faster than a self-join and works well over ReplacingMergeTree tables before FINAL is applied.
Inside a materialized view with -State
CREATE MATERIALIZED VIEW user_latest_event_mv
ENGINE = AggregatingMergeTree
ORDER BY user_id
AS SELECT
user_id,
argMaxState(event_type, event_time) AS last_event_state
FROM events
GROUP BY user_id;
argMaxState stores the aggregation state. Read back with argMaxMerge(last_event_state). This is the standard pattern for incrementally maintaining "latest value per key" aggregates. See AggregatingMergeTree.
Common Issues and Gotchas
- Ties are non-deterministic. If multiple rows share the maximum
val, whichargis returned is unspecified and may change across queries. Break ties by passing aTupleasval. - NULLs in
argorvalare skipped. IfvalisNULLfor every row in a group,argMaxreturnsNULL. If only somevalvalues areNULL, those rows are ignored. - Don't confuse
argMax(arg, val)withmax(arg).maxreturns the largest value of a single column.argMaxreturns the value of one column where another column is largest. They answer different questions. - Tuple
argsyntax requires parentheses.argMax(a, b, c, ...)is not valid; useargMax((a, b, c), val)to return multiple columns. - The aggregating-state form is
argMaxState, notargMaxIfState. When stacking combinators, the order is-If, then-State/-Merge: e.g.argMaxIfState(arg, val, cond).
Performance Notes
argMax runs in a single pass over the data with O(1) memory per group: it tracks the current maximum val and its associated arg. It is roughly as fast as max on the val column alone. For very wide arg tuples, the per-group state is proportional to the tuple size, which can dominate memory on high-cardinality grouping keys. For "latest row per key" patterns over append-only data, ReplacingMergeTree with FINAL or AggregatingMergeTree with argMaxState are usually faster than ad-hoc argMax queries.
Frequently Asked Questions
Q: What's the difference between max and argMax in ClickHouse?
A: max(col) returns the largest value of a single column. argMax(arg, val) returns the value of arg corresponding to the maximum of val. Use argMax when you need an associated value (an ID, a label, a timestamp) at the maximum of some other column.
Q: How does argMax handle NULL values?
A: argMax skips rows where either arg or val is NULL. If every row in a group has NULL for val, the function returns NULL. When the maximum non-NULL val corresponds to a NULL arg, that row is skipped and the next-highest non-NULL row is used.
Q: What happens with argMax when there are multiple rows with the same maximum value?
A: The returned arg is non-deterministic - any of the tied rows may win, and the choice can change between query runs. To make ties deterministic, pass a tuple as val: argMax(arg, (val, tiebreaker)).
Q: Is there an equivalent function for finding the minimum?
A: Yes, `argMin(arg, val)` returns the arg corresponding to the minimum of val. Same syntax, same NULL handling, same tuple support.
Q: Can argMax return multiple columns at once?
A: Yes, wrap them in a tuple: argMax((id, name, price), updated_at). The result is a Tuple; access fields with .1, .2, .3 or destructure into separate columns.
Q: How do I use argMax in a materialized view?
A: Use argMaxState(arg, val) in the SELECT of an AggregatingMergeTree-backed materialized view, and read it back with argMaxMerge(state). This pattern incrementally maintains "latest value per key" without re-scanning the source table.
Pulse Monitors argMax-Heavy ClickHouse Workloads
"Latest record per key" queries built on argMax are common in event sourcing, user-profile lookups, and CDC pipelines. They are sensitive to merge lag and replica skew. Pulse provides AI-powered monitoring for ClickHouse, with automated root-cause analysis that correlates aggregate-query slowdowns to part counts, merge backlog, and replication state.
Related Reading
- ClickHouse argMin Function: the minimum counterpart of
argMax - ClickHouse max Function: the maximum of a single column
- ClickHouse any Function: pick any value from a group (no ordering)
- ClickHouse AggregatingMergeTree: incremental aggregation engine for
argMaxState/argMaxMerge - ClickHouse ReplacingMergeTree: alternative engine for "latest row per key" patterns
- ClickHouse MergeTree: foundation engine that determines primary-key pruning during
argMax - ClickHouse materialized view: how
-State/-Mergeaggregates compose with views