The "DB::Exception: Ambiguous column name" error in ClickHouse appears when the query references a column whose name exists in more than one source, and ClickHouse cannot determine which one you mean. This situation frequently arises after JOIN operations where both tables share a column name, or in subqueries that expose columns clashing with outer query names. The error code is AMBIGUOUS_COLUMN_NAME.
Impact
The query is rejected before any data is read. This is a query-analysis-phase error. If queries are generated dynamically, this may intermittently appear when table schemas change and introduce overlapping column names.
Common Causes
- Joining tables with identically named columns -- common columns like
id,timestamp,type, orstatusappear in many tables. Without qualification, ClickHouse cannot resolve them. - Using UNION ALL or subqueries that produce duplicate column names -- a subquery might expose a column that collides with a column from the outer query context.
- Self-joins without unique aliases -- joining a table to itself means every column exists twice.
- Schema changes introducing new columns -- a previously unambiguous query can break when an ALTER TABLE adds a column whose name matches one in a joined table.
- SELECT * from joined tables -- pulling all columns from both tables naturally surfaces naming conflicts.
Troubleshooting and Resolution Steps
Add table aliases and qualify all column references:
-- Error: 'status' exists in both tables SELECT status FROM orders JOIN shipments ON orders.id = shipments.order_id; -- Fixed SELECT o.status FROM orders AS o JOIN shipments AS s ON o.id = s.order_id;Rename columns in subqueries to avoid collisions:
SELECT t.id, agg.total FROM transactions AS t JOIN ( SELECT account_id, sum(amount) AS total FROM transactions GROUP BY account_id ) AS agg ON t.account_id = agg.account_id;Use USING for natural join columns to avoid duplication:
-- USING resolves 'user_id' as a single column SELECT * FROM orders JOIN returns USING (user_id, product_id);*Explicitly list columns instead of SELECT :
SELECT o.id AS order_id, o.amount, c.name AS customer_name FROM orders AS o JOIN customers AS c ON o.customer_id = c.id;Check for recent schema changes if the query previously worked:
DESCRIBE TABLE orders; DESCRIBE TABLE shipments;Compare column names to identify newly overlapping names.
Best Practices
- Qualify every column reference with a table alias in any query involving more than one table.
- Use explicit column lists in SELECT, never
SELECT *in production code. - Adopt column naming conventions that include the entity name (e.g.,
order_status,shipment_status) to minimize collisions. - When modifying table schemas, review existing queries that join the modified table to check for new ambiguities.
Frequently Asked Questions
Q: What is the difference between AMBIGUOUS_COLUMN_NAME and AMBIGUOUS_IDENTIFIER?
A: AMBIGUOUS_COLUMN_NAME is used by the legacy query analyzer when a column name is ambiguous. AMBIGUOUS_IDENTIFIER is the equivalent error from the newer analyzer. Both have the same root cause and the same fix: qualify column names with table aliases.
Q: Does this error occur with ARRAY JOIN?
A: It can, if the ARRAY JOIN produces a column whose name collides with an existing column in the table. Use an alias for the array-joined column: ARRAY JOIN arr AS arr_element.
Q: Can settings affect this error?
A: The joined_subquery_requires_alias setting (enabled by default) requires subqueries in JOIN to have aliases, which helps prevent ambiguity. Disabling it can make this error more likely.
Q: Will USING prevent this error entirely?
A: USING resolves ambiguity only for the columns listed in it. Other columns shared between the two tables will still be ambiguous if referenced without qualification.