When a query violates PostgreSQL's grouping rules, you will see an error like:
ERROR: column "orders.customer_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT customer_name, SUM(amount) FROM orders;
^
SQLSTATE: 42803
The condition name is grouping_error, SQLSTATE 42803, and it means a column referenced in SELECT, ORDER BY, or HAVING is neither aggregated nor listed in the GROUP BY clause.
What This Error Means
SQLSTATE 42803 belongs to PostgreSQL error class 42 — "Syntax Error or Access Rule Violation". Despite the class name, this is a semantic error caught during query analysis, not a runtime error. PostgreSQL raises it during the planning phase before any data is touched, so no rows are read and no transaction state is affected. You can retry the corrected query immediately.
The rule PostgreSQL enforces is a core property of the relational model: when you use GROUP BY, the query collapses multiple rows into a single output row per group. Every column in SELECT, ORDER BY, and HAVING must be either (a) listed in GROUP BY — meaning it has the same value for all rows in the group — or (b) wrapped in an aggregate function (SUM, COUNT, MAX, MIN, AVG, etc.) that reduces the group to a single value. Any column that is neither creates an ambiguity: which row's value should PostgreSQL use? PostgreSQL refuses to guess, unlike some other databases.
PostgreSQL is stricter here than MySQL's default mode (prior to ONLY_FULL_GROUP_BY being enabled). If you are migrating queries from MySQL or SQLite, you will frequently encounter 42803 for queries those databases accepted silently — sometimes returning arbitrary values from the group.
Common Causes
Missing column in GROUP BY — The most frequent cause: a
SELECTlist includes a plain column alongside an aggregate, but that column is not listed inGROUP BY.-- Fails: customer_name is not grouped SELECT customer_name, SUM(amount) FROM orders GROUP BY customer_id;Selecting a non-key column from a joined table — Joining tables and selecting columns from the joined table without including them in
GROUP BY.-- Fails: p.product_name is not in GROUP BY SELECT p.product_name, COUNT(o.id) FROM orders o JOIN products p ON o.product_id = p.id GROUP BY o.product_id;ORDER BY on a non-aggregated, non-grouped column —
ORDER BYhas the same grouping requirement asSELECT.-- Fails: created_at is not grouped or aggregated SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id ORDER BY created_at;HAVING clause referencing a bare column — Using a raw column (not an aggregate) in
HAVINGthat is not inGROUP BY.-- Fails: status is not in GROUP BY SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id HAVING status = 'active';Subquery or CTE with implicit grouping — A correlated subquery that performs aggregation but accidentally references an outer column in a way PostgreSQL cannot resolve.
How to Fix grouping_error
Add the missing column to GROUP BY — The simplest fix when you genuinely want that column as a grouping dimension.
-- Fixed: add customer_name to GROUP BY SELECT customer_name, SUM(amount) FROM orders GROUP BY customer_id, customer_name;Use an aggregate function — If you only need one representative value from the group and all rows have the same value, or you need max/min/first, wrap the column in an aggregate.
-- Use MAX when all rows in the group share the same value SELECT customer_id, MAX(customer_name), SUM(amount) FROM orders GROUP BY customer_id;Use DISTINCT ON instead of GROUP BY — When the goal is deduplication rather than aggregation,
DISTINCT ONcan be clearer.SELECT DISTINCT ON (customer_id) customer_id, customer_name, amount FROM orders ORDER BY customer_id, created_at DESC;Use a subquery or CTE to separate aggregation from selection — Aggregate first, then join back for the non-aggregated columns.
WITH totals AS ( SELECT customer_id, SUM(amount) AS total_amount FROM orders GROUP BY customer_id ) SELECT c.customer_name, t.total_amount FROM totals t JOIN customers c ON t.customer_id = c.id;Fix ORDER BY to use only grouped or aggregated columns — Replace the bare column with an aggregate or add it to
GROUP BY.-- Fixed: order by the aggregate SELECT customer_id, COUNT(*), MAX(created_at) AS latest_order FROM orders GROUP BY customer_id ORDER BY latest_order DESC;
Additional Information
- SQLSTATE class
42errors in PostgreSQL include related grouping and syntax violations:42P20(windowing_error, raised when a window function is used in a context it is not allowed), and42803is specific toGROUP BYviolations. - PostgreSQL has enforced this standard since version 7.4. There is no compatibility mode to relax it — unlike MySQL where
ONLY_FULL_GROUP_BYcan be disabled. - ORMs such as ActiveRecord (Rails), SQLAlchemy, and Django ORM generally construct valid
GROUP BYclauses automatically when using their aggregation APIs. You are most likely to see 42803 when writing raw SQL via.execute(),connection.raw(), or similar escape hatches. - The error is raised at parse/analysis time, before execution. It does not consume any query resources and does not affect transaction state.
- PostgreSQL's
FILTERclause on aggregates (COUNT(*) FILTER (WHERE status = 'active')) is a clean alternative toHAVINGwhen filtering within aggregation, and can help avoid the need to add columns toGROUP BYjust to satisfyHAVING.
Frequently Asked Questions
Why does this query work in MySQL but fail in PostgreSQL?
MySQL historically defaulted to allowing non-deterministic grouping, picking an arbitrary row's value for ungrouped columns. PostgreSQL (and the SQL standard) require every selected column to be deterministically derivable from the group. MySQL added the ONLY_FULL_GROUP_BY SQL mode in version 5.7.5, which enforces the same rule — but it may not be enabled on older installations.
Can I suppress this error if I know all rows in the group have the same value?
Not directly — PostgreSQL has no hint mechanism for this. The idiomatic solution is to wrap the column in MAX() or MIN(), which is semantically correct when all values in the group are identical. Some teams also use ANY_VALUE() available as an extension; in standard PostgreSQL, MAX() is the conventional idiom.
Does this error affect my transaction?
No. SQLSTATE 42803 is raised during query analysis, before execution begins. Your transaction remains open and unaffected. You can correct the query and retry within the same transaction.
I'm using GROUP BY with a primary key — why am I still getting 42803?
If you group by a table's primary key, PostgreSQL does not automatically infer that all other columns of that table are functionally determined. You must either list them in GROUP BY or aggregate them. The SQL standard's concept of "functional dependency" (where grouping by a primary key implies all columns of that table are available) is supported in PostgreSQL since version 9.1 — but only when the primary key constraint is present and the query planner can detect the dependency. If the error persists, verify the constraint exists and is not deferrable.