Reading PostgreSQL EXPLAIN Execution Plans: A Practical Guide

Q: What is the difference between EXPLAIN and EXPLAIN ANALYZE in PostgreSQL?

EXPLAIN shows the planner's chosen plan with estimated costs and row counts without running the statement. EXPLAIN ANALYZE actually executes the statement and adds real timings, actual row counts, and loop counts, so you can compare estimates against reality. Use EXPLAIN ANALYZE for troubleshooting and plain EXPLAIN when you only need the shape of the plan.

Q: Does EXPLAIN ANALYZE modify data?

Yes, for data-modifying statements. EXPLAIN ANALYZE runs the query, so an INSERT , UPDATE , DELETE , or MERGE will actually write rows. Wrap it in BEGIN; ... ROLLBACK; to analyze the plan without persisting the changes. Plain SELECT only costs the execution time.

Q: How do I read the cost numbers in an EXPLAIN plan?

Each node shows cost=startup..total . The startup cost is the work before the first row is returned, the total cost is the work to return all rows, and both are in abstract units where 1.0 is roughly one sequential page read - not milliseconds. The planner picks the plan with the lowest total cost for the whole tree.

Q: What does it mean when estimated rows differ greatly from actual rows?

A large gap between the rows estimate and the actual ... rows count means the planner's statistics are stale, the sample is too small, or columns are correlated and the planner assumed independence. Run ANALYZE , raise the per-column statistics target, or add extended statistics with CREATE STATISTICS . Bad estimates lead directly to bad join and scan choices.

Q: Is BUFFERS enabled by default in EXPLAIN ANALYZE?

Starting in PostgreSQL 18, EXPLAIN ANALYZE enables BUFFERS by default. On PostgreSQL 17 and earlier you must request it explicitly with EXPLAIN (ANALYZE, BUFFERS) . The BUFFERS output reports per-node shared/local hits and disk reads, which is the quickest way to see whether a node is served from cache or from disk.

Q: What is the loops value in EXPLAIN ANALYZE output?

loops is the number of times a plan node was executed. It is commonly greater than 1 on the inner side of a Nested Loop , which runs once per outer row. The reported actual time and rows are per loop, so multiply them by loops to get the node's total contribution.

EXPLAIN shows the execution plan the PostgreSQL planner chose for a statement: which scan method each table uses (sequential scan, index scan, and so on), the join algorithms that combine them, and the planner's cost and row estimates for every step. It is the first tool to reach for when a query is slow. EXPLAIN ANALYZE goes further and actually runs the query, adding real timings and row counts so you can compare the planner's guesses against reality.

EXPLAIN vs EXPLAIN ANALYZE

Plain EXPLAIN only plans the statement. It returns the planner's estimates without executing anything, so it is instant and safe to run on any query:

-- Estimated plan only - does not run the query
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

EXPLAIN ANALYZE executes the statement and reports the real time, row counts, and loop counts alongside the estimates. That makes it the better choice for troubleshooting, with one warning: it runs the query for real. For SELECT that only costs you the runtime. For INSERT, UPDATE, DELETE, or MERGE, the rows are actually written. Wrap data-modifying statements in a transaction and roll back so the analysis leaves no trace:

BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status = 'shipped' WHERE id = 99;
ROLLBACK;

Useful options go inside parentheses. BUFFERS reports per-node block I/O (cache hits versus disk reads), which is the fastest way to see where reads come from. Starting in PostgreSQL 18, EXPLAIN ANALYZE turns BUFFERS on by default; on PostgreSQL 17 and earlier you must request it. FORMAT JSON (or YAML/XML) emits a structured plan that visualization tools consume:

-- Full diagnostic on PG17 and earlier
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;

-- Structured output for tooling
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42;

Common Plan Node Types

A plan is a tree of nodes. Each node consumes rows from its children and passes rows to its parent; the topmost node produces the final result. These are the nodes you see most often:

Node	What it means
`Seq Scan`	Reads every row of the table. Fine for small tables; a red flag on large ones with a selective filter.
`Index Scan`	Walks an index, then fetches matching rows from the heap. Good when the filter is selective.
`Index Only Scan`	Answers entirely from the index without touching the heap. Requires a covering index and a recently vacuumed visibility map.
`Bitmap Index Scan` + `Bitmap Heap Scan`	Builds a bitmap of matching pages from one or more indexes, then reads those pages in physical order. Used for medium-selectivity filters.
`Nested Loop`	For each row from the outer input, probes the inner input. Fast when the outer side is small and the inner side is indexed.
`Hash Join`	Builds a hash table on the smaller input, then probes it. Strong for large, unindexed equi-joins.
`Merge Join`	Merges two inputs already sorted on the join key. Efficient for large pre-sorted sets.
`Sort`	Orders rows for `ORDER BY`, `Merge Join`, or `GROUP BY`. Spills to disk if it exceeds `work_mem`.
`Aggregate` / `HashAggregate` / `GroupAggregate`	Computes `count`, `sum`, `GROUP BY`, and similar.
`Gather`	Collects rows from parallel worker processes.

Choosing the right join strategy and getting the planner to use an index that already exists are the two changes that most often turn a Seq Scan plan into a fast one.

Reading the Numbers: Cost, Rows, Width

Every node carries an estimate block, for example cost=0.29..8.31 rows=1 width=237. The two cost numbers are startup cost and total cost, expressed in abstract units where 1.0 is roughly one sequential page read (set by seq_page_cost). They are not milliseconds.

Startup cost is the work before the first row is returned. A Sort has a high startup cost because it must read all input before emitting anything; a Seq Scan has near-zero startup.
Total cost is the work to return all rows. The planner picks the plan with the lowest total cost for the whole tree.
rows is the estimated number of rows the node emits.
width is the estimated average row size in bytes, which drives memory and sort estimates.

With EXPLAIN ANALYZE, each node also reports (actual time=0.015..0.042 rows=128 loops=1). The two actual time figures are time-to-first-row and time-to-last-row, in milliseconds, per loop. loops is how many times the node ran - common on the inner side of a Nested Loop, where the reported per-row times are per loop and you multiply by loops for the total.

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- Seq Scan on orders  (cost=0.00..1834.00 rows=131 width=237)
--                     (actual time=0.211..18.4 rows=128 loops=1)
--   Filter: (customer_id = 42)
--   Rows Removed by Filter: 99872

Rows Removed by Filter: 99872 is the tell here: the scan read 100,000 rows to return 128. An index on customer_id would convert this to an Index Scan.

Estimated vs Actual: Spotting Stale Statistics

The planner chooses a plan from estimates derived from table statistics in pg_statistic, refreshed by ANALYZE and autovacuum. When estimates are close to actuals, the plan is usually sound. When they diverge by an order of magnitude, the planner is working from a bad map and may pick a Nested Loop where a Hash Join would win, or a Seq Scan over a usable index.

Compare the rows estimate with actual ... rows on each node. A node estimating rows=10 but producing rows=10000 points to stale or insufficient statistics. The usual fixes:

-- Refresh statistics for one table
ANALYZE orders;

-- Increase sample detail for a skewed column, then re-analyze
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
ANALYZE orders;

Correlated columns are a separate trap: the planner assumes independence, so a filter like WHERE city = 'X' AND country = 'Y' underestimates badly when the columns are linked. Extended statistics fix that case:

CREATE STATISTICS orders_geo (dependencies) ON city, country FROM orders;
ANALYZE orders;

If the divergence persists after analyzing, the cause is often the query shape itself - an expression on an indexed column, an implicit type cast, or a non-sargable predicate that stops the index from being used.

Visualizing and Capturing Plans

Text plans for queries that join many tables become unreadable fast. Several free tools render the JSON output as an annotated tree and highlight the most expensive nodes:

explain.dalibo.com - the open-source PEV2 visualizer; paste the plan (and optionally the SQL) for a color-coded node tree.
explain.depesz.com - shows the plan as a table and flags rows where estimate and actual diverge.
pgMustard - scores a plan and gives specific tuning suggestions.

Feed these tools EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) output for the richest view. The harder problem in production is capturing the plan of the query that was actually slow, when it was slow, rather than re-running it later against a warm cache and different data. Pulse records execution plans automatically as queries run, links each plan to the statement and the surrounding metrics, and flags the regressions - a new Seq Scan, an estimate that drifted away from actuals, a sort spilling to disk - then recommends a concrete fix for a human to approve.

Frequently Asked Questions

Q: What is the difference between EXPLAIN and EXPLAIN ANALYZE in PostgreSQL?
A: EXPLAIN shows the planner's chosen plan with estimated costs and row counts without running the statement. EXPLAIN ANALYZE actually executes the statement and adds real timings, actual row counts, and loop counts, so you can compare estimates against reality. Use EXPLAIN ANALYZE for troubleshooting and plain EXPLAIN when you only need the shape of the plan.

Q: Does EXPLAIN ANALYZE modify data?
A: Yes, for data-modifying statements. EXPLAIN ANALYZE runs the query, so an INSERT, UPDATE, DELETE, or MERGE will actually write rows. Wrap it in BEGIN; ... ROLLBACK; to analyze the plan without persisting the changes. Plain SELECT only costs the execution time.

Q: How do I read the cost numbers in an EXPLAIN plan?
A: Each node shows cost=startup..total. The startup cost is the work before the first row is returned, the total cost is the work to return all rows, and both are in abstract units where 1.0 is roughly one sequential page read - not milliseconds. The planner picks the plan with the lowest total cost for the whole tree.

Q: Why does my plan use a Seq Scan instead of an Index Scan?
A: PostgreSQL uses a sequential scan when it estimates that reading the whole table is cheaper than the index plus heap fetches, which is correct for small tables or non-selective filters. If an index should win, check for stale statistics (run ANALYZE), a non-sargable predicate such as a function or type cast on the indexed column, or a filter that matches most of the table.

Q: What does it mean when estimated rows differ greatly from actual rows?
A: A large gap between the rows estimate and the actual ... rows count means the planner's statistics are stale, the sample is too small, or columns are correlated and the planner assumed independence. Run ANALYZE, raise the per-column statistics target, or add extended statistics with CREATE STATISTICS. Bad estimates lead directly to bad join and scan choices.

Q: Is BUFFERS enabled by default in EXPLAIN ANALYZE?
A: Starting in PostgreSQL 18, EXPLAIN ANALYZE enables BUFFERS by default. On PostgreSQL 17 and earlier you must request it explicitly with EXPLAIN (ANALYZE, BUFFERS). The BUFFERS output reports per-node shared/local hits and disk reads, which is the quickest way to see whether a node is served from cache or from disk.

Q: What is the loops value in EXPLAIN ANALYZE output?
A: loops is the number of times a plan node was executed. It is commonly greater than 1 on the inner side of a Nested Loop, which runs once per outer row. The reported actual time and rows are per loop, so multiply them by loops to get the node's total contribution.

EXPLAIN ANALYZE in PostgreSQL: Deeper dive into ANALYZE, BUFFERS, and the per-node diagnostic options.
Creating Indexes in PostgreSQL: Add the right index to turn a Seq Scan into an Index Scan.
Why Is My PostgreSQL Index Not Used: Predicates and casts that stop the planner from using an index.
PostgreSQL Join Strategies: When the planner picks Nested Loop, Hash Join, or Merge Join, and how to influence it.
Debugging Slow PostgreSQL Queries: Find the queries worth running EXPLAIN on in the first place.
PostgreSQL Performance Tuning: Configuration and query-level levers behind the plans.
Reading MySQL EXPLAIN Output: The equivalent workflow and node concepts in MySQL.