Slow queries rarely announce themselves clearly. A query that takes 2 ms at baseline can take 30 seconds under concurrent load, after table bloat accumulates, or when statistics go stale. This guide covers every major tool and technique for identifying slow queries and understanding their causes, followed by the most common fixes.
Slow Query Logging
PostgreSQL does not log slow queries by default. The starting point is log_min_duration_statement, which logs any statement exceeding a time threshold.
# postgresql.conf
# Log queries taking longer than 1 second
log_min_duration_statement = 1000
# Log queries taking longer than 500ms (tighter, for tuning sessions)
log_min_duration_statement = 500
# Log every query (debugging only — extremely noisy in production)
log_min_duration_statement = 0
# Disabled (PostgreSQL default)
log_min_duration_statement = -1
This parameter does not require a server restart — apply with SELECT pg_reload_conf();. You can also set it per-session or per-role:
-- Temporary, for current session
SET log_min_duration_statement = 500;
-- Permanent, for a specific role
ALTER ROLE myapp SET log_min_duration_statement = 1000;
log_duration is a different parameter that logs the duration of every completed statement with no threshold. It is too noisy for production use. log_statement = 'all' logs the text of every statement. Neither is a substitute for log_min_duration_statement for slow query diagnosis.
For high-traffic servers on PostgreSQL 13+, log_min_duration_sample combined with log_statement_sample_rate logs only a random fraction of qualifying queries, giving representative data without flooding the log.
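A sketch of a sampled configuration (illustrative thresholds; log_min_duration_sample only takes effect when it is below log_min_duration_statement, and anything over the statement threshold is still always logged):

```ini
# postgresql.conf — illustrative values
log_min_duration_statement = 2000   # Always log statements over 2 seconds
log_min_duration_sample = 200       # Sampling candidates: statements over 200ms
log_statement_sample_rate = 0.10    # Log ~10% of the 200ms-2s candidates
```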
Recommended logging configuration
log_min_duration_statement = 1000 # Adjust to 200-500ms for detailed tuning
log_line_prefix = '%m [%p] %q%u@%d ' # Timestamp, PID, user, database
log_lock_waits = on # Log lock wait events
deadlock_timeout = 1s # Threshold for lock wait logging
log_checkpoints = on # Log checkpoint activity
log_temp_files = 0 # Log any temp file creation (sort/hash spills)
log_destination = 'csvlog' # Structured format for log parsers
logging_collector = on
pg_stat_statements: Aggregate Query Statistics
pg_stat_statements is the most useful tool for identifying slow queries in a running instance. It tracks execution statistics for every distinct query shape (normalized — literal values become $1, $2, etc.) across all executions since the last reset.
Enabling it
It requires shared memory allocated at startup, so it needs a server restart:
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000 # Distinct query shapes to track
pg_stat_statements.track = all # Track nested queries; 'top' is default
compute_query_id = on # PostgreSQL 14+; the default 'auto' also enables it when pg_stat_statements is loaded
track_io_timing = on # Enable for blk_read_time / blk_write_time columns
After restarting, create the extension once per database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
Finding the biggest load contributors
By total execution time — the queries consuming the most cumulative database time:
SELECT
query,
calls,
round(total_exec_time::numeric, 2) AS total_ms,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(
100.0 * total_exec_time / sum(total_exec_time) OVER (),
2
) AS pct_of_total,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
By mean execution time — the slowest individual calls:
SELECT
query,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(max_exec_time::numeric, 2) AS max_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 50
ORDER BY mean_exec_time DESC
LIMIT 10;
High stddev_exec_time relative to mean_exec_time signals inconsistent performance — the query sometimes runs fast and sometimes does not. This often indicates lock contention, autovacuum interference, or plan instability.
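One way to surface these directly is to filter on the ratio — a sketch, where the 2x threshold and the 50-call floor are arbitrary starting points:

```sql
-- Queries whose runtime varies widely across calls
SELECT
    query,
    calls,
    round(mean_exec_time::numeric, 2)   AS mean_ms,
    round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 50
  AND stddev_exec_time > 2 * mean_exec_time
ORDER BY stddev_exec_time DESC
LIMIT 10;
```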
By disk reads — queries with poor cache hit ratios:
SELECT
query,
calls,
shared_blks_read,
shared_blks_hit,
round(
100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0),
1
) AS cache_hit_pct,
round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 1000
ORDER BY cache_hit_pct ASC
LIMIT 10;
By temp file usage — queries spilling to disk (insufficient work_mem):
SELECT
query,
calls,
temp_blks_written,
round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 10;
After a tuning change, reset the statistics to get a clean baseline:
SELECT pg_stat_statements_reset();
pg_stat_activity: Queries Running Right Now
While pg_stat_statements gives you historical aggregates, pg_stat_activity shows what every backend process is doing at this moment.
Finding long-running queries
SELECT
pid,
now() - query_start AS duration,
state,
wait_event_type,
wait_event,
left(query, 120) AS query
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < now() - interval '30 seconds'
ORDER BY query_start ASC;
Detecting idle-in-transaction connections
A connection in the idle in transaction state has an open transaction but is not executing any SQL. This is dangerous: it holds locks and prevents autovacuum from reclaiming dead rows on affected tables.
SELECT
pid,
now() - xact_start AS transaction_duration,
state,
client_addr,
application_name,
left(query, 120) AS last_query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'idle in transaction (aborted)')
ORDER BY xact_start ASC;
Terminate these automatically by setting a timeout:
# postgresql.conf
idle_in_transaction_session_timeout = 5min
Cancelling or terminating a backend
-- Cancel the current query (connection remains open)
SELECT pg_cancel_backend(pid);
-- Terminate the connection entirely
SELECT pg_terminate_backend(pid);
Wait event analysis
When backends are blocked — not running slowly, but waiting — wait events tell you what they are waiting for:
SELECT
wait_event_type,
wait_event,
count(*) AS backends
FROM pg_stat_activity
WHERE state != 'idle'
GROUP BY wait_event_type, wait_event
ORDER BY backends DESC;
Key wait events:
| Type | Event | Meaning |
|---|---|---|
| Lock | relation, tuple, transactionid | Lock contention |
| IO | DataFileRead | Disk read (cache miss) |
| IO | WALWrite | Waiting for WAL flush |
| Client | ClientRead | Waiting for client to send data |
EXPLAIN ANALYZE: Diagnosing a Specific Query
Once you identify which query is slow, EXPLAIN ANALYZE shows the actual execution plan with real timing and row counts alongside the planner's estimates.
-- The standard form for performance diagnosis
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;
-- Full output including planning-time settings and WAL stats
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS, WAL) SELECT * FROM orders WHERE customer_id = 42;
-- JSON format for visualizer tools
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN ANALYZE executes the query. For DML, wrap it in a transaction to roll back:
BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status = 'shipped' WHERE id = 1;
ROLLBACK;
What to look for
Sequential scans on large tables with high filter selectivity:
Seq Scan on orders (cost=0.00..9821.00 rows=1 width=44) (actual time=0.012..892.430 rows=234 loops=1)
Filter: (customer_id = 42)
Rows Removed by Filter: 4999766
Nearly 5 million rows read to return 234. This is a missing index.
Row estimation errors — when the planner estimated 1 row and found 234, it likely chose a suboptimal plan. Compare the estimated row count (the rows= figure inside the cost parentheses) to the actual row count (the rows= figure inside the actual time parentheses). A 10x or greater mismatch warrants investigation.
Sort spills to disk:
Sort Method: external merge Disk: 48256kB
The sort did not fit in work_mem. Increase it and re-test.
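A sketch of the re-test loop, reusing the orders table from the earlier examples — the 128MB figure is an assumption; pick a value comfortably above the observed spill size:

```sql
-- Session-scoped only; does not affect other connections
SET work_mem = '128MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY created_at DESC;
RESET work_mem;
```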
Hash join batches > 1:
Hash (actual time=245.3..245.3 rows=500000 loops=1)
Buckets: 131072 Batches: 8 Memory Usage: 4096kB
Batches: 8 means the hash table spilled to disk 8 times. work_mem is too small for this join.
Nested Loop with high loop count — check actual time × loops for the inner node. If the inner node shows actual time=0.09 rows=1 loops=50000, total inner time is 4,500 ms.
See the EXPLAIN ANALYZE guide for full coverage of plan node types and reading the output.
auto_explain: Automatic plan capture in production
The auto_explain extension logs query plans for slow queries automatically, without modifying application code:
# postgresql.conf (requires restart)
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000 # Log plans for queries over 1 second
auto_explain.log_analyze = true # Include actual timing (adds overhead)
auto_explain.log_buffers = true # Include buffer statistics
auto_explain.log_format = json # JSON format for visualizer tools
Warning: log_analyze = true instruments every qualifying query with per-node timing. This adds measurable overhead on high-throughput systems. Use log_timing = false to capture row counts without timing cost, or set sample_rate (0.0–1.0) to capture a fraction of qualifying queries.
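A lower-overhead variant might look like this (illustrative values; sample_rate applies on top of the duration threshold):

```ini
# postgresql.conf — reduced-overhead auto_explain (illustrative)
auto_explain.log_min_duration = 1000
auto_explain.log_analyze = true
auto_explain.log_timing = false    # Per-node row counts without per-node timers
auto_explain.sample_rate = 0.1     # Instrument ~10% of qualifying queries
```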
Autovacuum and Table Bloat
PostgreSQL's MVCC model retains old row versions ("dead tuples") after updates and deletes until autovacuum reclaims them. When autovacuum cannot keep up with write volume, dead tuples accumulate into table bloat.
How bloat causes slow queries
- Sequential scans read dead tuples alongside live rows, inflating I/O
- Indexes grow larger as they retain entries pointing to dead heap rows
- More data to read means more buffer cache pressure and longer scans
- Bloat can cause 2–10x slowdowns on queries scanning large table fractions
Checking dead tuple accumulation
SELECT
schemaname,
relname AS table_name,
n_live_tup,
n_dead_tup,
round(
n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100,
2
) AS dead_pct,
last_autovacuum,
last_autoanalyze,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
Tables with dead_pct > 10% need attention. For a more precise measurement, use the pgstattuple extension:
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Approximate (fast — no full scan)
SELECT * FROM pgstattuple_approx('orders');
-- Exact (full table scan — can be slow)
SELECT * FROM pgstattuple('orders');
Running VACUUM manually
-- Reclaim dead tuple space
VACUUM orders;
-- Reclaim dead tuples and refresh planner statistics in one pass
VACUUM ANALYZE orders;
VACUUM FULL reclaims on-disk space but acquires an exclusive lock that blocks all reads and writes. For production tables where downtime is not acceptable, use pg_repack or pg_squeeze instead.
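For example, a pg_repack invocation might look like this — the database name mydb is a placeholder, and the pg_repack extension must already be installed in that database:

```shell
# Rebuild the orders table online, holding only brief locks
pg_repack --table orders mydb
```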
Tuning autovacuum for high-churn tables
The default autovacuum_vacuum_scale_factor = 0.2 triggers vacuum when 20% of rows are dead. On a 50-million-row table, that is 10 million dead tuples before vacuum runs. Tune per-table for tables with heavy writes:
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- Trigger at 1% dead rows
autovacuum_vacuum_threshold = 1000,
autovacuum_analyze_scale_factor = 0.005,
autovacuum_vacuum_cost_delay = 2 -- ms; lower = faster vacuum (more I/O)
);
Missing and Unused Indexes
Identifying missing indexes
pg_stat_user_tables tracks cumulative scan counts. High sequential scan counts on large tables often signal missing indexes:
SELECT
schemaname,
relname AS table_name,
seq_scan,
idx_scan,
n_live_tup AS live_rows,
round(
seq_scan::numeric / NULLIF(seq_scan + idx_scan, 0) * 100,
2
) AS seq_scan_pct,
pg_size_pretty(pg_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE n_live_tup > 10000
AND seq_scan > 100
ORDER BY seq_scan * n_live_tup DESC
LIMIT 20;
Tables with high seq_scan_pct and large n_live_tup are candidates for new indexes. Cross-reference with pg_stat_statements to identify which specific queries are driving those scans.
Identifying unused indexes
Unused indexes waste disk space and add overhead to every write operation:
SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE 'pg_toast_%'
AND indexrelid NOT IN (
SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
)
ORDER BY pg_relation_size(indexrelid) DESC;
Before dropping any index, verify how long statistics have been accumulating — an index with idx_scan = 0 is safe to drop only if the stats cover a representative period of production traffic:
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
Index types
B-tree (default): Equality, range, prefix LIKE, IS NULL, ORDER BY. Use for almost all cases.
BRIN: Stores min/max per block range. Tiny footprint. Ideal for append-only time-series tables where rows are inserted in roughly physical order. A BRIN index on a billion-row table can be under 1 MB.
CREATE INDEX idx_events_ts ON events USING BRIN(occurred_at);
GIN: Arrays, jsonb, tsvector (full-text search), trigrams (pg_trgm for LIKE '%pattern%').
-- JSONB containment
CREATE INDEX idx_events_metadata ON events USING GIN(metadata);
-- LIKE '%pattern%' (requires pg_trgm)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_name_trgm ON users USING GIN(name gin_trgm_ops);
GiST: Geometric types, PostGIS, range type exclusion constraints.
Partial indexes: Index only the rows matching a WHERE condition. Smaller, faster to maintain, and often allows the planner to use the index where a full index would not be selective enough.
-- Index only pending jobs, not completed ones
CREATE INDEX idx_jobs_pending ON jobs(created_at)
WHERE status = 'pending';
Expression indexes: Index the result of a function or expression.
-- Case-insensitive email lookup
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
Covering indexes (PostgreSQL 11+): INCLUDE adds columns to index leaf pages without making them part of the key, enabling Index Only Scans when those columns appear in the SELECT list.
CREATE INDEX idx_orders_user ON orders(user_id)
INCLUDE (status, total_amount, created_at);
Always use CONCURRENTLY in production to avoid blocking writes:
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
Lock Contention
Lock contention causes latency spikes that are unrelated to query complexity — a fast query can take 30 seconds if it is waiting for a lock. log_lock_waits = on logs every lock wait exceeding deadlock_timeout (default: 1s) to the PostgreSQL log.
Finding blocked queries
SELECT
blocked.pid AS blocked_pid,
left(blocked.query, 100) AS blocked_query,
now() - blocked.query_start AS blocked_duration,
blocking.pid AS blocking_pid,
left(blocking.query, 100) AS blocking_query,
blocking.state AS blocking_state
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE pg_blocking_pids(blocked.pid) IS DISTINCT FROM '{}'::int[]
ORDER BY blocked_duration DESC;
pg_blocking_pids(pid) (PostgreSQL 9.6+) returns the PIDs blocking a given backend. Avoid calling it in tight monitoring loops — it acquires a lock on the lock manager.
Preventing runaway lock waits
-- For a session
SET lock_timeout = '5s';
-- Within a specific transaction
BEGIN;
SET LOCAL lock_timeout = '2s';
UPDATE orders SET status = 'shipped' WHERE id = 999;
COMMIT;
Key Configuration Settings
work_mem
The maximum memory per sort or hash operation per query. A complex query may use this budget several times simultaneously, and each parallel worker gets its own allocation.
Default is 4 MB — appropriate for small OLTP queries but too small for sorting or joining large tables.
work_mem = 32MB # Conservative default for OLTP workloads
Spills to disk appear in EXPLAIN as Sort Method: external merge Disk: or Hash Batches: N > 1. Increase work_mem for those sessions or roles:
SET work_mem = '256MB'; -- Session-level
ALTER ROLE reporting SET work_mem = '256MB'; -- Role-level
Setting it globally too high risks OOM when many connections run concurrent operations. Start conservatively and increase per-role.
effective_cache_size
Not a memory allocation — a hint to the query planner about the total memory available for caching (shared buffers + OS page cache). A correct value helps the planner favor index scans over sequential scans on large tables.
effective_cache_size = 12GB # 75% of RAM on a dedicated 16GB server
If set too low, the planner under-estimates index scan efficiency and may choose sequential scans instead.
random_page_cost
The planner's estimate of random I/O cost relative to sequential I/O. Default is 4.0, designed for spinning HDDs. For SSDs:
random_page_cost = 1.1 # Local NVMe SSD
random_page_cost = 1.5 # Network-attached SSD (EBS, cloud block storage)
Lowering it on spinning disks has the opposite failure mode: the planner over-uses indexes and pays for random I/O the hardware cannot serve quickly.
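When storage tiers differ within one cluster, the setting can also be overridden per tablespace (fast_nvme is a hypothetical tablespace name):

```sql
-- Applies only to relations stored in this tablespace
ALTER TABLESPACE fast_nvme SET (random_page_cost = 1.1);
```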
enable_seqscan (debugging only)
Never set globally. Use per-session to force the planner away from sequential scans and reveal alternative plans:
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
SET enable_seqscan = on;
If this reveals a faster plan, the real fix is stale statistics, a missing index, or misconfigured planner settings.
Common Causes and Fixes
Missing index
The most common cause. Identify via pg_stat_user_tables.seq_scan and pg_stat_statements. Add an appropriate index type and run ANALYZE afterward.
Non-sargable predicates
Wrapping a column in a function prevents index use:
-- BAD: cannot use an index on created_at
SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
-- GOOD: range predicate on the raw column uses the index
SELECT * FROM orders
WHERE created_at >= '2025-01-15' AND created_at < '2025-01-16';
When filtering on a function result is unavoidable, create an expression index on that exact expression.
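Continuing the DATE(created_at) example above — a sketch, assuming created_at is timestamp without time zone (DATE() applied to timestamptz is not immutable and cannot be indexed directly):

```sql
-- The index expression must match the query's expression exactly
CREATE INDEX idx_orders_created_date ON orders (DATE(created_at));
-- Now this predicate can use the index:
-- SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
```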
Stale statistics
After large bulk loads or deletions, the planner's estimates may be far off. Run ANALYZE to refresh statistics:
ANALYZE orders;
For columns with skewed distributions, increase the per-column statistics target:
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
N+1 queries
Appears in pg_stat_statements as a query with very high calls and low mean_exec_time. Fix at the application layer: use JOINs, IN (...) preloading, or ORM eager loading.
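As a sketch of the SQL-level fix, the per-row lookups collapse into a single batched query:

```sql
-- N+1 shape: one round trip per customer
--   SELECT * FROM orders WHERE customer_id = 1;
--   SELECT * FROM orders WHERE customer_id = 2;
--   ... repeated N times

-- Batched replacement: one round trip for the whole set
SELECT * FROM orders WHERE customer_id = ANY(ARRAY[1, 2, 3]);
```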
Table bloat
Detected via pg_stat_user_tables.n_dead_tup. Fix by tuning autovacuum or running VACUUM ANALYZE manually. For severe bloat on large tables without maintenance windows, use pg_repack.
work_mem too small (sort/hash spills)
Detected via pg_stat_statements.temp_blks_written or Sort Method: external merge in EXPLAIN output. Increase work_mem for affected roles.
Idle-in-transaction connections
Detected via pg_stat_activity where state = 'idle in transaction'. These hold locks that block other queries. Set idle_in_transaction_session_timeout and fix the application to always close transactions promptly.
Hot standby query conflicts on replicas
On read replicas, queries are cancelled when the primary sends WAL that conflicts with data the replica is reading. Symptom: ERROR: canceling statement due to conflict with recovery.
# On the replica
max_standby_streaming_delay = 30s
hot_standby_feedback = on # Prevents many conflicts, but can increase bloat on the primary
Too many connections
At hundreds to thousands of connections, the memory and CPU overhead of idle backend processes degrades cluster-wide performance. Each PostgreSQL backend consumes 5–10 MB even when idle. The fix is a connection pooler — see PostgreSQL Connection Pooling with PgBouncer.
pgBadger for Log Analysis
pgBadger generates detailed HTML reports from PostgreSQL log files. It is useful for analyzing slow query patterns across hours or days of historical logs without querying the live database. Version 13.0 was released in December 2024.
Setup
pgBadger requires structured logs:
# postgresql.conf
log_destination = 'csvlog'
logging_collector = on
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_min_duration_statement = 200
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 250ms
Usage
# Install (Debian/Ubuntu)
sudo apt install pgbadger
# Generate HTML report from a log file
pgbadger /var/log/postgresql/postgresql-2025-01-15.log -o report.html
# Parallel processing for large logs
pgbadger --jobs 4 /var/log/postgresql/*.log.gz -o report.html
# CSV log format (recommended — cleaner parsing)
pgbadger --format csv /var/log/postgresql/postgresql-*.csv -o report.html
pgBadger reports include: slowest queries by max and total time, most-called queries, queries generating the most temporary files, lock wait events, deadlocks, and checkpoint activity.
Diagnostic Workflow
When slow queries are reported:
- Check `pg_stat_activity` for currently running slow or blocked queries and wait events.
- Query `pg_stat_statements` ordered by `total_exec_time DESC` to find the biggest load contributors since the last stats reset.
- Run `EXPLAIN (ANALYZE, BUFFERS)` on the top offending queries. Look for sequential scans on large tables, sort spills, hash join batches > 1, and row estimation errors.
- Check `pg_stat_user_tables` for dead tuple bloat (`dead_pct`) and missing index signals (`seq_scan` counts).
- Check `pg_stat_user_indexes` for unused indexes consuming write overhead.
- Review logs for lock wait events, deadlocks, and temp file creation.
- Apply fixes: add indexes, run `VACUUM ANALYZE`, tune `work_mem`, tune autovacuum, add a connection pooler.