Slow queries rarely announce themselves clearly. A query that takes 2 ms at baseline can take 30 seconds under concurrent load, after table bloat accumulates, or when statistics go stale. This guide covers every major tool and technique for identifying slow queries and understanding their causes, followed by the most common fixes.
Slow Query Logging
PostgreSQL does not log slow queries by default. The starting point is log_min_duration_statement, which logs any statement exceeding a time threshold.
# postgresql.conf
# Log queries taking longer than 1 second
log_min_duration_statement = 1000
# Log queries taking longer than 500ms (tighter, for tuning sessions)
log_min_duration_statement = 500
# Log every query (debugging only — extremely noisy in production)
log_min_duration_statement = 0
# Disabled (PostgreSQL default)
log_min_duration_statement = -1
This parameter does not require a server restart — apply with SELECT pg_reload_conf();. You can also set it per-session or per-role:
-- Temporary, for current session
SET log_min_duration_statement = 500;
-- Permanent, for a specific role
ALTER ROLE myapp SET log_min_duration_statement = 1000;
log_duration is a different parameter that logs the duration of every completed statement with no threshold. It is too noisy for production use. log_statement = 'all' logs the text of every statement. Neither is a substitute for log_min_duration_statement for slow query diagnosis.
For high-traffic servers on PostgreSQL 13+, log_min_duration_sample combined with log_statement_sample_rate logs only a random fraction of qualifying queries, giving representative data without flooding the log.
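A sketch of a sampled configuration (illustrative thresholds; log_min_duration_sample only takes effect when it is below log_min_duration_statement, and anything over the statement threshold is still always logged):

```ini
# postgresql.conf — illustrative values
log_min_duration_statement = 2000   # Always log statements over 2 seconds
log_min_duration_sample = 200       # Sampling candidates: statements over 200ms
log_statement_sample_rate = 0.10    # Log ~10% of the 200ms-2s candidates
```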
Recommended logging configuration
log_min_duration_statement = 1000 # Adjust to 200-500ms for detailed tuning
log_line_prefix = '%m [%p] %q%u@%d ' # Timestamp, PID, user, database
log_lock_waits = on # Log lock wait events
deadlock_timeout = 1s # Threshold for lock wait logging
log_checkpoints = on # Log checkpoint activity
log_temp_files = 0 # Log any temp file creation (sort/hash spills)
log_destination = 'csvlog' # Structured format for log parsers
logging_collector = on
pg_stat_statements: Aggregate Query Statistics
pg_stat_statements is the most useful tool for identifying slow queries in a running instance. It tracks execution statistics for every distinct query shape (normalized — literal values become $1, $2, etc.) across all executions since the last reset.
Enabling it
It requires shared memory allocated at startup, so it needs a server restart:
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000 # Distinct query shapes to track
pg_stat_statements.track = all # Track nested queries; 'top' is default
compute_query_id = on # PostgreSQL 14+; the default 'auto' also enables it when pg_stat_statements is loaded
track_io_timing = on # Enable for blk_read_time / blk_write_time columns
After restarting, create the extension once per database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
Finding the biggest load contributors
By total execution time — the queries consuming the most cumulative database time:
SELECT
query,
calls,
round(total_exec_time::numeric, 2) AS total_ms,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(
100.0 * total_exec_time / sum(total_exec_time) OVER (),
2
) AS pct_of_total,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
By mean execution time — the slowest individual calls:
SELECT
query,
calls,
round(mean_exec_time::numeric, 2) AS mean_ms,
round(max_exec_time::numeric, 2) AS max_ms,
round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 50
ORDER BY mean_exec_time DESC
LIMIT 10;
High stddev_exec_time relative to mean_exec_time signals inconsistent performance — the query sometimes runs fast and sometimes does not. This often indicates lock contention, autovacuum interference, or plan instability.
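One way to surface these directly is to filter on the ratio — a sketch, where the 2x threshold and the 50-call floor are arbitrary starting points:

```sql
-- Queries whose runtime varies widely across calls
SELECT
    query,
    calls,
    round(mean_exec_time::numeric, 2)   AS mean_ms,
    round(stddev_exec_time::numeric, 2) AS stddev_ms
FROM pg_stat_statements
WHERE calls > 50
  AND stddev_exec_time > 2 * mean_exec_time
ORDER BY stddev_exec_time DESC
LIMIT 10;
```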
By disk reads — queries with poor cache hit ratios:
SELECT
query,
calls,
shared_blks_read,
shared_blks_hit,
round(
100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0),
1
) AS cache_hit_pct,
round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 1000
ORDER BY cache_hit_pct ASC
LIMIT 10;
By temp file usage — queries spilling to disk (insufficient work_mem):
SELECT
query,
calls,
temp_blks_written,
round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 10;
After a tuning change, reset the statistics to get a clean baseline:
SELECT pg_stat_statements_reset();
pg_stat_activity: Queries Running Right Now
While pg_stat_statements gives you historical aggregates, pg_stat_activity shows what every backend process is doing at this moment.
Finding long-running queries
SELECT
pid,
now() - query_start AS duration,
state,
wait_event_type,
wait_event,
left(query, 120) AS query
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < now() - interval '30 seconds'
ORDER BY query_start ASC;
Detecting idle-in-transaction connections
A connection in the idle in transaction state has an open transaction but is not executing any SQL. This is dangerous: it holds locks and prevents autovacuum from reclaiming dead rows on affected tables.
SELECT
pid,
now() - xact_start AS transaction_duration,
state,
client_addr,
application_name,
left(query, 120) AS last_query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'idle in transaction (aborted)')
ORDER BY xact_start ASC;
Terminate these automatically by setting a timeout:
# postgresql.conf
idle_in_transaction_session_timeout = 5min
Cancelling or terminating a backend
-- Cancel the current query (connection remains open)
SELECT pg_cancel_backend(pid);
-- Terminate the connection entirely
SELECT pg_terminate_backend(pid);
Wait event analysis
When backends are blocked — not running slowly, but waiting — wait events tell you what they are waiting for:
SELECT
wait_event_type,
wait_event,
count(*) AS backends
FROM pg_stat_activity
WHERE state != 'idle'
GROUP BY wait_event_type, wait_event
ORDER BY backends DESC;
Key wait events:
| Type | Event | Meaning |
|---|---|---|
| Lock | relation, tuple, transactionid | Lock contention |
| IO | DataFileRead | Disk read (cache miss) |
| IO | WALWrite | Waiting for WAL flush |
| Client | ClientRead | Waiting for client to send data |
EXPLAIN ANALYZE: Diagnosing a Specific Query
Once you identify which query is slow, EXPLAIN ANALYZE shows the actual execution plan with real timing and row counts alongside the planner's estimates.
-- The standard form for performance diagnosis
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;
-- Full output including planning-time settings and WAL stats
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS, WAL) SELECT * FROM orders WHERE customer_id = 42;
-- JSON format for visualizer tools
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN ANALYZE executes the query. For DML, wrap it in a transaction to roll back:
BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status = 'shipped' WHERE id = 1;
ROLLBACK;
What to look for
Sequential scans on large tables with high filter selectivity:
Seq Scan on orders (cost=0.00..9821.00 rows=1 width=44) (actual time=0.012..892.430 rows=234 loops=1)
Filter: (customer_id = 42)
Rows Removed by Filter: 4999766
Nearly 5 million rows read to return 234. This is a missing index.
Row estimation errors — when the planner estimated 1 row and found 234, it likely chose a suboptimal plan. Compare the estimated row count (the rows= figure inside the cost parentheses) to the actual row count (the rows= figure inside the actual time parentheses). A 10x or greater mismatch warrants investigation.
Sort spills to disk:
Sort Method: external merge Disk: 48256kB
The sort did not fit in work_mem. Increase it and re-test.
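A sketch of the re-test loop, reusing the orders table from the earlier examples — the 128MB figure is an assumption; pick a value comfortably above the observed spill size:

```sql
-- Session-scoped only; does not affect other connections
SET work_mem = '128MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY created_at DESC;
RESET work_mem;
```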
Hash join batches > 1:
Hash (actual time=245.3..245.3 rows=500000 loops=1)
Buckets: 131072 Batches: 8 Memory Usage: 4096kB
Batches: 8 means the hash table spilled to disk 8 times. work_mem is too small for this join.
Nested Loop with high loop count — check actual time × loops for the inner node. If the inner node shows actual time=0.09 rows=1 loops=50000, total inner time is 4,500 ms.
See the EXPLAIN ANALYZE guide for full coverage of plan node types and reading the output.
auto_explain: Automatic plan capture in production
The auto_explain extension logs query plans for slow queries automatically, without modifying application code:
# postgresql.conf (requires restart)
shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000 # Log plans for queries over 1 second
auto_explain.log_analyze = true # Include actual timing (adds overhead)
auto_explain.log_buffers = true # Include buffer statistics
auto_explain.log_format = json # JSON format for visualizer tools
Warning: log_analyze = true instruments every qualifying query with per-node timing. This adds measurable overhead on high-throughput systems. Use log_timing = false to capture row counts without timing cost, or set sample_rate (0.0–1.0) to capture a fraction of qualifying queries.
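A lower-overhead variant might look like this (illustrative values; sample_rate applies on top of the duration threshold):

```ini
# postgresql.conf — reduced-overhead auto_explain (illustrative)
auto_explain.log_min_duration = 1000
auto_explain.log_analyze = true
auto_explain.log_timing = false    # Per-node row counts without per-node timers
auto_explain.sample_rate = 0.1     # Instrument ~10% of qualifying queries
```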
Autovacuum and Table Bloat
PostgreSQL's MVCC model retains old row versions ("dead tuples") after updates and deletes until autovacuum reclaims them. When autovacuum cannot keep up with write volume, dead tuples accumulate into table bloat.
How bloat causes slow queries
- Sequential scans read dead tuples alongside live rows, inflating I/O
- Indexes grow larger as they retain entries pointing to dead heap rows
- More data to read means more buffer cache pressure and longer scans
- Bloat can cause 2–10x slowdowns on queries scanning large table fractions
Checking dead tuple accumulation
SELECT
schemaname,
relname AS table_name,
n_live_tup,
n_dead_tup,
round(
n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100,
2
) AS dead_pct,
last_autovacuum,
last_autoanalyze,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
Tables with dead_pct > 10% need attention. For a more precise measurement, use the pgstattuple extension:
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Approximate (fast — no full scan)
SELECT * FROM pgstattuple_approx('orders');
-- Exact (full table scan — can be slow)
SELECT * FROM pgstattuple('orders');
Running VACUUM manually
-- Reclaim dead tuple space
VACUUM orders;
-- Reclaim dead tuples and refresh planner statistics in one pass
VACUUM ANALYZE orders;
VACUUM FULL reclaims on-disk space but acquires an exclusive lock that blocks all reads and writes. For production tables where downtime is not acceptable, use pg_repack or pg_squeeze instead.
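For example, a pg_repack invocation might look like this — the database name mydb is a placeholder, and the pg_repack extension must already be installed in that database:

```shell
# Rebuild the orders table online, holding only brief locks
pg_repack --table orders mydb
```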
Tuning autovacuum for high-churn tables
The default autovacuum_vacuum_scale_factor = 0.2 triggers vacuum when 20% of rows are dead. On a 50-million-row table, that is 10 million dead tuples before vacuum runs. Tune per-table for tables with heavy writes:
ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01, -- Trigger at 1% dead rows
autovacuum_vacuum_threshold = 1000,
autovacuum_analyze_scale_factor = 0.005,
autovacuum_vacuum_cost_delay = 2 -- ms; lower = faster vacuum (more I/O)
);
Missing and Unused Indexes
Identifying missing indexes
pg_stat_user_tables tracks cumulative scan counts. High sequential scan counts on large tables often signal missing indexes:
SELECT
schemaname,
relname AS table_name,
seq_scan,
idx_scan,
n_live_tup AS live_rows,
round(
seq_scan::numeric / NULLIF(seq_scan + idx_scan, 0) * 100,
2
) AS seq_scan_pct,
pg_size_pretty(pg_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE n_live_tup > 10000
AND seq_scan > 100
ORDER BY seq_scan * n_live_tup DESC
LIMIT 20;
Tables with high seq_scan_pct and large n_live_tup are candidates for new indexes. Cross-reference with pg_stat_statements to identify which specific queries are driving those scans.
Identifying unused indexes
Unused indexes waste disk space and add overhead to every write operation:
SELECT
schemaname,
relname AS table_name,
indexrelname AS index_name,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE 'pg_toast_%'
AND indexrelid NOT IN (
SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
)
ORDER BY pg_relation_size(indexrelid) DESC;
Before dropping any index, verify how long statistics have been accumulating — an index with idx_scan = 0 is safe to drop only if the stats cover a representative period of production traffic:
SELECT datname, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
Index types
B-tree (default): Equality, range, prefix LIKE, IS NULL, ORDER BY. Use for almost all cases.
BRIN: Stores min/max per block range. Tiny footprint. Ideal for append-only time-series tables where rows are inserted in roughly physical order. A BRIN index on a billion-row table can be under 1 MB.
CREATE INDEX idx_events_ts ON events USING BRIN(occurred_at);
GIN: Arrays, jsonb, tsvector (full-text search), trigrams (pg_trgm for LIKE '%pattern%').
-- JSONB containment
CREATE INDEX idx_events_metadata ON events USING GIN(metadata);
-- LIKE '%pattern%' (requires pg_trgm)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_name_trgm ON users USING GIN(name gin_trgm_ops);
GiST: Geometric types, PostGIS, range type exclusion constraints.
Partial indexes: Index only the rows matching a WHERE condition. Smaller, faster to maintain, and often allows the planner to use the index where a full index would not be selective enough.
-- Index only pending jobs, not completed ones
CREATE INDEX idx_jobs_pending ON jobs(created_at)
WHERE status = 'pending';
Expression indexes: Index the result of a function or expression.
-- Case-insensitive email lookup
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
Covering indexes (PostgreSQL 11+): INCLUDE adds columns to index leaf pages without making them part of the key, enabling Index Only Scans when those columns appear in the SELECT list.
CREATE INDEX idx_orders_user ON orders(user_id)
INCLUDE (status, total_amount, created_at);
Always use CONCURRENTLY in production to avoid blocking writes:
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);
Lock Contention
Lock contention causes latency spikes that are unrelated to query complexity — a fast query can take 30 seconds if it is waiting for a lock. log_lock_waits = on logs every lock wait exceeding deadlock_timeout (default: 1s) to the PostgreSQL log.
Finding blocked queries
SELECT
blocked.pid AS blocked_pid,
left(blocked.query, 100) AS blocked_query,
now() - blocked.query_start AS blocked_duration,
blocking.pid AS blocking_pid,
left(blocking.query, 100) AS blocking_query,
blocking.state AS blocking_state
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE pg_blocking_pids(blocked.pid) IS DISTINCT FROM '{}'::int[]
ORDER BY blocked_duration DESC;
pg_blocking_pids(pid) (PostgreSQL 9.6+) returns the PIDs blocking a given backend. Avoid calling it in tight monitoring loops — it acquires a lock on the lock manager.
Preventing runaway lock waits
-- For a session
SET lock_timeout = '5s';
-- Within a specific transaction
BEGIN;
SET LOCAL lock_timeout = '2s';
UPDATE orders SET status = 'shipped' WHERE id = 999;
COMMIT;
Key Configuration Settings
work_mem
The maximum memory per sort or hash operation per query. A complex query may use this budget several times simultaneously, and each parallel worker gets its own allocation.
Default is 4 MB — appropriate for small OLTP queries but too small for sorting or joining large tables.
work_mem = 32MB # Conservative default for OLTP workloads
Spills to disk appear in EXPLAIN as Sort Method: external merge Disk: or Hash Batches: N > 1. Increase work_mem for those sessions or roles:
SET work_mem = '256MB'; -- Session-level
ALTER ROLE reporting SET work_mem = '256MB'; -- Role-level
Setting it globally too high risks OOM when many connections run concurrent operations. Start conservatively and increase per-role.
effective_cache_size
Not a memory allocation — a hint to the query planner about the total memory available for caching (shared buffers + OS page cache). A correct value helps the planner favor index scans over sequential scans on large tables.
effective_cache_size = 12GB # 75% of RAM on a dedicated 16GB server
If set too low, the planner under-estimates index scan efficiency and may choose sequential scans instead.
random_page_cost
The planner's estimate of random I/O cost relative to sequential I/O. Default is 4.0, designed for spinning HDDs. For SSDs:
random_page_cost = 1.1 # Local NVMe SSD
random_page_cost = 1.5 # Network-attached SSD (EBS, cloud block storage)
Lowering it on spinning disks has the opposite failure mode: the planner over-uses indexes and pays for random I/O the hardware cannot serve quickly.
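When storage tiers differ within one cluster, the setting can also be overridden per tablespace (fast_nvme is a hypothetical tablespace name):

```sql
-- Applies only to relations stored in this tablespace
ALTER TABLESPACE fast_nvme SET (random_page_cost = 1.1);
```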
enable_seqscan (debugging only)
Never set globally. Use per-session to force the planner away from sequential scans and reveal alternative plans:
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
SET enable_seqscan = on;
If this reveals a faster plan, the real fix is stale statistics, a missing index, or misconfigured planner settings.
Common Causes and Fixes
Missing index
The most common cause. Identify via pg_stat_user_tables.seq_scan and pg_stat_statements. Add an appropriate index type and run ANALYZE afterward.
Non-sargable predicates
Wrapping a column in a function prevents index use:
-- BAD: cannot use an index on created_at
SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
-- GOOD: range predicate on the raw column uses the index
SELECT * FROM orders
WHERE created_at >= '2025-01-15' AND created_at < '2025-01-16';
When filtering on a function result is unavoidable, create an expression index on that exact expression.
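Continuing the DATE(created_at) example above — a sketch, assuming created_at is timestamp without time zone (DATE() applied to timestamptz is not immutable and cannot be indexed directly):

```sql
-- The index expression must match the query's expression exactly
CREATE INDEX idx_orders_created_date ON orders (DATE(created_at));
-- Now this predicate can use the index:
-- SELECT * FROM orders WHERE DATE(created_at) = '2025-01-15';
```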
Stale statistics
After large bulk loads or deletions, the planner's estimates may be far off. Run ANALYZE to refresh statistics:
ANALYZE orders;
For columns with skewed distributions, increase the per-column statistics target:
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
N+1 queries
Appears in pg_stat_statements as a query with very high calls and low mean_exec_time. Fix at the application layer: use JOINs, IN (...) preloading, or ORM eager loading.
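As a sketch of the SQL-level fix, the per-row lookups collapse into a single batched query:

```sql
-- N+1 shape: one round trip per customer
--   SELECT * FROM orders WHERE customer_id = 1;
--   SELECT * FROM orders WHERE customer_id = 2;
--   ... repeated N times

-- Batched replacement: one round trip for the whole set
SELECT * FROM orders WHERE customer_id = ANY(ARRAY[1, 2, 3]);
```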
Table bloat
Detected via pg_stat_user_tables.n_dead_tup. Fix by tuning autovacuum or running VACUUM ANALYZE manually. For severe bloat on large tables without maintenance windows, use pg_repack.
work_mem too small (sort/hash spills)
Detected via pg_stat_statements.temp_blks_written or Sort Method: external merge in EXPLAIN output. Increase work_mem for affected roles.
Idle-in-transaction connections
Detected via pg_stat_activity where state = 'idle in transaction'. These hold locks that block other queries. Set idle_in_transaction_session_timeout and fix the application to always close transactions promptly.
Hot standby query conflicts on replicas
On read replicas, queries are cancelled when the primary sends WAL that conflicts with data the replica is reading. Symptom: ERROR: canceling statement due to conflict with recovery.
# On the replica
max_standby_streaming_delay = 30s
hot_standby_feedback = on # Prevents many conflicts, but can increase bloat on the primary
Too many connections
At hundreds to thousands of connections, the memory and CPU overhead of idle backend processes degrades cluster-wide performance. Each PostgreSQL backend consumes 5–10 MB even when idle. The fix is a connection pooler — see PostgreSQL Connection Pooling with PgBouncer.
pgBadger for Log Analysis
pgBadger generates detailed HTML reports from PostgreSQL log files. It is useful for analyzing slow query patterns across hours or days of historical logs without querying the live database. Version 13.0 was released in December 2024.
Setup
pgBadger requires structured logs:
# postgresql.conf
log_destination = 'csvlog'
logging_collector = on
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_min_duration_statement = 200
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 250ms
Usage
# Install (Debian/Ubuntu)
sudo apt install pgbadger
# Generate HTML report from a log file
pgbadger /var/log/postgresql/postgresql-2025-01-15.log -o report.html
# Parallel processing for large logs
pgbadger --jobs 4 /var/log/postgresql/*.log.gz -o report.html
# CSV log format (recommended — cleaner parsing)
pgbadger --format csv /var/log/postgresql/postgresql-*.csv -o report.html
pgBadger reports include: slowest queries by max and total time, most-called queries, queries generating the most temporary files, lock wait events, deadlocks, and checkpoint activity.
Diagnostic Workflow
When slow queries are reported:
- Check `pg_stat_activity` for currently running slow or blocked queries and wait events.
- Query `pg_stat_statements` ordered by `total_exec_time DESC` to find the biggest load contributors since the last stats reset.
- Run `EXPLAIN (ANALYZE, BUFFERS)` on the top offending queries. Look for sequential scans on large tables, sort spills, hash join batches > 1, and row estimation errors.
- Check `pg_stat_user_tables` for dead tuple bloat (`dead_pct`) and missing index signals (`seq_scan` counts).
- Check `pg_stat_user_indexes` for unused indexes consuming write overhead.
- Review logs for lock wait events, deadlocks, and temp file creation.
- Apply fixes: add indexes, run `VACUUM ANALYZE`, tune `work_mem`, tune autovacuum, add a connection pooler.