How to Debug High CPU Usage in PostgreSQL

High CPU usage in PostgreSQL is a symptom, not a diagnosis. The CPU is busy because backends are executing work - scanning rows, sorting, hashing, evaluating expensive functions, or spinning on contended locks. The usual root causes are a small set: an expensive or badly planned query, queries that run far more often than expected, stale statistics that make the planner pick a sequential scan, work that spills to disk because work_mem is too small, lock contention, or table and index bloat that forces backends to read and skip dead tuples. This guide shows how to confirm which one applies and how to fix it.

Common Causes of High CPU

The fastest way to narrow the search is to map each cause to a single confirming check. Run the diagnostics in the next section before changing anything - guessing at the cause is how you end up raising shared_buffers for a problem that was a missing index.

Cause How to confirm Fix direction
Expensive query (large scan, sort, hash, heavy functions) pg_stat_statements shows high total_exec_time or mean_exec_time; EXPLAIN (ANALYZE, BUFFERS) shows a Seq Scan or large Sort Add an index, rewrite the query, tune the plan
High query frequency pg_stat_statements.calls is very high for a cheap query Cache results, batch calls, fix N+1 access patterns
Stale statistics pg_stat_all_tables.last_analyze / last_autoanalyze is old; plan disagrees with actual row counts ANALYZE the table; tune autovacuum
Disk spill from small work_mem EXPLAIN ANALYZE shows Sort Method: external merge Disk or hash batches Raise work_mem for the session, or reduce rows sorted
Lock contention pg_stat_activity.wait_event_type = 'Lock'; blocked PIDs in pg_locks Shorten transactions, fix lock ordering
Table or index bloat High n_dead_tup; physical size far exceeds live rows VACUUM, tune autovacuum
Unused indexes adding write CPU idx_scan near zero on large indexes Drop the index so writes update fewer structures

Expensive computation is real but narrower than people expect. Trigonometric functions, exponentiation, and cryptographic hashes (crypt, digest, repeated md5) cost far more per row than arithmetic, so a WHERE clause or join condition that evaluates one of these over millions of rows can saturate a core on its own. Disk spilling matters because PostgreSQL sorts and hash joins that exceed work_mem (default 4MB) switch to an external, disk-based algorithm that burns both I/O and CPU.

How to Diagnose High CPU

Start with what is running right now, then move to cumulative statistics. The queries below are runnable as-is on PostgreSQL 13 and later.

Check connection saturation first. A flood of active backends means more work competing for cores than the box can run in parallel:

-- Total vs active connections against max_connections
SELECT count(*) AS total,
       count(*) FILTER (WHERE state <> 'idle') AS active,
       current_setting('max_connections')::int AS max_conn,
       round(100.0 * count(*) / current_setting('max_connections')::int, 1) AS pct_used
FROM pg_stat_activity;

Find the most expensive queries cumulatively with pg_stat_statements (enable it via shared_preload_libraries). Note the column names: PostgreSQL 13 renamed total_time to total_exec_time and mean_time to mean_exec_time, splitting out separate planning-time columns:

-- Top queries by total execution time since the last stats reset
SELECT substr(query, 1, 80) AS query,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2) AS mean_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

A query with low mean_exec_time but enormous calls is a frequency problem; high mean_exec_time is an efficiency problem. Run `EXPLAIN (ANALYZE, BUFFERS)` on the worst offenders to see the plan and whether sorts spill to disk.

Find long-running statements and transactions still holding resources:

-- Active statements and open transactions, longest first
SELECT pid,
       now() - xact_start AS txn_age,
       now() - query_start AS query_age,
       state, wait_event_type, wait_event,
       substr(query, 1, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY xact_start NULLS LAST;

A wait_event_type of Lock points to contention, not CPU work - find the blocker by joining pg_locks on granted = false. An idle in transaction row that has been open for minutes pins old row versions and blocks vacuum, which then feeds bloat.

Check for bloat from dead tuples and for stale statistics in one place:

-- Dead tuples and last (auto)analyze / (auto)vacuum per table
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       last_analyze, last_autoanalyze,
       last_vacuum, last_autovacuum
FROM pg_stat_all_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC
LIMIT 20;

A high n_dead_tup relative to n_live_tup means backends scan and discard dead rows on every read - run VACUUM (or VACUUM ANALYZE) on the table. A last_analyze/last_autoanalyze that is null or weeks old means the planner is working from stale row estimates - run ANALYZE table_name. Finally, list indexes that cost write CPU without paying for it:

-- Large indexes that are never scanned
SELECT relname AS table, indexrelname AS index,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

Before trusting cumulative counters, remember they accumulate since startup or the last pg_stat_reset(). Reset and sample over a representative window if old data is masking the current pattern. An index showing zero scans on the primary may still be used on a replica, so check every node before dropping it.

Root-Cause Analysis with Pulse

Correlating a CPU spike to its cause means joining pg_stat_activity, pg_stat_statements, pg_locks, and pg_stat_all_tables by hand, usually after the spike has passed and the evidence is gone. Pulse closes that loop automatically. Its agentic SRE engine samples these views continuously, so when CPU climbs it already has the active statements, their plans, lock waits, dead-tuple counts, and statistics freshness from the moment the spike began. It traces the load to the responsible query, lock holder, or table - a sequential scan from stale statistics, an idle in transaction session blocking vacuum, or a function evaluated over millions of rows - and reports the root cause rather than the raw metric.

Pulse then proposes a concrete change: add this index, ANALYZE this table, raise work_mem for this query shape, or kill this stuck transaction. Recommendations go to a human to approve rather than being applied blindly, which keeps you in control of changes to a production database.

Preventive Measures

Most CPU incidents are slow drifts that crossed a threshold, so the goal is to catch the drift early. Keep autovacuum and autoanalyze aggressive enough for write-heavy tables - lower autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor on large, frequently updated tables so dead tuples and stale statistics never accumulate to the point of degrading plans. Set a statement_timeout and an idle_in_transaction_session_timeout so runaway queries and abandoned transactions cannot pin CPU or block vacuum indefinitely. Review pg_stat_statements regularly to catch a query whose calls or mean_exec_time is creeping up, and use a connection pooler to cap active backends instead of letting a thundering herd oversubscribe the cores.

Connecting your database to Pulse's proactive monitoring automates that vigilance: it tracks per-query CPU contribution, bloat growth, statistics age, and connection saturation, and alerts on the trend before it becomes a user-visible incident. See PostgreSQL performance tuning for the broader configuration baseline.

Frequently Asked Questions

Q: How do I find which query is using the most CPU in PostgreSQL?
A: PostgreSQL does not attribute CPU directly to a query, so use execution time as the proxy. Query pg_stat_statements ordered by total_exec_time to find the queries consuming the most processing time cumulatively, then run EXPLAIN (ANALYZE, BUFFERS) on the top ones to see why.

Q: Why does PostgreSQL CPU spike with no obvious slow query?
A: A query that is individually cheap but runs thousands of times per second adds up to high CPU. Check the calls column in pg_stat_statements, not just execution time. Lock contention and autovacuum on a bloated table can also drive CPU without any single statement looking slow.

Q: Can stale statistics cause high CPU usage in PostgreSQL?
A: Yes. The planner uses statistics from ANALYZE to estimate row counts and choose join and scan strategies. When statistics are stale, it can pick a sequential scan or nested loop over millions of rows instead of an index scan, burning CPU. Run ANALYZE on the affected table and check last_autoanalyze in pg_stat_all_tables.

Q: Does high CPU mean I need a bigger PostgreSQL instance?
A: Usually not. Adding cores masks a missing index, stale statistics, or a frequency problem without fixing it, and the cost grows with the workload. Diagnose the root cause first with pg_stat_statements and EXPLAIN ANALYZE; scale the instance only when the workload genuinely exceeds the hardware after the queries are optimized.

Q: How does work_mem relate to CPU usage?
A: When a sort or hash join needs more memory than work_mem (default 4MB), PostgreSQL switches to an external, disk-based algorithm that consumes extra CPU and I/O. If EXPLAIN ANALYZE shows Sort Method: external merge Disk or multiple hash batches, raising work_mem for that session can cut the CPU cost. Raise it per-query, not globally, since each operation can allocate it.

Q: How do I check for lock contention causing high CPU in PostgreSQL?
A: Query pg_stat_activity for rows where wait_event_type = 'Lock', then join pg_locks on granted = false to find blocked PIDs and the sessions holding the locks. Lock waits are not CPU work themselves, but the backends spinning and retrying, plus the long transactions causing the blocks, often appear together with high CPU.

Q: Will VACUUM reduce PostgreSQL CPU usage?
A: It can, indirectly. Dead tuples force every read to scan and skip rows that are no longer visible, wasting CPU. VACUUM removes them and VACUUM ANALYZE also refreshes statistics. The vacuum run itself uses CPU briefly, so schedule heavy manual vacuums off-peak and tune autovacuum to keep bloat from accumulating in the first place.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.