Debugging Slow Queries

Problem / Context

You’re on-call and a critical endpoint slowed down after a feature launch. Product dashboards time out and customers see spinners. You need a repeatable way to find the slowest queries, understand why the planner chose a bad path, and apply the smallest fix (index, rewrite, fresh stats) to restore performance.

PostgreSQL has the tooling built in: EXPLAIN shows how a statement executes, pg_stat_statements ranks queries by cumulative cost, and ANALYZE keeps the planner's statistics fresh.

Core Concept

Slow query triage (make it fast)

Goal: Find why a query is slow, fix cheap things first, verify improvement.

Implementation (step by step SQL)

1. Get the plan

EXPLAIN SELECT * FROM orders WHERE user_id = 123;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE user_id = 123;

Look for:

- Seq Scan on a large table where the filter is selective
- Big gaps between estimated rows and actual rows
- High shared read counts in the BUFFERS output (pages coming from disk, not cache)
- Sort or Hash nodes that spill to disk (e.g. Sort Method: external merge)
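As an illustration, here is what the before/after plans for the query above might look like; the cost and timing numbers are invented for the example:

```sql
-- Before: no usable index, so the planner scans the whole table.
-- Seq Scan on orders  (cost=0.00..25000.00 rows=50 width=64)
--                     (actual time=0.5..420.0 rows=48 loops=1)
--   Filter: (user_id = 123)
--   Rows Removed by Filter: 999952   -- almost every row read was discarded

-- After CREATE INDEX ON orders (user_id):
-- Index Scan using orders_user_id_idx on orders
--   (cost=0.42..8.50 rows=50 width=64)
--   (actual time=0.03..0.10 rows=48 loops=1)
--   Index Cond: (user_id = 123)
```

The "Rows Removed by Filter" line is the tell: reading a million rows to keep 48 is exactly the waste an index eliminates.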

2. Cheap fix checklist

| Symptom | Likely fix |
| --- | --- |
| Seq Scan on selective column | CREATE INDEX (col) |
| Function applied to column in WHERE | Expression index or precomputed column |
| Repeated large sort | Index on ORDER BY columns |
| Low correlation after updates | VACUUM (ANALYZE) or just ANALYZE |
| Underestimated rows | Increase statistics target (ALTER TABLE … ALTER COLUMN … SET STATISTICS n) |
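The expression-index row deserves an example. A predicate that wraps the column in a function cannot use a plain index on that column; an index on the same expression can. Table and column names here are illustrative:

```sql
-- This predicate defeats a plain index on email:
--   SELECT * FROM users WHERE lower(email) = 'a@example.com';

-- A plain index on the raw column is NOT used for the query above:
CREATE INDEX idx_users_email ON users (email);

-- An expression index matches the predicate exactly and is used:
CREATE INDEX idx_users_email_lower ON users (lower(email));
```

Alternatively, store the precomputed value (e.g. a generated column) and index that, which also documents the normalization in the schema.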

3. pg_stat_statements (top offenders)

Prerequisite: the extension must be preloaded at server start.

Example (docker-compose):

services:
    db:
        image: postgres:17.4
        ports:
            - ${DB_HOST}:${DB_PORT}:5432
        command: >-
            -c shared_preload_libraries=pg_stat_statements
            -c pg_stat_statements.track=all
            -c pg_stat_statements.max=10000
        environment:
            POSTGRES_DB: ${DB_NAME}
            POSTGRES_USER: ${DB_USER}
            POSTGRES_PASSWORD: ${DB_PASS}
        volumes:
            - postgres_data:/var/lib/postgresql/data
            - ./internal/:/tmp/postgres/

volumes:
    postgres_data:

Then enable in your database:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

PostgreSQL 13+ (total_time was split into total_plan_time and total_exec_time in version 13; the toplevel column requires 14+):

SELECT
    toplevel,
    queryid,
    calls,
    total_plan_time AS plan_ms,
    total_exec_time AS exec_ms,
    ROUND((total_exec_time / NULLIF(calls,0))::numeric, 2) AS mean_exec_ms,
    rows,
    query
FROM pg_stat_statements
WHERE toplevel IS TRUE
ORDER BY total_exec_time DESC
LIMIT 10;

Note: total_plan_time and total_exec_time (PG 13+), like total_time before them (PG 12 and earlier), are already in milliseconds. No need to multiply by 1000. The mean above is ms per call.

PostgreSQL 12 and earlier (legacy column names):

SELECT
    calls,
    total_time   AS total_ms,
    mean_time    AS mean_ms,
    rows,
    query
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Tips

- Sort by total_exec_time to find aggregate cost; sort by mean time per call to find individually slow queries.
- Reset counters with SELECT pg_stat_statements_reset(); before a load test or after a fix, so before/after numbers are clean.
- Query text is normalized (literals become $1, $2, …), so one row aggregates every invocation of the same statement shape.

4. Lock wait vs compute

If runtime is long but buffers and rows are low, it may be waiting on a lock. See Locks for diagnosing blockers and deadlocks.

SELECT pid, now() - query_start AS run, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY run DESC;
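To see who is blocking whom, pg_blocking_pids() (PostgreSQL 9.6+) maps each waiting backend to the sessions holding the locks it needs:

```sql
SELECT
    blocked.pid    AS blocked_pid,
    blocked.query  AS blocked_query,
    blocking.pid   AS blocking_pid,
    blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
```

If the blocker is an idle-in-transaction session, fixing the application's transaction handling usually beats killing the backend.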

5. Refresh statistics

Outdated stats cause bad plans.

ANALYZE orders; -- or entire DB

For large changes, autovacuum might lag.
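You can check how stale a table's statistics are before forcing an ANALYZE:

```sql
SELECT relname,
       n_mod_since_analyze,   -- rows inserted/updated/deleted since last ANALYZE
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC
LIMIT 10;
```

A large n_mod_since_analyze on a table in the slow query's plan is a strong hint that a manual ANALYZE will help immediately.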

6. Avoid SELECT *

Pulling unused large columns increases I/O:

SELECT id, user_id, status FROM orders WHERE user_id=123;

This avoids visiting TOAST for large JSON/text.
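To gauge how much of a table lives out-of-line, compare the heap alone with pg_table_size, which includes the TOAST table (plus free-space and visibility maps) but not indexes:

```sql
SELECT pg_size_pretty(pg_relation_size('orders')) AS heap_only,
       pg_size_pretty(pg_table_size('orders'))    AS heap_plus_toast;
```

A large gap between the two numbers means wide values are being TOASTed, and SELECT * is paying for them on every row.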

7. Batch backfills and updates

A long UPDATE that touches many rows holds its row locks until commit and can block other writers. PostgreSQL's UPDATE has no LIMIT clause, so select each chunk by primary key instead:

UPDATE orders
SET flag = true
WHERE id IN (
    SELECT id FROM orders
    WHERE ...                         -- your predicate
      AND flag IS DISTINCT FROM true  -- skip rows already done
    ORDER BY id
    LIMIT 1000
);
-- repeat in app/job until 0 rows are updated

(Keyset pagination on the primary key — remembering the last id processed — avoids rescanning earlier chunks.)
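As a sketch, the repeat-until-done loop can also live server-side in a DO block. PostgreSQL 11+ allows COMMIT inside a DO block as long as it is not run inside an explicit transaction; the flag column and chunk predicate below are illustrative:

```sql
DO $$
DECLARE
    n bigint;
BEGIN
    LOOP
        UPDATE orders SET flag = true
        WHERE id IN (
            SELECT id FROM orders
            WHERE flag IS DISTINCT FROM true
            ORDER BY id
            LIMIT 1000
        );
        GET DIAGNOSTICS n = ROW_COUNT;  -- rows touched this iteration
        EXIT WHEN n = 0;                -- done when nothing left to update
        COMMIT;                         -- release locks between chunks
    END LOOP;
END $$;
```

Committing between chunks keeps lock hold times short and lets autovacuum keep up with the dead tuples the update creates.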

8. Log slow queries

Set log_min_duration_statement = 500 (ms). Keep it practical; too low is noisy. Rotate and parse logs.
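One way to set this cluster-wide without editing postgresql.conf by hand:

```sql
ALTER SYSTEM SET log_min_duration_statement = '500ms';
SELECT pg_reload_conf();  -- takes effect without a restart
```

Queries running longer than the threshold are then written to the server log with their duration.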

9. EXPLAIN ANALYZE caution

It runs the query. For destructive queries, wrap in a transaction and ROLLBACK, or test on staging.
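For example, to time a destructive statement without keeping its effects (the predicate here is illustrative):

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM orders WHERE status = 'stale';  -- really executes inside the txn
ROLLBACK;                                   -- discard the deletions, keep the measurements
```

Note that the statement still fires triggers and takes locks while it runs, so do this during a quiet window even with the ROLLBACK.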

10. Index creation safety

CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

Plain CREATE INDEX holds a SHARE lock for the entire build, blocking all writes to the table (reads still work). CONCURRENTLY takes only a SHARE UPDATE EXCLUSIVE lock, so normal reads and writes continue. The trade-offs: the build is slower, and the statement cannot run inside a transaction block.
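If a CONCURRENTLY build fails or is cancelled, it leaves behind an invalid index that still adds write overhead. Find and drop it before retrying:

```sql
-- List invalid indexes left over from failed concurrent builds
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;

-- Drop it (also without blocking writers), then retry the build
DROP INDEX CONCURRENTLY idx_orders_user_id;
```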

11. Common pitfalls

- Adding indexes reflexively: every index slows writes and consumes disk; check pg_stat_user_indexes for indexes with idx_scan = 0 and drop them.
- Testing on tiny staging data: the planner picks different plans at different table sizes, so a plan that looks fine on staging can be a Seq Scan in production.
- Reading EXPLAIN without ANALYZE: cost estimates alone can be far from actual row counts and timings.

12. Verification loop

  1. Capture baseline (mean_time, buffers, runtime)
  2. Apply change (index / rewrite)
  3. Re-run EXPLAIN (ANALYZE, BUFFERS)
  4. Confirm timing + buffer reductions
  5. Drop any now-redundant index
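A minimal baseline snapshot for a single statement, keyed by its queryid from pg_stat_statements (column names assume PG 13+; the queryid below is illustrative):

```sql
-- Before the change, record the statement's counters
SELECT queryid, calls, mean_exec_time, shared_blks_read, shared_blks_hit
FROM pg_stat_statements
WHERE queryid = 1234567890;

-- Optionally reset so the after-measurement starts clean
SELECT pg_stat_statements_reset();
```

Re-run the same SELECT after the fix and compare mean_exec_time and the read/hit split; a real improvement shows up in both.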

13. Summary checklist

- Pull the top queries by total execution time from pg_stat_statements.
- EXPLAIN (ANALYZE, BUFFERS) the worst one.
- Apply the cheapest plausible fix first (index, rewrite, ANALYZE).
- Verify with before/after timing and buffer counts.

Fix the highest total cost query, then repeat. Small, consistent wins beat big refactors.

Variations and Trade‑Offs

Pitfalls

Recap (Short Summary)

Measure with EXPLAIN (ANALYZE, BUFFERS), fix the smallest cause first, and verify with before/after timing and buffers. Keep indexes purposeful and trim SELECT lists.

Optional Exercises

Summary (Cheat Sheet)

Loop: Measure → Hypothesize → Change → Verify → Document.

References