Module P-10·29 min read

The ALTER TABLE lock matrix, CREATE INDEX CONCURRENTLY, NOT VALID + VALIDATE CONSTRAINT two-phase pattern, adding NOT NULL columns safely on 500M-row tables, column rename via view aliasing, lock_timeout + retry in migration tools, and the exact migration sequences that have taken production sites down.

P-10 — Zero-Downtime Schema Migrations

The most dangerous moment in any PostgreSQL application's life is not when the database is under load — it's when someone runs an ALTER TABLE on a 500-million-row table during business hours. The table locks. Queries queue. The queue hits connection limits. The application goes down. This module is the playbook for never letting that happen.

The ALTER TABLE Lock Matrix

Every schema change acquires a lock. Most acquire AccessExclusiveLock — the most restrictive lock, which blocks ALL reads and writes for the duration. Know what locks what:

Operation	Lock Level	Blocks reads?	Blocks writes?	Safe under load?
`CREATE INDEX CONCURRENTLY`	ShareUpdateExclusiveLock	No	No	Yes
`CREATE INDEX`	ShareLock	No	Yes	No
`ADD COLUMN` (nullable, no default)	AccessExclusiveLock	Yes	Yes	Brief
`ADD COLUMN ... DEFAULT ...` (PG11+)	AccessExclusiveLock	Yes	Yes	Brief
`ADD COLUMN ... NOT NULL` (no default)	AccessExclusiveLock	Yes	Yes	No — rewrites table
`ALTER COLUMN TYPE`	AccessExclusiveLock	Yes	Yes	No — rewrites table
`ADD CONSTRAINT ... NOT VALID`	ShareUpdateExclusiveLock	No	No	Yes
`VALIDATE CONSTRAINT`	ShareUpdateExclusiveLock	No	No	Yes
`DROP COLUMN`	AccessExclusiveLock	Yes	Yes	Brief (mark only)
`RENAME COLUMN`	AccessExclusiveLock	Yes	Yes	No
`RENAME TABLE`	AccessExclusiveLock	Yes	Yes	No

"Brief" means the lock duration is proportional to the table's metadata size, not its row count. Still dangerous under high concurrency because queued queries accumulate. A 50ms metadata-only lock on a table receiving 2,000 queries/second can queue 100 requests before it releases.

CREATE INDEX CONCURRENTLY — Safe Indexing on Live Tables

Standard CREATE INDEX acquires a ShareLock that blocks all writes for the entire duration. On a 100M row table, that's 5–30 minutes of blocked writes.

CREATE INDEX CONCURRENTLY builds the index without blocking:

sql
-- WRONG — blocks all writes for minutes on large tables
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- CORRECT — builds the index without blocking writes
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

The mechanics: CONCURRENTLY makes two passes over the table. Between passes, it waits for any transactions that started before the first pass to complete. Total time is 2–3x longer than standard CREATE INDEX, but zero write blocking.

The failure case: If CREATE INDEX CONCURRENTLY fails mid-build (a unique constraint violation discovered during the build, a statement timeout, a killed connection), it leaves an INVALID index:

sql
-- Find INVALID indexes
SELECT
  schemaname,
  tablename,
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_indexes
JOIN pg_class ON relname = indexname
JOIN pg_index ON indexrelid = pg_class.oid
WHERE indisvalid = false;

An INVALID index is invisible to the query planner but still accumulates write overhead — every INSERT, UPDATE, DELETE on the table touches the INVALID index. Drop it immediately:

sql
DROP INDEX CONCURRENTLY idx_orders_user_id;   -- drop also non-blocking
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

Always check for INVALID indexes after your migration pipeline runs. Make it part of your post-migration health check.

Adding NOT NULL Columns Safely

Pre-PostgreSQL 11: Adding any column with a default value rewrites the entire table — every row gets the new column with the default value written to disk. On a 500M row table, this holds AccessExclusiveLock for hours.

PostgreSQL 11+: Adding a column with a NOT NULL DEFAULT that is a constant (not a function call, not now(), not gen_random_uuid()) uses metadata-only storage. The default is stored in the table's catalog, not in each row. The ALTER is instant regardless of table size.

sql
-- PG11+ with a constant default — instant, no table rewrite
ALTER TABLE orders ADD COLUMN processed boolean NOT NULL DEFAULT false;

-- PG11+ with a non-constant default — still rewrites the table
ALTER TABLE orders ADD COLUMN created_at timestamptz NOT NULL DEFAULT now();
-- Dangerous on large tables. now() is volatile, so PG must write the value to each row.

-- Safe pattern for volatile defaults: add nullable first, backfill, then constrain
ALTER TABLE orders ADD COLUMN created_at timestamptz;            -- instant
UPDATE orders SET created_at = NOW() WHERE created_at IS NULL;   -- batch backfill
ALTER TABLE orders ALTER COLUMN created_at SET NOT NULL;         -- then constrain

The three-step pattern acquires AccessExclusiveLock twice — but both times for only metadata operations. The expensive work (scanning and writing rows) happens in the UPDATE, which holds no table lock beyond normal row-level locks.

For very large tables, batch the UPDATE:

sql
-- Batch backfill to avoid long-running transactions and lock contention
DO $$
DECLARE
  batch_size INT := 10000;
  last_id    BIGINT := 0;
  max_id     BIGINT;
BEGIN
  SELECT MAX(id) INTO max_id FROM orders;

  WHILE last_id < max_id LOOP
    UPDATE orders
    SET created_at = NOW()
    WHERE id > last_id
      AND id <= last_id + batch_size
      AND created_at IS NULL;

    last_id := last_id + batch_size;
    PERFORM pg_sleep(0.01);  -- 10ms pause between batches to let autovacuum keep up
  END LOOP;
END $$;

The NOT VALID + VALIDATE CONSTRAINT Two-Phase Pattern

Adding a foreign key or check constraint with plain ADD CONSTRAINT scans every existing row to verify the constraint holds. On large tables, this holds ShareRowExclusiveLock (blocks concurrent inserts and updates) for minutes.

The two-phase approach:

Phase 1: Add the constraint as NOT VALID — no historical rows are checked, only rows written after this point are validated. This is fast regardless of table size.

Phase 2: Validate separately — VALIDATE CONSTRAINT checks existing rows using ShareUpdateExclusiveLock, which allows concurrent reads and most writes.

sql
-- Phase 1: add constraint without scanning historical rows (instant)
ALTER TABLE orders
  ADD CONSTRAINT fk_orders_users
  FOREIGN KEY (user_id) REFERENCES users(id)
  NOT VALID;

-- Phase 2: validate existing rows (non-blocking for reads, can run during business hours)
-- This takes time proportional to table size, but doesn't block reads
ALTER TABLE orders VALIDATE CONSTRAINT fk_orders_users;

Why does Phase 2 work? VALIDATE CONSTRAINT uses ShareUpdateExclusiveLock which is compatible with concurrent reads and DML. It does a full table scan but doesn't block your application. Run it during off-peak hours if the table is very large.

The same pattern applies to CHECK constraints:

sql
-- Add constraint without scanning existing rows
ALTER TABLE products
  ADD CONSTRAINT chk_price_positive CHECK (price > 0) NOT VALID;

-- Validate later
ALTER TABLE products VALIDATE CONSTRAINT chk_price_positive;

After Phase 2 completes, the constraint is fully enforced for all rows — both existing and future.

Renaming Columns Safely — The Expand/Contract Pattern

RENAME COLUMN acquires AccessExclusiveLock and breaks any code that references the old column name. Even within a maintenance window, you have to deploy application code atomically with the rename — impossible without downtime unless your deployment is instantaneous.

The zero-downtime rename uses the expand/contract pattern: four deployment steps across days or weeks.

Step 1: Add the new column (nullable)

sql
ALTER TABLE users ADD COLUMN full_name text;

This is instant. The old columns (first_name, last_name) are untouched. Application code continues to work.

Step 2: Write to both columns in application code (deploy)

javascript
// Your application now writes to both old and new columns
await db.query(
  'INSERT INTO users (first_name, last_name, full_name) VALUES ($1, $2, $3)',
  [firstName, lastName, `${firstName} ${lastName}`]
);

Deploy this change. All new rows now have both the old columns and the new column populated.

Step 3: Backfill the new column for existing rows

sql
UPDATE users
SET full_name = first_name || ' ' || last_name
WHERE full_name IS NULL;

Run during off-peak hours. For large tables, batch as shown earlier.

Step 4: Add NOT NULL constraint (after backfill completes)

sql
ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;

Step 5: Update application to read from new column only (deploy)

Remove all references to first_name and last_name from your read paths. Leave the write path touching all three columns until the next step.

Step 6: Remove the old columns (deploy)

sql
ALTER TABLE users DROP COLUMN first_name, DROP COLUMN last_name;

Four deployment steps. No downtime. No application-visible lock during any step except the brief metadata-only ALTER TABLE operations.

The same pattern applies to table renames. Never RENAME TABLE in production. Instead, create a view with the old name that points to the new table, migrate code to use the new name, then drop the view.

lock_timeout — Your Safety Net

Every migration that touches a large table should set lock_timeout to prevent queue accumulation:

sql
-- Fail immediately if a lock cannot be acquired within 2 seconds
-- rather than blocking indefinitely and causing query queue buildup
SET lock_timeout = '2s';

ALTER TABLE orders ADD COLUMN status text NOT NULL DEFAULT 'pending';

The queuing problem in detail: Without lock_timeout, a migration waiting for a lock will queue behind existing transactions. PostgreSQL's lock queue is FIFO — every subsequent query that tries to touch that table also queues behind the migration. A migration waiting 30 seconds for a lock can accumulate hundreds of queued connections, exhausting max_connections and bringing down the application. The migration itself may not even be blocking — it's the queue it creates.

With lock_timeout = '2s', the migration fails rather than queue. You retry during a quieter window. The application stays up.

In migration tools, enforce this as a connection-level default:

sql
-- Add to every migration connection via a beforeMigrate callback or init SQL
SET lock_timeout = '3s';
SET statement_timeout = '60s';

yaml
# Flyway: flyway.conf
flyway.initSql=SET lock_timeout='3s'; SET statement_timeout='60s';

statement_timeout is your backup: if the migration itself runs longer than expected (a table rewrite on a table you didn't know was 200GB), it kills the migration rather than holding locks indefinitely.

Column Type Changes Without Table Rewrites

ALTER COLUMN TYPE rewrites the entire table. For large tables, this is an hours-long AccessExclusiveLock. The workaround: new column, backfill, swap.

Pattern: New column + backfill + swap

sql
-- Example: changing varchar(50) to text

-- Step 1: add new column
ALTER TABLE products ADD COLUMN description_new text;

-- Step 2: backfill (batch for large tables)
UPDATE products SET description_new = description;

-- Step 3: swap names in a single transaction (two brief metadata locks)
BEGIN;
ALTER TABLE products RENAME COLUMN description TO description_old;
ALTER TABLE products RENAME COLUMN description_new TO description;
COMMIT;

-- Step 4: validate nothing broke, then drop the old column
ALTER TABLE products DROP COLUMN description_old;

The rename transaction holds AccessExclusiveLock for both renames, but since they're metadata-only operations on the catalog, the lock is held for milliseconds — not proportional to table size.

Pattern: varchar length increases are free

Increasing varchar(N) length in PostgreSQL is a metadata-only operation — no table rewrite:

sql
-- Increasing varchar length — instant, no table rewrite
ALTER TABLE users ALTER COLUMN email TYPE varchar(255);
-- was varchar(100), now varchar(255) — PostgreSQL only updates pg_attribute

Decreasing length requires a constraint check (full table scan) — use the NOT VALID pattern. Changing to a fundamentally different type (e.g., integer to text) always rewrites the table — use the new column + backfill pattern.

Migration Tool Configuration for Safety

Enforce safe practices at the tooling level so individual developers don't have to remember.

yaml
# Flyway flyway.conf
flyway.lockRetryCount=3
flyway.connectRetries=5

# Enforce timeouts on every migration connection
flyway.initSql=SET lock_timeout='3s'; SET statement_timeout='120s';

typescript
// Prisma: use pgbouncer=true in connection string for pooled environments,
// and shadow database for migration testing
// prisma/schema.prisma
datasource db {
  provider          = "postgresql"
  url               = env("DATABASE_URL")
  shadowDatabaseUrl = env("SHADOW_DATABASE_URL")
}

javascript
// node-postgres: set lock_timeout on the migration connection
const migrationClient = new Client({ connectionString: process.env.DATABASE_URL });
await migrationClient.connect();
await migrationClient.query("SET lock_timeout = '3s'");
await migrationClient.query("SET statement_timeout = '120s'");
// run migrations...

For CI/CD pipelines, test migrations against a production-sized copy of the database before applying to production. pg_dump --schema-only + restore + pg_dump --data-only of a sample is often sufficient to catch "this alter will rewrite the table" surprises.

Detecting Lock Waits During Migrations

While a migration is running, monitor pg_stat_activity in a separate session to catch lock pile-ups early:

sql
-- Are any queries waiting for a lock?
SELECT
  pid,
  now() - query_start AS wait_duration,
  state,
  wait_event_type,
  wait_event,
  left(query, 80) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock'
ORDER BY wait_duration DESC;

sql
-- What is holding the lock that others are waiting for?
SELECT
  blocking.pid AS blocking_pid,
  blocking.query AS blocking_query,
  blocked.pid  AS blocked_pid,
  blocked.query AS blocked_query,
  now() - blocked.query_start AS blocked_duration
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
  ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE blocked.wait_event_type = 'Lock';

If you see this filling up during a migration, the migration is holding a lock that other queries are queuing behind. Kill the migration and investigate before retrying.

The Migration Checklist

Before running any migration on a production table with more than 1 million rows:

[ ] Identify the lock type and duration (check the ALTER TABLE lock matrix)
[ ] Set lock_timeout and statement_timeout on the migration connection
[ ] Test on a production-sized copy first — surprises at 2am are unacceptable
[ ] Schedule for the lowest-traffic window your SLAs allow
[ ] Disable application autoscaling during migration (new instances connecting = more lock contention)
[ ] Have a tested rollback script ready before you start
[ ] Monitor pg_stat_activity in a separate session during execution
[ ] Alert on connection count spike (cl_waiting in PgBouncer, or pg_stat_activity state='idle in transaction')
[ ] For CONCURRENTLY operations: check pg_indexes for indisvalid = false after completion
[ ] For NOT VALID constraints: schedule the VALIDATE CONSTRAINT step separately

The 3am call you're trying to avoid is the one where the migration ran fine in staging (10k rows), then blocked production (500M rows) for 40 minutes while the on-call engineer frantically looked for the pg_cancel_backend command. This checklist exists because that call happens.

Summary

Pattern	When to use	Lock held
`CREATE INDEX CONCURRENTLY`	Adding any index to a live table	ShareUpdateExclusiveLock (non-blocking)
`ADD COLUMN ... DEFAULT constant` (PG11+)	Adding column with a fixed default	AccessExclusiveLock, milliseconds
Add nullable + backfill + SET NOT NULL	Adding column with volatile default	Metadata locks only; backfill is row-level
`NOT VALID` + `VALIDATE CONSTRAINT`	Adding FK or check constraint	Phase 1: instant; Phase 2: ShareUpdateExclusiveLock
Expand/contract (new col + backfill + rename + drop)	Renaming columns or tables	Metadata locks only
New col + backfill + swap	Changing column type	Metadata locks only
`lock_timeout`	Every migration touching large tables	Prevents queue accumulation

Zero-downtime migrations are not about clever hacks — they are about understanding exactly what PostgreSQL locks when, and structuring your changes to never hold an exclusive lock longer than milliseconds.

Next: Module 17 — Logical Replication and CDC Pipelines →

PreviousModule P-9: External Services, Caching Layers, and Deployment Next Module 0: Before You Proceed: Mental Model Reset