pg_dbmigrator 0.2.0

PostgreSQL database migration tool and library (offline dump/restore + online logical replication)

A Rust library and CLI for migrating PostgreSQL databases between two endpoints. It offers a one-shot dump/restore for cold moves, and an online path that keeps PostgreSQL's built-in logical replication apply worker pulling from the source so the operator can cut over with near-zero downtime.

The online path issues CREATE SUBSCRIPTION on the target, attached to a slot the migrator created with EXPORT_SNAPSHOT before pg_dump ran, so the dump and the subsequent stream share a single consistent snapshot.
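
For reference, the mechanism the migrator drives internally resembles the following replication-protocol exchange on the source (a sketch; slot and snapshot names are illustrative):

-- On a replication connection (replication=database in the connection string):
CREATE_REPLICATION_SLOT pg_dbmigrator_slot LOGICAL pgoutput EXPORT_SNAPSHOT;
-- The command returns a snapshot name (e.g. '00000003-0000001B-1') that
-- pg_dump can consume via --snapshot, so the dump sees exactly the state
-- from which the slot will later start streaming.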

Modes

Mode     Behaviour
offline  Run pg_dump against the source, then pg_restore against the target. One-shot copy.
online   Create a logical replication slot with EXPORT_SNAPSHOT, take a snapshot-consistent pg_dump, pg_restore it, then start a streaming WAL apply from the slot's start LSN until the operator triggers cutover.

Online migration phases

Validate → SourceVacuum → PrepareSnapshot → Dump → Restore → Analyze → StreamApply → (Lag heartbeat …) → CaughtUp → Cutover → SourceCleanup → Complete
  • Validate pre-flights the source (wal_level = 'logical', max_replication_slots > 0, max_wal_senders > 0) and ensures the required publication exists — auto-creating it if missing (see Publication lifecycle below).
  • SourceVacuum runs VACUUM ANALYZE on the source to reclaim dead tuples and refresh planner statistics before the dump. Skip with --skip-source-vacuum.
  • PrepareSnapshot creates the replication slot first; START_REPLICATION is deferred until after the dump completes, so the exported snapshot remains valid for the dump.
  • Analyze runs ANALYZE on the target after restore so the query planner has fresh statistics for the first application queries. Skip with --skip-analyze.
  • During StreamApply the library polls pg_current_wal_flush_lsn() on the source every --cutover-poll-secs and emits a Lag progress event with lag_bytes / source_lsn / received_lsn / applied_lsn. This is the signal the operator watches to decide when to cut over (a manual equivalent query is sketched just after this list).
  • When the lag drops to or below --lag-threshold-bytes, a one-shot CaughtUp event is emitted (“ready for cutover”).
  • SourceCleanup (after cutover) drops auto-created publications and replication slots on the source — see the next section.
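
As a manual cross-check of the heartbeat, the source-side lag can be approximated from the slot's confirmed flush position (slot name shown is the --slot-name default; assumes the slot already exists):

SELECT pg_wal_lsn_diff(pg_current_wal_flush_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'pg_dbmigrator_slot';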

Install / build

# Install the CLI from crates.io. Produces a binary called `pg_dbmigrator`.
cargo install pg_dbmigrator
pg_dbmigrator --help
pg_dbmigrator --mode offline --source '<source-uri>' --target '<target-uri>' --jobs 4

CLI

Offline

pg_dbmigrator \
    --mode offline \
    --source 'postgres://user:pw@src.example/db' \
    --target 'postgres://user:pw@dst.example/db' \
    --jobs 4 \
    --drop-target-first

By default, VACUUM ANALYZE runs on the source before pg_dump and ANALYZE runs on the target after pg_restore. Disable with --skip-source-vacuum / --skip-analyze if you manage maintenance externally.

Online

On the source, before starting:

ALTER SYSTEM SET wal_level = 'logical';   -- requires restart

The publication is auto-created by the migrator if it does not already exist (default FOR ALL TABLES). If you prefer to create it manually — e.g. to publish only specific tables — run:

CREATE PUBLICATION pg_dbmigrator_pub FOR TABLE my_schema.t1, my_schema.t2;

Then pass --no-auto-create-publication so the migrator uses the existing publication and does not attempt to create or drop it.

pg_dbmigrator \
    --mode online \
    --source 'postgres://user:pw@src/db' \
    --target 'postgres://user:pw@dst/db' \
    --slot-name pg_dbmigrator_slot \
    --publication pg_dbmigrator_pub \
    --subscription-name pg_dbmigrator_sub \
    --jobs 4 \
    --lag-threshold-bytes 8192 \
    --cutover-poll-secs 5

Before the dump runs, the migrator pre-flights the source: wal_level = 'logical', max_replication_slots > 0, and max_wal_senders > 0. A misconfigured source fails fast with a clear error instead of stalling later inside CREATE_REPLICATION_SLOT.
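
To spot-check these settings yourself before starting, run the following against the source:

SHOW wal_level;              -- must be 'logical'
SHOW max_replication_slots;  -- must be greater than 0
SHOW max_wal_senders;        -- must be greater than 0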

At cutover, the migrator runs setval(...) on every sequence in the included schemas so the target picks up where the source left off — otherwise the first INSERT after cutover would collide with rows the subscription replicated. Disable with --no-sequence-sync if your target role lacks privileges for setval on those sequences.
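
Conceptually, the sequence sync boils down to steps like these (sequence name and value are illustrative; the migrator automates this for every sequence in the included schemas):

-- On the source: read each sequence's current position.
SELECT schemaname, sequencename, last_value
FROM pg_sequences
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');

-- On the target: advance the matching sequence to the value read above.
SELECT setval('my_schema.t1_id_seq', 42, true);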

Filtering

Use --exclude-schema and --exclude-table to omit large or transient objects from the dump. Both flags accept multiple values.

pg_dbmigrator --mode offline \
    --source ... --target ... \
    --exclude-schema audit \
    --exclude-table public.large_log

Publication / replication resource lifecycle

The migrator fully manages the lifecycle of the replication resources it creates, so the operator does not need to run manual cleanup SQL after a successful cutover.

Resource                    Created by                         Cleaned up at cutover                Override
Publication on source       Auto-created if missing (default)  Dropped only if it was auto-created  --no-auto-create-publication
Replication slot on source  Always created by the migrator     Dropped by default                   --keep-slot
Subscription on target      Always created by the migrator     Dropped by default                   --keep-subscription

Auto-create publication: By default, the migrator checks whether the named publication (--publication, default pg_dbmigrator_pub) exists on the source. If it does not, the migrator creates it as FOR ALL TABLES (or scoped to --table / --schema if specified). Auto-created publications are tracked and dropped on the source after a successful cutover. Pre-existing publications are never dropped.
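
To inspect which tables a publication actually covers (auto-created or pre-existing):

SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'pg_dbmigrator_pub';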

Slot cleanup: After cutover, the replication slot on the source is no longer needed. By default the migrator drops it. Pass --keep-slot if you need to inspect the slot post-migration or if another consumer shares it.

All cleanup steps are best-effort — failures are logged as warnings but do not abort the migration.
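
To confirm cleanup afterwards, these catalog queries should return no rows for the migrator-created names (defaults shown):

-- On the source:
SELECT slot_name FROM pg_replication_slots WHERE slot_name = 'pg_dbmigrator_slot';
SELECT pubname FROM pg_publication WHERE pubname = 'pg_dbmigrator_pub';

-- On the target:
SELECT subname FROM pg_subscription WHERE subname = 'pg_dbmigrator_sub';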

Cutover (online mode)

Cutover is driven by SIGINT (Ctrl+C). The CLI prints a periodic Lag heartbeat after the dump completes, so the operator has a continuous bytes-behind read-out:

INFO stage=Lag replication lag 4096 bytes (source LSN …, received LSN …, applied LSN …)
INFO stage=Lag replication lag 1024 bytes (…)
INFO stage=CaughtUp target caught up with source (lag 512 bytes) — ready for cutover

When the operator is satisfied with the lag, they press Ctrl+C once:

  • The signal handler calls CutoverHandle::request().
  • The streaming apply loop notices the request on its next poll, flushes the last LSN feedback to the source, emits a Cutover event, and returns.
  • The migrator syncs sequences, cleans up replication resources (publication, slot, subscription), and returns with MigrationOutcome::cutover_triggered() == true. The process exits cleanly. Application traffic can now be switched to the target.
  • A second Ctrl+C is treated as an abort (escape hatch — only use it if the graceful path is stuck).

Cutover is always operator-driven; --lag-threshold-bytes is purely advisory and only controls when the one-shot CaughtUp “ready for cutover” event fires.

For online migrations, hold on to migrator.cutover_handle() and call request() from your own signal handler / RPC endpoint when the operator is ready to cut over. See examples/online_migration for a complete program that wires Ctrl+C to the cutover handle.

Performance defaults

The CLI ships with sensible defaults tuned for migration speed. Override only when you have a specific reason.

Default                  Flag to override        Effect
Split-section restore    --no-split-sections     Bulk COPY without index maintenance, then rebuild indexes in parallel. 30-60% faster on index-heavy schemas.
lz4:1 dump compression   --dump-compress <spec>  Negligible CPU, 3-5x smaller archive. Use zstd:3 for a better ratio, none to disable.
--no-sync on dump        --keep-sync             Skip fsync on transient dump files.
--no-comments            (not exposed)           Omit COMMENT ON statements from the dump.
--no-security-labels     (not exposed)           Omit SECURITY LABEL statements from the dump.
--no-publications        --keep-publications     Don't dump publication definitions to the target.
--no-subscriptions       --keep-subscriptions    Don't dump subscription definitions to the target.
Auto-detect --jobs       --jobs N                Clamps to [1, 8] based on host CPU count.
Pre-dump VACUUM ANALYZE  --skip-source-vacuum    Clean heap pages + fresh stats before the dump.
Post-restore ANALYZE     --skip-analyze          Fresh planner stats on the target immediately after restore.

Benchmark

See BENCHMARK.md for migration performance results across 10 GB to 200 GB datasets (PG 16 to PG 18, 8 parallel jobs, zstd compression).

Known limitations

  • The streaming apply loop binds replicated values as text and lets the server cast them. Custom column-level transforms are not supported.
  • DDL changes are not migrated automatically — refresh the publication and restart the migration if the schema changes during the run.
  • Extensions whose internal state cannot be re-created on the target (Azure-reserved extensions, pg_cron metadata, ...) may cause pg_restore to exit with code 1. Pass --allow-restore-errors to treat that as a non-fatal warning when user data was restored successfully.
  • Sequence sync at cutover requires the target role to have permission to call setval() on the destination sequences (setval() needs the UPDATE privilege on each sequence). Per-sequence failures are logged but do not abort cutover — inspect the warnings and re-run setval manually if needed, or pre-grant the privilege before the migration, as shown below.
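
A minimal pre-grant, assuming the migrator connects to the target as a role named migration_role (role and schema names are illustrative):

-- setval() requires the UPDATE privilege on each sequence:
GRANT UPDATE ON ALL SEQUENCES IN SCHEMA my_schema TO migration_role;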