pgrx-bench
pgrx-bench is the runtime support crate behind #[pg_bench] benchmarks in pgrx
extensions.
You'll generally use it together with cargo pgrx bench during extension development. It
exists to make in-process benchmarking of extension code feel natural from Rust while still
respecting the way Postgres extensions are built, installed, loaded, and exercised.
It is not a standalone benchmark runner. The command-line workflow lives in
cargo-pgrx. This crate provides the author-facing runtime API
that benchmark functions use once they are executing inside Postgres.
What This Is
pgrx-bench is intended for in-process benchmarks of extension code.
That means:
- the benchmark function itself runs inside a Postgres backend
- the timed code can call normal Rust functions,
Spi, and other Postgres internals directly - benchmark definitions live next to the rest of your extension code
cargo pgrx benchinstalls a bench-enabled build of your extension into a managed benchmark database and records the results there
This makes it a good fit for questions like:
- "What is the cost of this Rust helper when called from an extension?"
- "How expensive is this SPI-heavy path?"
- "Did this change make the extension function faster or slower compared to the previous run?"
- "How does this benchmark behave across repeated runs on the same managed Postgres instance?"
What This Is Not
pgrx-bench is not:
- a client-side SQL driver benchmark tool
- a load-testing or concurrency-testing framework
- a replacement for regression tests or
#[pg_test] - a tool for benchmarking top-level transaction commit cost
The important design point is that benchmarking happens inside the backend, not by repeatedly issuing SQL over a client connection and timing that from the outside.
If what you need is correctness testing, use cargo pgrx test or cargo pgrx regress.
If what you need is in-process performance measurement of extension code, use cargo pgrx bench.
When To Use It
Reach for #[pg_bench] when:
- you want stable, repeatable microbenchmarks or small mesobenchmarks for extension code
- the code under test is meaningful only inside Postgres
- you want results stored in a database and compared against earlier benchmark runs
- you want to exercise SPI or other backend services directly from the timed code
It is especially useful while iterating on:
- expression evaluation
- parser/planner support logic
- tuple conversion and datum conversion paths
- SPI-heavy data manipulation
- indexing or operator support functions
How It Fits With cargo-pgrx
pgrx-bench is intentionally paired with cargo pgrx bench.
cargo pgrx bench does the operational work:
- enables the
pg_benchCargo feature for the build - builds the extension in
--releasemode by default - installs the extension artifacts using the normal
cargo pgrx installpath - refreshes the extension in a managed database named
$extname_benches - discovers benchmark wrappers
- executes each benchmark in-process
- rolls back benchmark side effects after each benchmark invocation
- persists normalized results, environment metadata, and settings into a runner-owned schema
pgrx-bench handles the runtime side inside the backend:
BencherBatchSizeblack_box- transaction-aware execution support
- serialization of benchmark results back to the CLI runner
Most users should think of this crate as "the runtime API used by #[pg_bench]".
Adding Benchmarks To An Extension
Benchmarks are gated behind a dedicated Cargo feature so they are not included in a normal extension build unless you explicitly ask for them.
Typical Cargo.toml shape:
[]
= ["pgrx/pg16", "pgrx-tests/pg16"]
= []
= ["dep:pgrx-bench"]
[]
= "=0.17.0"
= { = "=0.17.0", = true }
[]
= "=0.17.0"
Notice that pgrx-bench does not need its own pg16/pg17/etc. passthrough feature wiring.
The active Postgres version is already selected on pgrx, and pgrx-bench itself stays free of
pgrx-* dependencies and Postgres-version feature flags. The proc-macro-generated wrapper code in
the extension crate owns the Postgres-specific boundary work instead.
Benchmark functions live under a feature-gated #[pg_schema] mod benches.
use *;
pg_module_magic!;
Benchmark Function Shape
A #[pg_bench] function must accept &mut pgrx_bench::Bencher and must register exactly one
timing loop using either:
b.iter(...)b.iter_batched(...)
The crate exports:
BencherBatchSizeblack_box
The goal is to feel familiar if you have used Criterion before, while still giving pgrx control
over transaction handling and result capture.
black_box
pgrx-bench re-exports Criterion's black_box directly:
use black_box;
Its job is to make a value opaque to the optimizer so the compiler is less able to constant-fold, inline away, or otherwise "solve" part of the benchmark ahead of time.
That matters because microbenchmarks are especially vulnerable to unrealistic optimization. If the compiler can prove that an input is always the same, or that the result is irrelevant, it may remove work that you actually meant to measure.
Typical usage looks like this:
Here there are two useful boundaries:
black_box(input)makes the benchmark input less predictable to the optimizerblack_box(...)around the result makes it less likely that the whole call can be discarded as unused
In practice, the input side is the one you should think about first. If your benchmark uses literals, precomputed fixtures, or values that are obviously constant, wrapping those inputs is often the difference between benchmarking your code and benchmarking what the optimizer managed to precompute.
Why use it if Criterion already does?
Criterion itself already uses black_box internally in a number of timing loops, especially around
routine outputs. That helps, but it is not a substitute for author intent at the benchmark boundary.
You should still use black_box in your benchmark code when:
- the function input is a constant or easy-to-reason-about value
- the computation could be folded because the compiler can "see through" the closure
- you want to make it explicit which values belong to the code under test
Relationship to Criterion's version
This is not a pgrx-specific implementation. pgrx-bench re-exports Criterion's own helper so
the behavior tracks the Criterion version bundled with pgrx-bench.
In the current Criterion version used here, black_box:
- uses
test::black_boxwhen Criterion is built with itsreal_blackboxfeature - otherwise falls back to a stable-compatible implementation based on volatile reads
That fallback is useful and widely used, but it is not magic. It can add some overhead, and it may not block every possible optimization on every compiler and platform. The goal is to make the benchmark substantially more realistic, not to guarantee perfect optimizer isolation.
A few rules of thumb
- Use
black_boxaround benchmark inputs when those inputs are compile-time constants or reused fixtures. - Usually keep
black_boxclose to the code under test, not wrapped around an entire setup phase. - Do not use
black_boxas a substitute for good benchmark structure. Setup still belongs insetup = ...oriter_batched(...), not inside the timed code unless that setup is part of what you intentionally want to measure. - If a benchmark becomes materially different when you remove
black_box, the version withblack_boxis usually the more trustworthy one.
Setup Functions
Benchmarks may optionally declare a one-time setup function:
Setup functions:
- are plain Rust functions
- take no arguments
- run once per benchmark invocation
- run before warmup and before measured samples
- are never included in the timed portion of the benchmark
Use a setup function for benchmark-level fixture preparation. If you need fresh per-iteration or
per-batch input, that belongs in iter_batched(...).
Transaction Semantics
Transaction handling is one of the main reasons pgrx-bench exists as a dedicated runtime.
Supported transaction modes are:
sharedsubtransaction_per_batchsubtransaction_per_iteration
shared
This is the default.
The benchmark invocation shares one outer transaction for setup, warmup, and measured execution. It has the lowest overhead and is often the right choice for pure Rust or read-heavy benchmarks.
subtransaction_per_batch
Each measured batch runs inside an internal Postgres subtransaction. This is usually the best choice for mutating SPI benchmarks because it limits state leakage within the benchmark while keeping overhead manageable.
subtransaction_per_iteration
Each individual iteration runs inside its own internal subtransaction. This offers the cleanest isolation, but also the highest overhead.
Outer rollback behavior
Regardless of the transaction mode chosen for the timed loop, cargo pgrx bench rolls back the
benchmark invocation after collecting its result payload. This means benchmark fixture changes do
not accumulate in the persistent benchmark database.
The benchmark history remains durable because the CLI persists the result rows in a separate runner-owned transaction after the benchmark transaction has been rolled back.
Running Benchmarks
From your extension directory:
$ cargo pgrx bench
Or with an explicit Postgres version:
$ cargo pgrx bench pg16
To run a single benchmark:
$ cargo pgrx bench pg16 bench_normalize_phrase
To list discovered benchmarks and their settings:
$ cargo pgrx bench --list
bench_normalize_phrase [transaction=shared, setup=none, sample_size=100, warm_up=3000ms, measurement=5000ms, nresamples=100000, noise_threshold=0.01, significance_level=0.05]
bench_spi_insert_batch [transaction=subtransaction_per_batch, setup=prepare_spi_fixture, sample_size=50, warm_up=3000ms, measurement=2000ms, nresamples=100000, noise_threshold=0.01, significance_level=0.05]
During execution, the CLI prints which benchmark is currently running and the effective settings for that benchmark.
Result Storage
Benchmark results are stored in a managed database named $extname_benches by default.
Inside that database:
- your extension is installed and refreshed for each bench run
- historical benchmark data is stored in a runner-owned schema named
pgrx_bench - results are grouped by benchmark invocation
- environment metadata and
pg_settingssnapshots are recorded alongside the results
This gives you a durable history of runs while still allowing SQL-visible changes in the extension
to be picked up by a DROP EXTENSION / CREATE EXTENSION refresh cycle.
Comparison Groups
Each cargo pgrx bench invocation creates a named run group.
By default:
- the group name is auto-generated
- the comparison target is the most recent prior completed group
You can override either side explicitly:
$ cargo pgrx bench --group-name before-rewrite
$ cargo pgrx bench --group-name after-rewrite --compare-group before-rewrite
The chosen comparison target is stored with the run group so later SQL analysis and CLI summaries use the same baseline.
What Gets Queried Later
The benchmark runner creates SQL tables and views for historical analysis. In particular, the
pgrx_bench schema is intended to make it easy to answer questions like:
- "What did this benchmark do on the last run?"
- "What changed relative to a named baseline?"
- "What were the Postgres settings during that run?"
- "Which group was this run compared against?"
pgrx-bench itself does not expose the SQL API. That lifecycle and persistence behavior belongs to
cargo-pgrx, but it is part of the overall feature this crate participates in.
What To Expect In Normal Builds
Because benchmarks are feature-gated behind pg_bench, they are normally absent from:
- ordinary
cargo pgrx run - ordinary
cargo pgrx install - ordinary
cargo pgrx package
If you explicitly enable pg_bench for those commands, cargo-pgrx will warn, but it will still
let you proceed. That can be useful for interactive exploration, but it is usually not what you
want for packaged releases.