samkhya-postgres
PostgreSQL extension adapter for samkhya — portable cardinality correction for embedded analytical engines.
This crate ships a pgrx-based PostgreSQL extension that exposes samkhya's portable sketch and Puffin sidecar primitives to SQL.
Build modes
The crate has two build modes, controlled by the pg_extension Cargo
feature:
- Default (pg_extension off): empty rlib. Compiles in seconds without PostgreSQL development headers. This is what cargo check --workspace builds in CI. Suitable for downstream crates that want samkhya-postgres in their dependency graph without forcing every consumer to install libpq-dev.
- pg_extension on: pulls in pgrx and compiles the real PostgreSQL loadable module. Requires PostgreSQL development headers (postgresql-server-dev-NN on Debian/Ubuntu, postgresql-devel on RHEL/Fedora) and the matching cargo-pgrx toolchain.
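The two build modes can be pictured as a pair of mutually exclusive cfg-gated modules. The sketch below is illustrative only: the module name extension, the constant BUILD_MODE, and the function build_mode are hypothetical and are not taken from the crate's source, whose actual layout may differ.

```rust
// Hypothetical lib.rs layout; all names here are illustrative assumptions.

/// Compiled only when the `pg_extension` Cargo feature is enabled.
/// In a real crate, the pgrx imports and SQL-visible functions would live here.
#[cfg(feature = "pg_extension")]
mod extension {
    pub const BUILD_MODE: &str = "pg_extension on: PostgreSQL loadable module";
}

/// Compiled when the feature is off: nothing that needs PostgreSQL headers.
#[cfg(not(feature = "pg_extension"))]
mod extension {
    pub const BUILD_MODE: &str = "pg_extension off: empty rlib";
}

/// Reports which of the two modes this build was compiled in.
pub fn build_mode() -> &'static str {
    extension::BUILD_MODE
}
```

Because exactly one of the two modules exists in any given build, a default-feature build never sees the pgrx-dependent code, which is what would let it compile without PostgreSQL headers.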
Quickstart (extension build)
# 1. Install the pgrx CLI.
# 2. One-time pgrx init — downloads and builds the supported PG
# majors into ~/.pgrx (skip the ones you don't need with --pg16
# etc.). Pick the version you plan to develop against.
# 3. From the workspace root, run the extension inside an ephemeral
# PostgreSQL 16 (or 17) instance with psql attached.
# Inside psql:
# CREATE EXTENSION samkhya_postgres;
SQL surface
samkhya_hll_count(input anyarray) -> bigint
Builds a samkhya HllSketch (precision 14) from the input array and
returns its estimated distinct-element count.
SELECT samkhya_hll_count(array_agg(id)) FROM foo;
SELECT samkhya_hll_count(ARRAY[1, 2, 2, 3, 3, 3]::int[]);
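To make the estimate concrete, here is a minimal textbook HyperLogLog at precision 14 (16,384 registers). It is a sketch under stated assumptions, not samkhya's HllSketch: the type ToyHll, the helper hll_count, the choice of std's DefaultHasher, and the plain linear-counting correction for small inputs are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const PRECISION: u32 = 14; // matches the precision quoted above
const NUM_REGISTERS: usize = 1 << PRECISION; // 16,384 registers

/// Hypothetical toy sketch; not the samkhya `HllSketch` type.
pub struct ToyHll {
    registers: Vec<u8>,
}

impl ToyHll {
    pub fn new() -> Self {
        Self { registers: vec![0; NUM_REGISTERS] }
    }

    pub fn insert<T: Hash>(&mut self, item: &T) {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let hash = hasher.finish();
        // The top PRECISION bits select a register.
        let index = (hash >> (64 - PRECISION)) as usize;
        // Rank = 1-based position of the first set bit in the remaining bits.
        let rest = hash << PRECISION;
        let rank = (rest.leading_zeros().min(64 - PRECISION) + 1) as u8;
        if rank > self.registers[index] {
            self.registers[index] = rank;
        }
    }

    pub fn estimate(&self) -> u64 {
        let m = NUM_REGISTERS as f64;
        let alpha = 0.7213 / (1.0 + 1.079 / m);
        let sum: f64 = self.registers.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let raw = alpha * m * m / sum;
        let zeros = self.registers.iter().filter(|&&r| r == 0).count();
        if raw <= 2.5 * m && zeros > 0 {
            // Small-range correction: linear counting over empty registers.
            (m * (m / zeros as f64).ln()).round() as u64
        } else {
            raw.round() as u64
        }
    }
}

/// Rough analogue of the SQL function for a slice of integers.
pub fn hll_count(values: &[i64]) -> u64 {
    let mut sketch = ToyHll::new();
    for v in values {
        sketch.insert(v);
    }
    sketch.estimate()
}
```

Duplicates hash to the same register and rank, so repeated values do not move the estimate. At this precision the standard error of HyperLogLog is roughly 1.04 / sqrt(16384), or about 0.8%.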
samkhya_puffin_inspect(path text) -> jsonb
Opens an Iceberg Puffin
sidecar file on the server filesystem and returns per-blob metadata
(kind, fields, offset, length, compression_codec).
SELECT samkhya_puffin_inspect('/srv/iceberg/sketches/orders.puffin');
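For orientation, the per-blob metadata lives in a JSON footer at the end of a Puffin file. The sketch below locates that footer payload in an in-memory byte slice, following the footer layout described in the Iceberg Puffin spec (magic, payload, 4-byte little-endian payload size, 4 bytes of flags, magic). It is not the extension's implementation: the names footer_payload and FooterError are hypothetical, compressed footers are simply rejected, and the JSON is returned undecoded.

```rust
/// Puffin magic bytes, found at the start of the file and around the footer.
const MAGIC: [u8; 4] = *b"PFA1";

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum FooterError {
    TooShort,
    BadMagic,
    Compressed,
    BadLength,
    NotUtf8,
}

/// Returns the uncompressed footer JSON of a Puffin file held in memory.
pub fn footer_payload(file: &[u8]) -> Result<&str, FooterError> {
    let n = file.len();
    // Header magic (4) + footer magic, size, flags, magic (16), empty payload.
    if n < 20 {
        return Err(FooterError::TooShort);
    }
    if file[..4] != MAGIC || file[n - 4..] != MAGIC {
        return Err(FooterError::BadMagic);
    }
    // Lowest bit of the first flag byte marks a compressed footer payload.
    if file[n - 8] & 1 != 0 {
        return Err(FooterError::Compressed);
    }
    // The spec stores the size as a 4-byte little-endian integer; it is read
    // as unsigned here and bounds-checked below.
    let size = u32::from_le_bytes([file[n - 12], file[n - 11], file[n - 10], file[n - 9]]) as usize;
    let payload_end = n - 12;
    let payload_start = payload_end.checked_sub(size).ok_or(FooterError::BadLength)?;
    let magic_start = payload_start.checked_sub(4).ok_or(FooterError::BadLength)?;
    if magic_start < 4 {
        // The footer would overlap the header magic.
        return Err(FooterError::BadLength);
    }
    if file[magic_start..payload_start] != MAGIC {
        return Err(FooterError::BadMagic);
    }
    std::str::from_utf8(&file[payload_start..payload_end]).map_err(|_| FooterError::NotUtf8)
}
```

A JSON parser would then decode the returned payload, whose blobs array carries the entries from which per-blob metadata can be reported.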
Output shape:
Scope
This is the v1.0 scaffold. It establishes the extension surface,
crate layout, and pgrx feature gating. The operator-side cardinality
hook (replacing get_relation_info_hook so the planner picks up
samkhya's corrected row estimates without per-query SQL changes) is a
v1.1 target.
Testing
# Default-feature check (no PG headers required).
# Extension-side unit tests (requires cargo-pgrx).
License
Apache-2.0, inherited from the workspace.