quant-governor
Governance policy routing for governed compression.
quant-governor decides which compression codec to use for a given
content payload, given the content type, size, accuracy requirements,
and the caller's policy. The decision is deterministic, receipted,
and auditable. Every routing decision can produce a
DegradationReceipt (when fidelity is reduced) or an
ExactFallbackReceipt (when the policy had to fall back to raw).
This is the policy layer for the RecursiveIntell compression stack
— every call to turbo-quant, fib-quant, or gpu-backend is
routed through quant-governor first.
Why a governance layer?
In a governed system, "we compressed your data" is not acceptable. The caller needs to know:
- which codec was used (Raw, Q8, Q4, Turbo, Fib, Polar)
- whether the policy had to fall back to a lower-fidelity codec
- what the estimated accuracy impact is
- whether the original was retained as an exact fallback
quant-governor answers all four with a single CodecDecision and
attached receipts. The decision is deterministic — same
GovernanceRequest + same GovernancePolicy always yields the
same CodecDecision. This makes the routing auditable and
replayable, which is the whole point of having a policy layer.
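The replay property can be illustrated with a small sketch. The field names (payload_bytes, critical, q4_threshold_bytes, reason), the AuditRecord type, and the routing rule below are assumptions made for illustration; they are not the crate's actual API. The only point is that a pure evaluate() lets an auditor re-run a logged request and confirm the logged decision.

```rust
// Sketch only: all fields and the routing rule are hypothetical.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Codec {
    Raw,
    Q8,
    Q4,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GovernanceRequest {
    pub payload_bytes: u64, // hypothetical field
    pub critical: bool,     // hypothetical field
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GovernancePolicy {
    pub q4_threshold_bytes: u64, // hypothetical field
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CodecDecision {
    pub codec: Codec,
    pub reason: &'static str, // hypothetical field
}

/// Pure function: reads no clock, randomness, or global state,
/// so equal inputs always produce an equal decision.
pub fn evaluate(req: &GovernanceRequest, policy: &GovernancePolicy) -> CodecDecision {
    if req.critical {
        CodecDecision { codec: Codec::Raw, reason: "critical payloads are never degraded" }
    } else if req.payload_bytes > policy.q4_threshold_bytes {
        CodecDecision { codec: Codec::Q4, reason: "payload exceeds size threshold" }
    } else {
        CodecDecision { codec: Codec::Q8, reason: "default lossy codec" }
    }
}

/// Hypothetical audit-log entry: the inputs plus the decision that was made.
pub struct AuditRecord {
    pub request: GovernanceRequest,
    pub policy: GovernancePolicy,
    pub decision: CodecDecision,
}

/// Replay: re-run the evaluation and check it against the logged decision.
pub fn replay_matches(record: &AuditRecord) -> bool {
    evaluate(&record.request, &record.policy) == record.decision
}
```

A record whose logged decision differs from the re-evaluated one fails replay_matches, which is the signal an audit layer would act on.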
Quick Start
use ;
Run it: cargo run --example basic_policy.
Codec profiles
| Profile | Description | Use case |
|---|---|---|
| Raw | Uncompressed | Critical accuracy, small payloads |
| Q8 | 8-bit scalar quantization | Balanced storage / accuracy |
| Q4 | 4-bit scalar quantization | Storage constrained |
| Turbo | TurboQuant (symmetric, reconstructs the original) | Low latency, batched |
| Fib | Fibonacci-weighted (symmetric, reconstructs the original) | Precision sensitive |
| Polar | Polar-only (asymmetric, only score_inner_product / score_l2 work) | Search-only |
Turbo, Fib, and Polar are the workhorses for embedding
compression. Turbo is the lowest-latency option (no decode
needed — scoring is done in the encoded space). Fib is the
most accurate. Polar is the search-only option (you can never
decode, you can only score).
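The symmetric/asymmetric split matters to callers that need their vectors back. As a minimal sketch, assuming a hypothetical supports_decode() helper and a hypothetical filter_candidates() function (neither is confirmed to exist in the crate), the distinction could be modeled like this:

```rust
// Sketch only: the enum mirrors the profile table; the methods are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecProfile {
    Raw,
    Q8,
    Q4,
    Turbo,
    Fib,
    Polar,
}

impl CodecProfile {
    /// Polar is asymmetric: encoded vectors can be scored but never decoded.
    /// Every other profile can hand a payload back to the caller.
    pub fn supports_decode(self) -> bool {
        !matches!(self, CodecProfile::Polar)
    }
}

/// Drops candidates that cannot serve a caller who needs to decode later.
/// When `needs_decode` is false, all candidates pass through unchanged.
pub fn filter_candidates(candidates: &[CodecProfile], needs_decode: bool) -> Vec<CodecProfile> {
    candidates
        .iter()
        .copied()
        .filter(|c| !needs_decode || c.supports_decode())
        .collect()
}
```

A search-only workload would call filter_candidates with needs_decode set to false and keep Polar in the running; a workload that reads vectors back would exclude it.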
Admissibility classes
AdmissibilityClass is the caller's stated need, in priority order:
| Class | Behavior |
|---|---|
| Critical | Always routes to Raw. No degradation. |
| HighPriority | Routes to Q8 minimum. Degradation budget ≤ 0.05. |
| Standard (default) | Routes to Q8 or Q4 depending on size. Degradation budget ≤ 0.20. |
| Compressible | Routes to Q4 or Turbo. Degradation budget ≤ 0.40. |
| BestEffort | Routes to Turbo or Polar. Degradation budget ≤ 0.80. |
The default GovernancePolicy is conservative — it routes
Standard to Q8 and bumps to Q4 only for payloads > 1 MB.
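The table and the default policy can be read as a small match expression. The sketch below is an illustration, not the crate's implementation: route(), degradation_budget(), and Q4_THRESHOLD_BYTES are hypothetical names, "1 MB" is assumed to mean 1,000,000 bytes, and the size-based tie-break for Compressible and BestEffort is an assumption (the text only states it for Standard).

```rust
// Sketch only: names, the decimal reading of "1 MB", and two tie-breaks are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Raw,
    Q8,
    Q4,
    Turbo,
    Polar,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AdmissibilityClass {
    Critical,
    HighPriority,
    #[default]
    Standard,
    Compressible,
    BestEffort,
}

impl AdmissibilityClass {
    /// Upper bound on tolerated degradation, taken from the table above.
    pub fn degradation_budget(self) -> f64 {
        match self {
            AdmissibilityClass::Critical => 0.0,
            AdmissibilityClass::HighPriority => 0.05,
            AdmissibilityClass::Standard => 0.20,
            AdmissibilityClass::Compressible => 0.40,
            AdmissibilityClass::BestEffort => 0.80,
        }
    }
}

/// Assumed byte value for the "> 1 MB" threshold.
pub const Q4_THRESHOLD_BYTES: u64 = 1_000_000;

pub fn route(class: AdmissibilityClass, payload_bytes: u64) -> Codec {
    let large = payload_bytes > Q4_THRESHOLD_BYTES;
    match class {
        AdmissibilityClass::Critical => Codec::Raw,
        AdmissibilityClass::HighPriority => Codec::Q8,
        // Stated in the text: Q8 by default, Q4 only above the threshold.
        AdmissibilityClass::Standard => if large { Codec::Q4 } else { Codec::Q8 },
        // Assumption: the same size rule picks between the two listed codecs.
        AdmissibilityClass::Compressible => if large { Codec::Turbo } else { Codec::Q4 },
        AdmissibilityClass::BestEffort => if large { Codec::Polar } else { Codec::Turbo },
    }
}
```

Because route() depends only on its two arguments, it has the same determinism property the crate promises for evaluate().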
Receipts
quant-governor emits two types of receipts that downstream
audit and rollback layers can act on:
ExactFallbackReceipt
Emitted when the policy selects a lossy codec but the caller
supplied admissibility = Critical (or the accuracy_requirement
was above the policy's raw_min_accuracy). The receipt carries:
- raw_digest (Digest): BLAKE3 hash of the original payload.
- compressed_digest (Digest): BLAKE3 hash of the compressed payload.
- fallback_retention (bool): whether the system kept the compressed form alongside the raw.
- fallback_reason (Option<String>): the policy reason string.
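As a rough sketch, the receipt and its trigger condition might look like the following. The four field names come from the list above; the Digest alias (a 32-byte array, matching BLAKE3's default output length), needs_exact_fallback(), and build_fallback_receipt() are assumptions for illustration. The digests are passed in by the caller so the sketch needs no hashing crate.

```rust
// Sketch only: Digest alias and both helper functions are hypothetical.

/// Assumed representation: BLAKE3's default output is 32 bytes.
pub type Digest = [u8; 32];

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExactFallbackReceipt {
    pub raw_digest: Digest,
    pub compressed_digest: Digest,
    pub fallback_retention: bool,
    pub fallback_reason: Option<String>,
}

/// A fallback is needed when a lossy codec was selected but the caller is
/// Critical, or asked for more accuracy than the policy's raw_min_accuracy.
pub fn needs_exact_fallback(
    selected_lossy: bool,
    is_critical: bool,
    accuracy_requirement: f64,
    raw_min_accuracy: f64,
) -> bool {
    selected_lossy && (is_critical || accuracy_requirement > raw_min_accuracy)
}

/// Assembles a receipt from digests computed elsewhere (e.g. by a BLAKE3 hasher).
pub fn build_fallback_receipt(
    raw_digest: Digest,
    compressed_digest: Digest,
    fallback_retention: bool,
    reason: &str,
) -> ExactFallbackReceipt {
    ExactFallbackReceipt {
        raw_digest,
        compressed_digest,
        fallback_retention,
        fallback_reason: Some(reason.to_string()),
    }
}
```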
DegradationReceipt
Emitted when the policy downgraded from a higher-fidelity codec
to a lower-fidelity one (e.g. Q8 → Q4 because the payload
exceeded a size threshold). The receipt carries:
- degradation_type (DegradationType)
- degraded_by (f64): the actual degradation applied (0.0 to 1.0).
- bytes_saved (u64): estimated bytes saved by the degradation.
- accuracy_impact (f64): estimated accuracy impact (0.0 to 1.0, higher = more impact).
These receipts are the audit handle. A system that does governed compression should persist them and expose them to the caller.
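A persisted DegradationReceipt is only useful if its numeric fields stay within their documented ranges. The sketch below shows one way a store might validate a receipt before writing it. The field names follow the list above; the DegradationType variants (named after the degradation triggers mentioned in this README), the validating constructor, and within_budget() are assumptions, not the crate's API.

```rust
// Sketch only: variants, constructor, and within_budget are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DegradationType {
    SizeThreshold,
    AccuracyRequirement,
    LatencyBudget,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DegradationReceipt {
    pub degradation_type: DegradationType,
    pub degraded_by: f64,
    pub bytes_saved: u64,
    pub accuracy_impact: f64,
}

impl DegradationReceipt {
    /// Rejects values outside 0.0..=1.0 (NaN is rejected too, since it
    /// compares false against both bounds).
    pub fn new(
        degradation_type: DegradationType,
        degraded_by: f64,
        bytes_saved: u64,
        accuracy_impact: f64,
    ) -> Result<Self, String> {
        let in_unit = |v: f64| (0.0..=1.0).contains(&v);
        if !in_unit(degraded_by) {
            return Err(format!("degraded_by out of range: {degraded_by}"));
        }
        if !in_unit(accuracy_impact) {
            return Err(format!("accuracy_impact out of range: {accuracy_impact}"));
        }
        Ok(Self { degradation_type, degraded_by, bytes_saved, accuracy_impact })
    }

    /// True when the applied degradation fits the caller's budget.
    pub fn within_budget(&self, budget: f64) -> bool {
        self.degraded_by <= budget
    }
}
```

An audit layer could compare within_budget() against the degradation budget of the caller's admissibility class to flag decisions that need review.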
What this crate is not
- Not a codec. It does not encode or decode anything. It tells you which codec to use and emits a receipt. The actual codec implementation lives in turbo-quant, fib-quant, gpu-backend, or quant-codec-core.
- Not a transport. It does not move bytes. It produces a routing decision as data.
- Not a scheduling system. It does not decide when to compress; it decides which codec.
Test coverage
- 22 integration tests in tests/ covering:
  - Policy evaluation determinism (same input → same output, 1000×)
  - Profile digest stability (BLAKE3, byte-exact across runs)
  - Admissibility class transitions (Critical → Raw, etc.)
  - Fallback receipt generation (lossy → raw when admissibility is Critical)
  - Degradation triggers (size threshold crossing, accuracy requirement increase, latency budget shrink)
  - ContentType routing (Text → Q4, Embedding → Turbo, etc.)
- 3 doctests in the lib.rs doc-comment.
- 1 example: examples/basic_policy.rs.
- cargo test clean, cargo clippy --all-targets -- -D warnings clean.
Performance
evaluate() is O(1) — it inspects the request and policy, applies
the routing rules, and produces a decision. No loops over the
payload (the payload isn't even passed in). A million
evaluate() calls on a single core complete in under 1 second.
The receipt construction is also O(1) — BLAKE3 of fixed-size struct fields, no I/O.
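One way to check a throughput claim like this locally is a timing loop around a stand-in routing function. The sketch below does not call the real crate: evaluate() here is a simplified placeholder and time_evaluations() is a hypothetical helper. std::hint::black_box keeps the optimizer from removing the loop, and the returned count exists only so the work has an observable result.

```rust
// Sketch only: a stand-in evaluate(), not the crate's function.
use std::hint::black_box;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Raw,
    Q8,
    Q4,
}

/// Constant-time routing: a couple of comparisons, no payload access.
pub fn evaluate(critical: bool, payload_bytes: u64) -> Codec {
    if critical {
        Codec::Raw
    } else if payload_bytes > 1_000_000 {
        Codec::Q4
    } else {
        Codec::Q8
    }
}

/// Runs `n` evaluations over synthetic sizes (i * 4096 bytes) and returns
/// the elapsed wall-clock time plus how many decisions were Q4.
pub fn time_evaluations(n: u64) -> (Duration, u64) {
    let start = Instant::now();
    let mut q4_count = 0u64;
    for i in 0..n {
        let size = black_box(i * 4096);
        if evaluate(black_box(false), size) == Codec::Q4 {
            q4_count += 1;
        }
    }
    (start.elapsed(), q4_count)
}
```

Calling time_evaluations(1_000_000) in a release build and printing the Duration gives a number to compare against the figure above; results will vary with hardware and build settings.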
MSRV
Rust 1.75 (2021 edition). Stable-only features.
Dependencies
- serde, thiserror, chrono, sha2, blake3.
- tokio (dev only, for the integration tests).
- lib profile specifies [profile.release] with opt-level = 3, lto = true, codegen-units = 1. This is called out in the Cargo.toml because the routing is on the hot path for the RecursiveIntell memory stack.
Zero platform-specific code. Zero FFI. Zero async runtime in the production crate (the routing function is sync).
License
MIT. See LICENSE-MIT for the full text.
Changelog
See CHANGELOG.md for the release history.
Where it's used
quant-governor is the policy layer for:
- turbo-quant: the experimental vector compression sidecar (TurboQuant, PolarQuant, QJL).
- semantic-memory: every projection import routes through evaluate() to decide which sidecar to use.
- scr-runtime-compression: the cross-runtime compression scheduler uses quant-governor to pick the right codec per payload.
Any system that does governed compression (legal/medical/audit
contexts where "we compressed your data" must be a typed,
receipted operation) can adopt quant-governor directly.