quant-governor 0.1.0

Governance policy routing for governed compression — codec selection with admissibility classes and degradation receipts

quant-governor

Governance policy routing for governed compression.

quant-governor decides which compression codec to use for a given content payload, given the content type, size, accuracy requirements, and the caller's policy. The decision is deterministic, receipted, and auditable. Every routing decision can produce a DegradationReceipt (when fidelity is reduced) or an ExactFallbackReceipt (when the policy had to fall back to raw).

This is the policy layer for the RecursiveIntell compression stack — every call to turbo-quant, fib-quant, or gpu-backend is routed through quant-governor first.

Why a governance layer?

In a governed system, "we compressed your data" is not acceptable. The caller needs to know:

  • which codec was used (Raw, Q8, Q4, Turbo, Fib, Polar)
  • whether the policy had to fall back to a lower-fidelity codec
  • what the estimated accuracy impact is
  • whether the original was retained as an exact fallback

quant-governor answers all four with a single CodecDecision and attached receipts. The decision is deterministic — same GovernanceRequest + same GovernancePolicy always yields the same CodecDecision. This makes the routing auditable and replayable, which is the whole point of having a policy layer.
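The replay property can be sketched as follows. This is a minimal illustration, not the crate's implementation: Request, Policy, Decision, AuditRecord, and the local evaluate function are hypothetical stand-ins for GovernanceRequest, GovernancePolicy, CodecDecision, and quant_governor::evaluate, and the 1 MiB threshold is an assumed value. The point is only that a pure routing function lets an auditor re-run a stored request and compare the result with the stored decision.

```rust
// Hypothetical stand-ins; the real quant_governor types carry more fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Q8,
    Q4,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Request {
    size_bytes: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Policy {
    q4_threshold_bytes: u64, // assumed knob, for illustration only
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Decision {
    codec: Codec,
}

/// Pure function: no clock, no randomness, no global state, so the same
/// inputs always produce the same decision.
fn evaluate(request: &Request, policy: &Policy) -> Decision {
    let codec = if request.size_bytes > policy.q4_threshold_bytes {
        Codec::Q4
    } else {
        Codec::Q8
    };
    Decision { codec }
}

/// What an audit log would need to store to make a decision replayable.
#[derive(Debug, Clone)]
struct AuditRecord {
    request: Request,
    policy: Policy,
    decision: Decision,
}

/// Re-run the stored inputs and check they reproduce the stored decision.
fn replay_matches(record: &AuditRecord) -> bool {
    evaluate(&record.request, &record.policy) == record.decision
}

fn main() {
    let policy = Policy { q4_threshold_bytes: 1_048_576 };
    let request = Request { size_bytes: 4_096 };
    let decision = evaluate(&request, &policy);
    let record = AuditRecord { request, policy, decision };
    assert!(replay_matches(&record));

    // A record whose decision was altered after the fact fails the replay.
    let mut tampered = record.clone();
    tampered.decision.codec = Codec::Q4;
    assert!(!replay_matches(&tampered));
    println!("replay check passed for {:?}", record.decision);
}
```

Storing the policy next to the request matters here: replaying against a newer policy could legitimately produce a different decision.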

Quick Start

use quant_governor::{
    AdmissibilityClass, ContentType, evaluate, GovernancePolicy, GovernanceRequest,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a request describing the payload.
    let request = GovernanceRequest {
        content_type: ContentType::Embedding,
        size_bytes: 4_096,
        accuracy_requirement: 0.95,
        latency_tolerance_ms: 50,
        admissibility: AdmissibilityClass::Standard,
        policy_id: Some("production-v1".into()),
    };

    // Evaluate against a policy.
    let policy = GovernancePolicy::default();
    let decision = evaluate(request, &policy)?;

    // The decision carries the selected codec, the degradation
    // budget, and a receipt.
    println!("Selected codec: {:?}", decision.codec);
    println!("Degradation budget: {}", decision.degradation_budget);
    println!("Receipt: {:?}", decision.receipt);
    Ok(())
}

Run it: cargo run --example basic_policy.

Codec profiles

| Profile | Description | Use case |
|---|---|---|
| Raw | Uncompressed | Critical accuracy, small payloads |
| Q8 | 8-bit scalar quantization | Balanced storage / accuracy |
| Q4 | 4-bit scalar quantization | Storage constrained |
| Turbo | TurboQuant (symmetric, reconstructs the original) | Low latency, batched |
| Fib | Fibonacci-weighted (symmetric, reconstructs the original) | Precision sensitive |
| Polar | Polar-only (asymmetric, only score_inner_product / score_l2 work) | Search-only |

Turbo, Fib, and Polar are the workhorses for embedding compression. Turbo is the lowest-latency option (no decode needed — scoring is done in the encoded space). Fib is the most accurate. Polar is the search-only option (you can never decode, you can only score).
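The decode/score distinction above can be captured as a capability check. The sketch below is illustrative only: CodecProfile, supports_decode, and check_decode_requirement are hypothetical names, not confirmed parts of the quant_governor API, and treating Q8 and Q4 as decodable is an assumption (the table only states it explicitly for Turbo and Fib).

```rust
// Hypothetical mirror of the six profiles in the table above; the real
// quant_governor type may be named or shaped differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CodecProfile {
    Raw,
    Q8,
    Q4,
    Turbo,
    Fib,
    Polar,
}

impl CodecProfile {
    /// True when the encoded form can be turned back into a vector.
    /// Polar is score-only per the table; Q8/Q4 decodability is assumed.
    fn supports_decode(self) -> bool {
        !matches!(self, CodecProfile::Polar)
    }
}

/// Reject a codec choice when the caller will need to decode later.
fn check_decode_requirement(
    codec: CodecProfile,
    needs_decode: bool,
) -> Result<CodecProfile, String> {
    if needs_decode && !codec.supports_decode() {
        Err(format!("{codec:?} is score-only and cannot be decoded"))
    } else {
        Ok(codec)
    }
}

fn main() {
    let all = [
        CodecProfile::Raw,
        CodecProfile::Q8,
        CodecProfile::Q4,
        CodecProfile::Turbo,
        CodecProfile::Fib,
        CodecProfile::Polar,
    ];
    for codec in all {
        println!("{codec:?}: decode supported = {}", codec.supports_decode());
    }
    // A search-only workload can accept Polar; one that decodes cannot.
    assert!(check_decode_requirement(CodecProfile::Polar, false).is_ok());
    assert!(check_decode_requirement(CodecProfile::Polar, true).is_err());
}
```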

Admissibility classes

AdmissibilityClass is the caller's stated need, in priority order:

| Class | Behavior |
|---|---|
| Critical | Always routes to Raw. No degradation. |
| HighPriority | Routes to Q8 minimum. Degradation budget ≤ 0.05. |
| Standard (default) | Routes to Q8 or Q4 depending on size. Degradation budget ≤ 0.20. |
| Compressible | Routes to Q4 or Turbo. Degradation budget ≤ 0.40. |
| BestEffort | Routes to Turbo or Polar. Degradation budget ≤ 0.80. |

The default GovernancePolicy is conservative — it routes Standard to Q8 and bumps to Q4 only for payloads > 1 MB.
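As a rough illustration of how the class table constrains a decision, the sketch below encodes the budgets and the codecs each class may route to, then checks a proposed codec against them. Class, Codec, max_degradation_budget, candidate_codecs, and admits are hypothetical names for this example. Reading "Q8 minimum" as "Raw or Q8" and "No degradation" as a budget of 0.0 are assumptions, and the real policy may apply further rules.

```rust
// Hypothetical stand-ins for AdmissibilityClass and the codec enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Class {
    Critical,
    HighPriority,
    Standard,
    Compressible,
    BestEffort,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Raw,
    Q8,
    Q4,
    Turbo,
    Polar,
}

/// Upper bound on degradation per class, taken from the table above.
fn max_degradation_budget(class: Class) -> f64 {
    match class {
        Class::Critical => 0.0,
        Class::HighPriority => 0.05,
        Class::Standard => 0.20,
        Class::Compressible => 0.40,
        Class::BestEffort => 0.80,
    }
}

/// Codecs each class may route to, taken from the table above.
/// "Q8 minimum" is read here as "Raw or Q8" (an assumption).
fn candidate_codecs(class: Class) -> &'static [Codec] {
    match class {
        Class::Critical => &[Codec::Raw],
        Class::HighPriority => &[Codec::Raw, Codec::Q8],
        Class::Standard => &[Codec::Q8, Codec::Q4],
        Class::Compressible => &[Codec::Q4, Codec::Turbo],
        Class::BestEffort => &[Codec::Turbo, Codec::Polar],
    }
}

/// A codec is admissible when the class allows it and the estimated
/// degradation stays within the class budget.
fn admits(class: Class, codec: Codec, estimated_degradation: f64) -> bool {
    candidate_codecs(class).contains(&codec)
        && estimated_degradation <= max_degradation_budget(class)
}

fn main() {
    let classes = [
        Class::Critical,
        Class::HighPriority,
        Class::Standard,
        Class::Compressible,
        Class::BestEffort,
    ];
    for class in classes {
        println!(
            "{class:?}: codecs {:?}, budget <= {}",
            candidate_codecs(class),
            max_degradation_budget(class)
        );
    }
    assert!(admits(Class::Standard, Codec::Q8, 0.10));
    assert!(!admits(Class::Critical, Codec::Q8, 0.0));
}
```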

Receipts

quant-governor emits two types of receipts that downstream audit and rollback layers can act on:

ExactFallbackReceipt

Emitted when the policy would have selected a lossy codec but had to fall back to Raw, either because the caller supplied admissibility = Critical or because accuracy_requirement was above the policy's raw_min_accuracy. The receipt carries:

  • raw_digest: Digest — BLAKE3 hash of the original payload.
  • compressed_digest: Digest — BLAKE3 hash of the compressed payload.
  • fallback_retention: bool — whether the system kept the compressed form alongside the raw.
  • fallback_reason: Option<String> — the policy reason string.

DegradationReceipt

Emitted when the policy downgraded from a higher-fidelity codec to a lower-fidelity one (e.g. Q8 → Q4 because the payload exceeded a size threshold). The receipt carries:

  • degradation_type: DegradationType
  • degraded_by: f64 — the actual degradation applied (0.0 to 1.0).
  • bytes_saved: u64 — estimated bytes saved by the degradation.
  • accuracy_impact: f64 — estimated accuracy impact (0.0 to 1.0, higher = more impact).

These receipts are the audit handle. A system that does governed compression should persist them and expose them to the caller.
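One way to persist receipts is an append-only log that checks the documented field ranges before accepting an entry. The sketch below is an assumption-laden illustration: the struct mirrors the DegradationReceipt fields listed above, but the DegradationType variants are hypothetical (borrowed from the degradation triggers named under test coverage), ReceiptLog and validate are invented for this example, and the numbers in main are made up.

```rust
/// Hypothetical trigger kinds; the real DegradationType variants are not
/// shown in this README.
#[allow(dead_code)] // not every variant is constructed in this sketch
#[derive(Debug, Clone, Copy, PartialEq)]
enum DegradationType {
    SizeThreshold,
    AccuracyRequirement,
    LatencyBudget,
}

/// Mirrors the four receipt fields listed above.
#[derive(Debug, Clone, PartialEq)]
struct DegradationReceipt {
    degradation_type: DegradationType,
    degraded_by: f64,
    bytes_saved: u64,
    accuracy_impact: f64,
}

impl DegradationReceipt {
    /// Check that both fractional fields lie in the documented 0.0..=1.0
    /// range. NaN fails the check because it is not contained in any range.
    fn validate(&self) -> Result<(), String> {
        let fields = [
            ("degraded_by", self.degraded_by),
            ("accuracy_impact", self.accuracy_impact),
        ];
        for (name, value) in fields {
            if !(0.0..=1.0).contains(&value) {
                return Err(format!("{name} = {value} is outside 0.0..=1.0"));
            }
        }
        Ok(())
    }
}

/// Append-only in-memory log; a real system would write to durable storage.
#[derive(Default)]
struct ReceiptLog {
    entries: Vec<DegradationReceipt>,
}

impl ReceiptLog {
    /// Validate, append, and return the index of the stored receipt.
    fn record(&mut self, receipt: DegradationReceipt) -> Result<usize, String> {
        receipt.validate()?;
        self.entries.push(receipt);
        Ok(self.entries.len() - 1)
    }
}

fn main() {
    let mut log = ReceiptLog::default();
    // Illustrative values only, e.g. a Q8 to Q4 downgrade on a large payload.
    let receipt = DegradationReceipt {
        degradation_type: DegradationType::SizeThreshold,
        degraded_by: 0.12,
        bytes_saved: 1_048_576,
        accuracy_impact: 0.03,
    };
    let index = log.record(receipt).expect("receipt is within range");
    println!("stored receipt #{index}: {:?}", log.entries[index]);
}
```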

What this crate is not

  • Not a codec. It does not encode or decode anything. It tells you which codec to use and emits a receipt. The actual codec implementation lives in turbo-quant, fib-quant, gpu-backend, or quant-codec-core.
  • Not a transport. It does not move bytes. It produces a routing decision as data.
  • Not a scheduling system. It does not decide when to compress; it decides which codec.

Test coverage

  • 22 integration tests in tests/ covering:
    • Policy evaluation determinism (same input → same output, 1000×)
    • Profile digest stability (BLAKE3, byte-exact across runs)
    • Admissibility class transitions (Critical → Raw, etc.)
    • Fallback receipt generation (lossy → raw when admissibility is Critical)
    • Degradation triggers (size threshold crossing, accuracy requirement increase, latency budget shrink)
    • ContentType routing (Text → Q4, Embedding → Turbo, etc.)
  • 3 doctests in the lib.rs doc-comment.
  • 1 example: examples/basic_policy.rs.
  • cargo test clean, cargo clippy --all-targets -- -D warnings clean.

Performance

evaluate() is O(1): it inspects the request and policy, applies the routing rules, and produces a decision. It never loops over the payload (the payload isn't even passed in). A million evaluate() calls on a single core complete in under 1 second.

The receipt construction is also O(1) — BLAKE3 of fixed-size struct fields, no I/O.
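A timing harness along these lines is one way to check the throughput claim on your own hardware. It is a sketch only: the evaluate function here is a hypothetical threshold rule standing in for quant_governor::evaluate, so the measured time says nothing about the real crate until the real call is swapped in. std::hint::black_box keeps the optimizer from removing the loop.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    Q8,
    Q4,
}

/// Hypothetical stand-in for the real routing call: constant-time work
/// on request metadata only, with no access to the payload bytes.
fn evaluate(size_bytes: u64, q4_threshold_bytes: u64) -> Codec {
    if size_bytes > q4_threshold_bytes {
        Codec::Q4
    } else {
        Codec::Q8
    }
}

/// Run `calls` evaluations and return the elapsed time together with the
/// number of Q4 decisions, so the work has an observable result.
fn time_evaluations(calls: u64) -> (Duration, u64) {
    let start = Instant::now();
    let mut q4_count = 0u64;
    for i in 0..calls {
        // Vary the size per call; black_box hides the values from the
        // optimizer so the loop is not folded away.
        let size = black_box(i * 4_096);
        if black_box(evaluate(size, 1_048_576)) == Codec::Q4 {
            q4_count += 1;
        }
    }
    (start.elapsed(), q4_count)
}

fn main() {
    let (elapsed, q4_count) = time_evaluations(1_000_000);
    println!("1,000,000 calls took {elapsed:?} ({q4_count} routed to Q4)");
}
```

Build with cargo run --release when measuring; debug builds are not representative of hot-path cost.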

MSRV

Rust 1.75 (2021 edition). Stable-only features.

Dependencies

  • serde, thiserror, chrono, sha2, blake3.
  • tokio (dev only, for the integration tests).
  • The Cargo.toml sets [profile.release] with opt-level = 3, lto = true, and codegen-units = 1. The file calls this out explicitly because routing is on the hot path for the RecursiveIntell memory stack.

Zero platform-specific code. Zero FFI. Zero async runtime in the production crate (the routing function is sync).

License

MIT. See LICENSE-MIT for the full text.

Changelog

See CHANGELOG.md for the release history.

Where it's used

quant-governor is the policy layer for:

  • turbo-quant — the experimental vector compression sidecar (TurboQuant, PolarQuant, QJL).
  • semantic-memory — every projection import routes through evaluate() to decide which sidecar to use.
  • scr-runtime-compression — the cross-runtime compression scheduler uses quant-governor to pick the right codec per payload.

Any system that does governed compression (legal/medical/audit contexts where "we compressed your data" must be a typed, receipted operation) can adopt quant-governor directly.