fib-quant
fib-quant is an experimental Rust implementation of the core radial-angular vector quantization path described in the FibQuant paper:
Namyoon Lee and Yongjune Kim, "FibQuant: Universal Vector Quantization for Random-Access KV-Cache Compression", arXiv:2605.11478.
This crate is meant for research, conformance experiments, and integration prototypes. It does not claim production KV-cache serving readiness, paper benchmark reproduction, fused GPU kernel support, or superiority over other KV-cache quantization systems.
What Is Implemented
- profile-validated vector normalization and fixed-rate block quantization;
- deterministic stored rotations for reproducible encode/decode receipts;
- spherical-Beta source sampling and radial quantile construction;
- Fibonacci spiral, Fibonacci sphere, and Roberts-Kronecker direction generation;
- deterministic Lloyd-Max refinement with non-worsening fallback;
- fixed-width bit packing for code indices;
- fail-closed digests and compression receipts;
- corruption, profile, schema, and payload validation tests;
- an optional, default-off kv feature with typed KV-cache contracts and a CPU reference encode/decode path.
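To make the "fixed-width bit packing for code indices" item concrete, here is a minimal sketch of the general technique. It is not taken from the crate: the function names `pack_indices` and `unpack_indices`, the LSB-first bit order, and the `Option`-based error handling are all assumptions made for illustration.

```rust
// Illustrative sketch only: these names and the LSB-first layout are
// hypothetical and are not the fib-quant API.

/// Pack each index into exactly `bits` bits, least-significant bit first.
/// Returns `None` (fails closed) if `bits` is out of range or an index
/// does not fit in `bits` bits.
pub fn pack_indices(indices: &[u32], bits: u32) -> Option<Vec<u8>> {
    if bits == 0 || bits > 32 {
        return None;
    }
    let total_bits = indices.len().checked_mul(bits as usize)?;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &idx in indices {
        // Reject indices that need more than `bits` bits.
        if bits < 32 && (idx >> bits) != 0 {
            return None;
        }
        for b in 0..bits {
            if (idx >> b) & 1 == 1 {
                out[bit_pos / 8] |= 1u8 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    Some(out)
}

/// Inverse of `pack_indices`. Returns `None` if the payload length does
/// not match the byte count implied by `count` and `bits`.
pub fn unpack_indices(bytes: &[u8], bits: u32, count: usize) -> Option<Vec<u32>> {
    if bits == 0 || bits > 32 {
        return None;
    }
    let total_bits = count.checked_mul(bits as usize)?;
    if bytes.len() != (total_bits + 7) / 8 {
        return None;
    }
    let mut out = Vec::with_capacity(count);
    let mut bit_pos = 0usize;
    for _ in 0..count {
        let mut idx = 0u32;
        for b in 0..bits {
            let bit = (bytes[bit_pos / 8] >> (bit_pos % 8)) & 1;
            idx |= (bit as u32) << b;
            bit_pos += 1;
        }
        out.push(idx);
    }
    Some(out)
}
```

With a fixed width, the payload size depends only on the index count and the bit width, so a decoder can reject a payload of the wrong length before reading any index. The crate's actual bit order and error types may differ.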
What Is Not Claimed
This release does not claim:
- production KV-cache compressor readiness;
- model-level quality validation;
- local reproduction of the FibQuant paper's perplexity, memory, or throughput numbers;
- vLLM, FlashInfer, TensorRT-LLM, Hugging Face, or CUDA integration;
- fused attention-kernel decompression;
- default-on compression in any parent project.
Benchmark results from the arXiv paper should be cited as paper results unless this repository contains local benchmark receipts with enough metadata to reproduce them.
Install
[dependencies]
fib-quant = "0.1.0-alpha.1"
The KV-cache reference contracts are experimental and default-off:
[dependencies]
fib-quant = { version = "0.1.0-alpha.1", features = ["kv"] }
Minimal Example
use ;
Release Posture
0.1.0-alpha.1 is an alpha research release. The public API is intentionally narrow and validation-heavy. Profiles reject unsupported dimensions, rates, methods, training sample counts, schema markers, norm formats, and source modes before allocation-heavy paths run.
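The validate-before-allocate pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: `ProfileSketch`, `ProfileError`, `allocate_codebook`, and every numeric bound are hypothetical stand-ins, and only three of the listed checks are shown.

```rust
// Illustrative sketch only: the type names, fields, and bounds below are
// hypothetical and do not reflect the real fib-quant profile type.

#[derive(Debug, PartialEq)]
pub enum ProfileError {
    UnsupportedDimension(usize),
    UnsupportedRate(u32),
    TooFewTrainingSamples(usize),
}

#[derive(Debug, Clone)]
pub struct ProfileSketch {
    /// Vector dimension of each quantized block.
    pub dim: usize,
    /// Bits per code index; the codebook holds 2^index_bits entries.
    pub index_bits: u32,
    /// Number of samples used to train the codebook.
    pub training_samples: usize,
}

impl ProfileSketch {
    /// Cheap checks that run before any large buffer is allocated.
    pub fn validate(&self) -> Result<(), ProfileError> {
        // Assumed bounds, chosen only for this example.
        if self.dim < 2 || self.dim > 1024 {
            return Err(ProfileError::UnsupportedDimension(self.dim));
        }
        if self.index_bits == 0 || self.index_bits > 12 {
            return Err(ProfileError::UnsupportedRate(self.index_bits));
        }
        // Safe shift: index_bits was bounded above.
        let codebook_len = 1usize << self.index_bits;
        if self.training_samples < codebook_len {
            return Err(ProfileError::TooFewTrainingSamples(self.training_samples));
        }
        Ok(())
    }
}

/// Allocate a zeroed codebook only after the profile has been accepted.
pub fn allocate_codebook(profile: &ProfileSketch) -> Result<Vec<f32>, ProfileError> {
    profile.validate()?;
    let len = (1usize << profile.index_bits) * profile.dim;
    Ok(vec![0.0; len])
}
```

Because `validate` runs first, an out-of-range rate such as `index_bits: 40` returns an error instead of attempting a very large allocation or an overflowing shift.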
The optional kv feature adds typed contracts, role-aware policy decisions, fixed-page metadata, receipts, synthetic attention-quality helpers, and CPU reference paths. It remains an experimental reference layer, not a production serving backend.
Validation
The release gate is:
When working from a dirty or parent-workspace overlay, use the checklist in RELEASE_CHECKLIST.md and publish only from a clean Git checkout.
Documentation
- docs/compression/FIBQUANT_MATH_CONFORMANCE.md records implemented math and validation boundaries.
- docs/compression/FIBQUANT_BENCHMARK_PLAN.md defines the receipts required before benchmark claims.
- docs/compression/FIBQUANT_PUBLICATION_NONCLAIMS.md lists forbidden public claims.
- docs/kv/KV_PRODUCTION_READINESS_REPORT.md describes the current default-off KV reference status.
Citation
If this crate is useful in your work, cite both this implementation and the FibQuant paper. A CITATION.cff file is included for citation tooling.
License
Licensed under the Apache License, Version 2.0. See LICENSE.