turboquant 0.1.1

Implementation of Google's TurboQuant algorithm for vector quantization
# Contributing

## Scope

TurboQuant is a Rust library for KV-cache quantization research. Contributions should prioritize correctness, typed errors, reproducible benchmarks, and honest documentation over feature count.

## Prerequisites

- Rust `1.87.0`
- The `rustfmt`, `clippy`, and `llvm-tools-preview` rustup components
- Optional for trace export: Python 3 with `torch`, `transformers`, `numpy`, and `safetensors`
- Optional for real-model ONNX export: Python 3 with the pinned packages in `scripts/requirements-real-model.txt`

## Local Setup

Run the formatting, lint, test, and example checks from the repository root:

```bash
cargo fmt -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
cargo check --examples --all-features
```

For coverage and dependency review (requires the `cargo-llvm-cov` and `cargo-audit` subcommands to be installed):

```bash
cargo llvm-cov --workspace --all-features --summary-only
cargo audit
```

## Benchmark Smoke Tests

Synthetic:

```bash
cargo run --release --example benchmark -- --workload synthetic --quick
```

Trace:

```bash
cargo run --release --example benchmark -- \
  --workload trace \
  --trace traces/example.safetensors
```

Real-model compare run:

```bash
cargo run --release --example benchmark -- \
  --workload real-model \
  --real-model-dir artifacts/smollm2-135m-onnx \
  --prompt "Explain KV-cache quantization in one sentence." \
  --real-eval-mode compare \
  --bits 4 \
  --real-key-strategy prod \
  --max-new-tokens 16
```

Experimental WGPU batch path:

```bash
cargo run --release --example benchmark --features gpu -- --workload synthetic --quick --backend wgpu
```

## Real-Model Export Setup

```bash
python3 -m venv .venv
. .venv/bin/activate
pip install -r scripts/requirements-real-model.txt
python3 scripts/export_hf_decoder_onnx.py \
  --preset smollm2-135m-instruct \
  --output-dir artifacts/smollm2-135m-onnx
```

## CI-Safe and Manual Tests

CI-safe:

- `cargo test --all-features` (includes the tiny checked-in ONNX fixture)

Manual, heavier smoke test (requires an exported bundle; see Real-Model Export Setup):

```bash
TURBOQUANT_REAL_MODEL_DIR=artifacts/smollm2-135m-onnx \
  cargo test --all-features manual_exported_real_model_smoke_test -- --ignored --nocapture
```
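
The split between CI-safe and manual tests can be expressed with Rust's `#[ignore]` attribute, which is what the `-- --ignored` flag above selects. The following is a minimal sketch of that pattern, not the repository's actual test: the helper `real_model_dir` is hypothetical, and the real test may read `TURBOQUANT_REAL_MODEL_DIR` differently.

```rust
use std::path::PathBuf;

/// Hypothetical helper for this sketch: interprets the raw value of
/// `TURBOQUANT_REAL_MODEL_DIR`. An unset or blank value yields `None`
/// rather than a guessed default path.
fn real_model_dir(raw: Option<String>) -> Option<PathBuf> {
    raw.filter(|value| !value.trim().is_empty())
        .map(PathBuf::from)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Skipped by a plain `cargo test`; runs only when `--ignored` is passed.
    #[test]
    #[ignore = "requires an exported ONNX bundle; run with -- --ignored"]
    fn manual_exported_real_model_smoke_test() {
        let dir = real_model_dir(std::env::var("TURBOQUANT_REAL_MODEL_DIR").ok())
            .expect("set TURBOQUANT_REAL_MODEL_DIR to an exported bundle");
        assert!(dir.is_dir(), "{} is not a directory", dir.display());
        // Assumption: the real test would load the bundle and run a short
        // generation here; that part is omitted from this sketch.
    }
}
```

Failing loudly when the variable is missing keeps a misconfigured manual run from being mistaken for a pass.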

## Contribution Rules

- Keep public APIs explicit about preconditions and failure modes.
- Do not silently clamp, truncate, or reinterpret malformed external input.
- Add or update tests for every bug fix and every new documented behavior.
- If you change measured behavior, update the benchmark path or documentation in the same pass.
- Keep experimental surfaces labeled as experimental until their guarantees materially improve.
- Do not relabel synthetic or trace workloads as real-model runs.
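
As an illustration of the first two rules, here is a minimal sketch of a constructor that states its precondition and returns a typed error instead of clamping. The names `BitWidth` and `BitWidthError` and the supported range of 1 to 8 bits are assumptions made for this example; they are not taken from the TurboQuant API.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; not part of the TurboQuant API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BitWidthError {
    /// The requested width lies outside the supported range.
    OutOfRange { requested: u32, min: u32, max: u32 },
}

impl fmt::Display for BitWidthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BitWidthError::OutOfRange { requested, min, max } => write!(
                f,
                "bit width {requested} is outside the supported range {min}..={max}"
            ),
        }
    }
}

impl std::error::Error for BitWidthError {}

/// A validated quantization bit width (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BitWidth(u8);

impl BitWidth {
    /// Assumed bounds, chosen only for illustration.
    pub const MIN: u32 = 1;
    pub const MAX: u32 = 8;

    /// Precondition: `MIN <= requested <= MAX`.
    /// Out-of-range input is rejected with an error, never clamped.
    pub fn new(requested: u32) -> Result<Self, BitWidthError> {
        if (Self::MIN..=Self::MAX).contains(&requested) {
            // The range check above guarantees the value fits in a u8.
            Ok(BitWidth(requested as u8))
        } else {
            Err(BitWidthError::OutOfRange {
                requested,
                min: Self::MIN,
                max: Self::MAX,
            })
        }
    }

    /// Returns the validated width.
    pub fn get(self) -> u8 {
        self.0
    }
}
```

A caller that passes `9` receives `BitWidthError::OutOfRange` and must decide what to do, rather than silently running at 8 bits. A regression test for such a fix would assert on the exact error variant.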

## Code Review Expectations

- Formatting and clippy must be clean.
- All tests must pass with `--all-features`.
- New user-visible behavior must be documented in `README.md` and `CHANGELOG.md`.
- Non-obvious design choices should be explained either in code comments or in `ARCHITECTURE.md`.

## Release Hygiene

Before tagging a release candidate:

- Update `CHANGELOG.md`.
- Update the status/limitations sections in `README.md`.
- Rerun the lint, test, coverage, audit, and benchmark smoke commands.
- Rerun at least one manual real-model smoke test on an exported bundle.