turboquant-rs 0.3.1

TurboQuant KV-Cache Quantization — 3-bit compression with zero accuracy loss (Zandieh et al., ICLR 2026)
Documentation
1
2
3
4
5
6
7
8
9
10
//! CUDA kernel wrappers for TurboQuant operations.
//!
//! All modules are gated behind `#[cfg(feature = "cuda")]`.

#[cfg(feature = "cuda")]
pub mod attention;
#[cfg(feature = "cuda")]
pub(crate) mod ffi;
#[cfg(feature = "cuda")]
pub(crate) mod quantize;