Crate qjl_sketch

QJL sketch — fast approximate attention scoring via sign-based vector compression.

Compresses key/value vectors using random projection sign hashing (QJL) and min-max scalar quantization, then stores them in append-only mmap-backed stores. Scoring is approximate inner product via packed sign bits; batched store-level scoring can be GPU-accelerated with the gpu feature.
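To make the sign-bit scoring idea concrete, here is a minimal, dependency-free sketch of a QJL-style estimator. It is not this crate's API: `Rng`, `projection`, `KeySketch`, `compress_key`, and `approx_score` are hypothetical names chosen for illustration, and a dense `f64` Gaussian projection is assumed. The key is reduced to packed sign bits plus its norm, the query stays in floating point, and the inner product is estimated as `sqrt(pi/2) / m * ||k|| * sum_i (s_i · q) * sign(s_i · k)`.

```rust
/// Tiny deterministic PRNG (xorshift64*), used only to keep the sketch crate-free.
/// The seed must be non-zero.
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        self.0.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Uniform sample in (0, 1], so that `ln` below is always finite.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn gaussian(&mut self) -> f64 {
        let u1 = self.uniform();
        let u2 = self.uniform();
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Hypothetical helper: an m x d matrix with i.i.d. N(0, 1) entries.
fn projection(m: usize, d: usize, seed: u64) -> Vec<Vec<f64>> {
    let mut rng = Rng(seed);
    let mut rows = Vec::with_capacity(m);
    for _ in 0..m {
        let mut row = Vec::with_capacity(d);
        for _ in 0..d {
            row.push(rng.gaussian());
        }
        rows.push(row);
    }
    rows
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Compressed key: one sign bit per projection row, plus the key's L2 norm.
struct KeySketch {
    bits: Vec<u64>,
    norm: f64,
}

/// Project the key and keep only the signs, packed 64 per word.
fn compress_key(s: &[Vec<f64>], key: &[f64]) -> KeySketch {
    let mut bits = vec![0u64; (s.len() + 63) / 64];
    for (i, row) in s.iter().enumerate() {
        if dot(row, key) >= 0.0 {
            bits[i / 64] |= 1u64 << (i % 64);
        }
    }
    KeySketch { bits, norm: dot(key, key).sqrt() }
}

/// Estimate <query, key> from the float projection of the query and the key's sign bits.
fn approx_score(s: &[Vec<f64>], query: &[f64], sketch: &KeySketch) -> f64 {
    let mut acc = 0.0;
    for (i, row) in s.iter().enumerate() {
        let p = dot(row, query);
        let positive = (sketch.bits[i / 64] >> (i % 64)) & 1 == 1;
        acc += if positive { p } else { -p };
    }
    (std::f64::consts::PI / 2.0).sqrt() * sketch.norm * acc / s.len() as f64
}
```

In this sketch, a caller would build the projection once with `projection(m, d, seed)`, call `compress_key` for each key as it is appended, and call `approx_score` per query. The estimate's error shrinks roughly as `1 / sqrt(m)`, so `m` trades storage for accuracy.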

Feature flags

  • serde — enables Serialize/Deserialize on all public structs and streaming store export/import.
  • gpu — enables GPU acceleration of KeyStore::scores via WGPU (batched float × sign).

Modules

codebook
error
math
mse_quant
outliers
quantize
quantizer
rotation
score
sketch
store
values