Module encoding

Vector encodings.

Three compression schemes are available:

  • Int8Quantized – per-vector scalar quantisation to u8, storing the per-vector minimum and scale alongside. Roughly 4x compression vs raw f32; reconstruction error is in the 0.5-1.0 percent range for typical embedding distributions.
  • Fp16 – IEEE 754 half-precision floats. 2x compression; reconstruction error around 0.05 percent.
  • Turbovec – 2-, 3-, or 4-bit data-oblivious quantisation backed by the turbovec crate’s TurboQuant codec. The compressed payload is held in a per-table SIMD-friendly index. The per-row EncodedVector holds the original f32 bytes, so round-trip and rehydration paths remain exact, while the in-memory ANN index uses the packed representation, which is 8x to 16x smaller than raw f32. See turbovec for the on-disk packed layout and the SIMD search kernels.
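
To make the Int8Quantized description concrete, here is a minimal sketch of per-vector scalar quantisation to u8 with a stored minimum and scale. It is not this crate's implementation: the names QuantizedSketch, quantize_u8 and dequantize_u8 are hypothetical, and the sketch assumes finite inputs and a simple min/max range mapped onto 0..=255.

```rust
/// Hypothetical payload for illustration: u8 codes plus the two
/// per-vector parameters needed to reconstruct approximate f32 values.
#[derive(Debug, Clone, PartialEq)]
pub struct QuantizedSketch {
    pub min: f32,
    pub scale: f32,
    pub codes: Vec<u8>,
}

/// Map each component onto one of 256 evenly spaced levels between the
/// vector's own minimum and maximum. Assumes every input is finite.
pub fn quantize_u8(values: &[f32]) -> QuantizedSketch {
    if values.is_empty() {
        return QuantizedSketch { min: 0.0, scale: 1.0, codes: Vec::new() };
    }
    let min = values.iter().copied().fold(f32::INFINITY, f32::min);
    let max = values.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    // A constant vector has zero range; fall back to scale 1.0 so the
    // division below is well defined and every code becomes 0.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let codes = values
        .iter()
        .map(|&v| ((v - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    QuantizedSketch { min, scale, codes }
}

/// Reconstruct approximate f32 values; the dimension count is preserved.
pub fn dequantize_u8(q: &QuantizedSketch) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + f32::from(c) * q.scale).collect()
}
```

Under these assumptions each component is reconstructed to within about half a quantisation step (scale / 2), and the payload is one byte per component plus two f32 parameters, which is where the roughly 4x figure comes from.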

The Int8Quantized and Fp16 encodings round-trip every vector to within their respective error budgets and preserve dimension count exactly. Turbovec round-trips losslessly at the row layer because the compressed representation lives in the table’s crate::index::TurboTable, alongside the row store, rather than in the row payload itself. Its quantisation loss surfaces in search-path scoring instead (see distance_turbovec).

The output of encode is self-describing: the EncodedVector wrapper records the dimension count, the codec identifier, and any per-vector parameters (for instance, the int8 minimum and scale), so an operator can run dynvec-cli inspect <id> and read a human-readable form without re-running the codec.
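
As a rough illustration of what "self-describing" can look like, the sketch below carries the dimension count, a codec tag, and per-vector parameters next to the payload, and renders them without decoding anything. The types CodecSketch and EncodedVectorSketch, their fields, and the output format are all assumptions made for this example; the real EncodedVector and Codec definitions may differ.

```rust
use std::fmt;

/// Hypothetical codec tag. Per-vector parameters ride along with the
/// variant that needs them (an assumption, not the crate's layout).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CodecSketch {
    Int8Quantized { min: f32, scale: f32 },
    Fp16,
    Turbovec { bits: u8 },
}

/// Hypothetical wrapper: everything needed to describe the payload
/// is stored beside the encoded bytes.
#[derive(Debug, Clone, PartialEq)]
pub struct EncodedVectorSketch {
    pub dim: usize,
    pub codec: CodecSketch,
    pub bytes: Vec<u8>,
}

impl fmt::Display for EncodedVectorSketch {
    /// Render a one-line summary from the stored metadata only;
    /// the payload is never decoded here.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let n = self.bytes.len();
        match self.codec {
            CodecSketch::Int8Quantized { min, scale } => write!(
                f,
                "codec=int8 dim={} min={} scale={} payload={}B",
                self.dim, min, scale, n
            ),
            CodecSketch::Fp16 => {
                write!(f, "codec=fp16 dim={} payload={}B", self.dim, n)
            }
            CodecSketch::Turbovec { bits } => write!(
                f,
                "codec=turbovec bits={} dim={} payload={}B",
                bits, self.dim, n
            ),
        }
    }
}
```

An inspection tool built on a wrapper like this only has to print the value (for example with to_string()) to show an operator the codec and its parameters.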

Structs§

EncodedVector
Encoded vector ready to persist or hand to a distance routine.
Fp16
IEEE 754 half-precision encoder.
Int8Quantized
Per-vector scalar quantisation to u8.
Turbovec
turbovec 2/3/4-bit TurboQuant encoder.

Enums§

Codec
Codec identifier.
EncodingError
Errors returned by encoders.

Traits§

Encoder
Encoder trait. Each codec ships exactly one impl.

Functions§

decode_turbovec
Decode a turbovec-encoded blob back to Vec<f32>.
distance_turbovec
Score query against a single turbovec-stored vector at the given Distance metric, using turbovec’s SIMD-accelerated search path against an ephemeral one-element index.
encode_turbovec
Encode a single f32 vector under the turbovec codec at the bit width given by bits.