Unified Quantized Vector layout — the cache-aligned superset format that absorbs binary / ternary (BitNet 1.58) / 4-bit scalar / residual codecs without polymorphic indirection in the hot path.
128-byte alignment spans a pair of adjacent 64-byte cache lines (64 bytes is also the width of one AVX-512 register) and avoids false sharing under thread-per-core execution.
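As a rough illustration of the alignment guarantee, the sketch below declares a 128-byte-aligned block type and checks the address of an allocation. `AlignedBlock`, `alloc_blocks`, and `is_128_aligned` are hypothetical names chosen for this example; they are not part of this module's API, and the real buffer type may be laid out differently.

```rust
/// Hypothetical 128-byte block (illustration only, not this module's type).
/// `align(128)` makes every value start on a 128-byte boundary, so one block
/// covers exactly two adjacent 64-byte cache lines.
#[repr(C, align(128))]
pub struct AlignedBlock(pub [u8; 128]);

/// Returns true when `ptr` sits on a 128-byte boundary.
pub fn is_128_aligned(ptr: *const u8) -> bool {
    (ptr as usize) % 128 == 0
}

/// Allocates `n` zeroed blocks. `Vec` honours the element type's alignment,
/// so the backing storage starts on a 128-byte boundary.
pub fn alloc_blocks(n: usize) -> Vec<AlignedBlock> {
    (0..n).map(|_| AlignedBlock([0u8; 128])).collect()
}
```

Because each block is both sized and aligned to 128 bytes, two threads that each own a distinct block never write to the same cache line.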
§Outlier bitmask limit
The outlier_bitmask in QuantHeader supports up to 64 outlier
dimensions per vector. Callers that need to mark more than 64 outliers
must bucket by 64-dim windows; multi-window support is out of scope for
this module.
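The limit above can be sketched with a plain `u64`. This is a minimal example that assumes bit `i` of the mask marks dimension `i` inside a 64-dimension window; the actual bit order used by `outlier_bitmask` is not stated here. `build_outlier_mask`, `is_outlier`, and `bucket_by_window` are hypothetical caller-side helpers, not functions exported by this module.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: builds a 64-bit mask from dimension indices.
/// Assumes bit `i` marks dimension `i`. Returns `None` when an index does
/// not fit in a single 64-bit mask.
pub fn build_outlier_mask(outlier_dims: &[usize]) -> Option<u64> {
    let mut mask = 0u64;
    for &dim in outlier_dims {
        if dim >= 64 {
            return None;
        }
        mask |= 1u64 << dim;
    }
    Some(mask)
}

/// Tests whether `dim` is marked in `mask`.
pub fn is_outlier(mask: u64, dim: usize) -> bool {
    dim < 64 && (mask >> dim) & 1 == 1
}

/// Hypothetical caller-side bucketing: groups global dimension indices into
/// 64-dimension windows, producing one mask per window index.
pub fn bucket_by_window(outlier_dims: &[usize]) -> BTreeMap<usize, u64> {
    let mut windows = BTreeMap::new();
    for &dim in outlier_dims {
        *windows.entry(dim / 64).or_insert(0u64) |= 1u64 << (dim % 64);
    }
    windows
}
```

A caller with outliers at dimensions 1, 65, and 130 would end up with three windows (0, 1, and 2), each carrying its own mask, and would have to store or process those windows itself.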
Structs§
- QuantHeader - 32-byte interleaved header preceding the packed bit array.
- UnifiedQuantizedVector - Owned, 128-byte-aligned unified quantized vector buffer.
- UnifiedQuantizedVectorRef - Zero-copy borrowed view into a UnifiedQuantizedVector buffer.
Enums§
- QuantMode - Quantization mode discriminator stored in the header.
Functions§
- target_size - Compute the buffer size (rounded up to a 128-byte multiple) required to hold a UnifiedQuantizedVector with the given parameters.
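The rounding that target_size describes can be sketched as follows. The parameters of the real function are not listed on this page, so `sketch_target_size` and its `dims` / `bits_per_dim` arguments are assumptions made for illustration. Only the 32-byte header and the 128-byte rounding come from the descriptions above, and any residual or outlier payload is ignored.

```rust
/// Alignment and header size taken from the descriptions above.
pub const ALIGN: usize = 128;
pub const HEADER_BYTES: usize = 32;

/// Rounds `n` up to the next multiple of `ALIGN` (a power of two).
pub fn round_up_to_align(n: usize) -> usize {
    (n + ALIGN - 1) & !(ALIGN - 1)
}

/// Hypothetical size calculation: header plus `dims * bits_per_dim` packed
/// bits, rounded up to a 128-byte multiple. The real `target_size` may take
/// different parameters and account for additional payload.
pub fn sketch_target_size(dims: usize, bits_per_dim: usize) -> usize {
    let packed_bytes = (dims * bits_per_dim + 7) / 8; // ceil(bits / 8)
    round_up_to_align(HEADER_BYTES + packed_bytes)
}
```

Under these assumptions, a 768-dimension binary vector needs 32 + 96 = 128 bytes and fits a single 128-byte block, while the same vector at 4 bits per dimension needs 32 + 384 = 416 bytes and rounds up to 512.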