Module layout

Unified Quantized Vector layout — the cache-aligned superset format that absorbs binary / ternary (BitNet 1.58) / 4-bit scalar / residual codecs without polymorphic indirection in the hot path.

128-byte alignment spans a pair of 64-byte cache lines (each line is the width of one AVX-512 register) and avoids false sharing under thread-per-core execution.
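As a rough illustration of the alignment requirement, the sketch below declares a 128-byte-aligned block with `#[repr(align(128))]` and checks a pointer against that boundary. `AlignedBlock` and `is_128_aligned` are hypothetical names used only for this example; they are not this module's types, and the real buffer layout may differ.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for one aligned block of storage.
/// This is NOT the module's actual type; it only demonstrates the attribute.
#[repr(C, align(128))]
pub struct AlignedBlock {
    pub bytes: [u8; 128],
}

/// Alignment and size the compiler assigns to the sketch type.
pub const BLOCK_ALIGN: usize = align_of::<AlignedBlock>();
pub const BLOCK_SIZE: usize = size_of::<AlignedBlock>();

/// Returns true if `ptr` sits on a 128-byte boundary.
pub fn is_128_aligned<T>(ptr: *const T) -> bool {
    (ptr as usize) % 128 == 0
}
```

Because the type's size is also a multiple of 128, consecutive blocks in an array never share a 128-byte region, which is the property that matters when each core owns its own vectors.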

§Outlier bitmask limit

The outlier_bitmask in QuantHeader supports up to 64 outlier dimensions per vector. Callers that need to mark more than 64 outliers must bucket by 64-dim windows; multi-window support is out of scope for this module.
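The following is a minimal sketch of how a 64-bit outlier mask and the 64-dim window bucketing could look on the caller's side. It assumes that bit `i` of the mask marks dimension `i` within a window, which the documentation above does not state explicitly. The helpers `outlier_mask`, `is_outlier`, and `windowed_masks` are hypothetical and not part of this module.

```rust
/// Hypothetical helper: builds a single 64-bit mask from dimension indices.
/// Returns `None` if any index does not fit in one 64-dim window.
pub fn outlier_mask(indices: &[usize]) -> Option<u64> {
    let mut mask = 0u64;
    for &i in indices {
        if i >= 64 {
            return None;
        }
        mask |= 1u64 << i;
    }
    Some(mask)
}

/// Tests whether dimension `dim` is flagged in `mask`.
/// Dimensions outside the 64-bit range are never reported as outliers.
pub fn is_outlier(mask: u64, dim: usize) -> bool {
    dim < 64 && (mask >> dim) & 1 == 1
}

/// Caller-side bucketing: one mask per 64-dim window.
/// Indices at or beyond `dims` are ignored.
pub fn windowed_masks(indices: &[usize], dims: usize) -> Vec<u64> {
    let mut masks = vec![0u64; (dims + 63) / 64];
    for &i in indices {
        if i < dims {
            masks[i / 64] |= 1u64 << (i % 64);
        }
    }
    masks
}
```

With this scheme, a 192-dimension vector produces three masks, and each mask could be stored with its own window since the header itself holds only one.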

Structs§

QuantHeader
32-byte interleaved header preceding the packed bit array.
UnifiedQuantizedVector
Owned, 128-byte-aligned unified quantized vector buffer.
UnifiedQuantizedVectorRef
Zero-copy borrowed view into a UnifiedQuantizedVector buffer.

Enums§

QuantMode
Quantization mode discriminator stored in the header.

Functions§

target_size
Compute the buffer size (rounded up to a 128-byte multiple) required to hold a UnifiedQuantizedVector with the given parameters.
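The rounding step can be sketched as below. The actual parameters of `target_size` are not shown on this page, so `sketch_target_size` and its `dims` / `bits_per_dim` arguments are assumptions made for illustration. The sketch also assumes the buffer is simply the 32-byte header followed by the packed bits, with no residual or outlier payload.

```rust
/// Alignment unit described in the module documentation.
pub const ALIGN: usize = 128;

/// Assumed header size, taken from the QuantHeader summary (32 bytes).
pub const HEADER_BYTES: usize = 32;

/// Rounds `n` up to the next multiple of 128 (a power of two, so masking works).
pub const fn round_up_128(n: usize) -> usize {
    (n + ALIGN - 1) & !(ALIGN - 1)
}

/// Hypothetical size computation: header plus bit-packed payload, padded to 128.
/// The real `target_size` may take different parameters and account for more fields.
pub const fn sketch_target_size(dims: usize, bits_per_dim: usize) -> usize {
    let payload_bytes = (dims * bits_per_dim + 7) / 8;
    round_up_128(HEADER_BYTES + payload_bytes)
}
```

Under these assumptions, a 768-dimension binary vector needs 32 + 96 = 128 bytes and fits in exactly one aligned block, while the same vector at 4 bits per dimension needs 416 bytes and is padded to 512.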