Quantization techniques for memory compression
This module provides tiered quantization strategies as specified in ADR-001:
| Quantization | Compression | Use Case |
|---|---|---|
| Scalar (u8) | 4x | Warm data (40-80% access) |
| Int4 | 8x | Cool data (10-40% access) |
| Product | 8-16x | Cold data (1-10% access) |
| Binary | 32x | Archive (<1% access) |
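To make the tiers concrete, the sketch below shows the idea behind the simplest one, scalar (u8) quantization: each f32 component is mapped onto 0..=255 using the vector's own min/max, so 4-byte floats become 1-byte codes plus two f32 parameters per vector (roughly 4x compression). The function names here are illustrative assumptions, not this module's API.

```rust
/// Minimal sketch of scalar (u8) quantization; not this module's actual API.
fn scalar_quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let scale = if max > min { 255.0 / (max - min) } else { 0.0 };
    let codes = v.iter().map(|&x| ((x - min) * scale).round() as u8).collect();
    (codes, min, max)
}

/// Approximate reconstruction used when comparing against the u8 codes.
fn scalar_dequantize(codes: &[u8], min: f32, max: f32) -> Vec<f32> {
    let step = if max > min { (max - min) / 255.0 } else { 0.0 };
    codes.iter().map(|&c| min + c as f32 * step).collect()
}

fn main() {
    let v = [0.2_f32, -1.0, 0.5, 3.0];
    let (codes, min, max) = scalar_quantize(&v);
    let approx = scalar_dequantize(&codes, min, max);
    // Reconstruction error stays within half a quantization step.
    assert!(v.iter().zip(&approx).all(|(a, b)| (a - b).abs() < 0.02));
}
```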
§Performance Optimizations v2
- SIMD-accelerated distance calculations for scalar (int8) quantization
- SIMD popcnt for binary hamming distance
- 4x loop unrolling for better instruction-level parallelism
- Separate accumulator strategy to reduce data dependencies (sketched below)
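The following is a portable sketch of the unrolling and accumulator ideas above, not the module's actual SIMD kernels: four independent accumulators break the dependency chain of a single running sum so the CPU can overlap the multiply-adds, and `count_ones` on packed u64 words compiles to a hardware popcnt where available.

```rust
/// Illustrative 4x-unrolled int8 dot product with separate accumulators.
fn dot_i8_unrolled(a: &[i8], b: &[i8]) -> i32 {
    assert_eq!(a.len(), b.len());
    let (mut s0, mut s1, mut s2, mut s3) = (0i32, 0i32, 0i32, 0i32);
    let chunks = a.len() / 4 * 4;
    let mut i = 0;
    while i < chunks {
        s0 += a[i] as i32 * b[i] as i32;
        s1 += a[i + 1] as i32 * b[i + 1] as i32;
        s2 += a[i + 2] as i32 * b[i + 2] as i32;
        s3 += a[i + 3] as i32 * b[i + 3] as i32;
        i += 4;
    }
    // Tail elements that do not fill a group of four.
    for j in chunks..a.len() {
        s0 += a[j] as i32 * b[j] as i32;
    }
    s0 + s1 + s2 + s3
}

/// Hamming distance over packed binary codes: XOR, then population count.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```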
Structs§
- BinaryQuantized - Binary quantization (32x compression)
- Int4Quantized - Int4 quantization (8x compression)
- ProductQuantized - Product quantization (8-16x compression)
- ScalarQuantized - Scalar quantization to int8 (4x compression)
Traits§
- QuantizedVector - Trait for quantized vector representations
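The trait's methods are not listed on this page. As a mental model only, a quantized-vector abstraction over the four structs above might expose operations like the following; the name and signatures here are assumptions, not the crate's actual `QuantizedVector` API.

```rust
/// Hypothetical shape of a quantized-vector trait; the real `QuantizedVector`
/// trait in this module may differ. Shown only to make the relationship
/// between the trait and the four quantized structs concrete.
trait QuantizedVectorSketch {
    /// Quantize a full-precision vector into this representation.
    fn quantize(v: &[f32]) -> Self
    where
        Self: Sized;
    /// Approximate distance to a full-precision query.
    fn distance(&self, query: &[f32]) -> f32;
    /// Compressed size in bytes, useful when picking a tier against a memory budget.
    fn memory_bytes(&self) -> usize;
}
```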