//! # oxillama-quant
//!
//! Quantization kernel library for OxiLLaMa.
//!
//! Provides dequantization and fused matmul operations for all GGUF
//! quantization formats. Each format has three implementation tiers:
//!
//! 1. **Reference (naive)** — Pure scalar Rust for correctness.
//! 2. **Portable SIMD** — Cross-platform vectorization.
//! 3. **Platform SIMD** — AVX2, AVX-512, NEON intrinsics.
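//!
//! As an illustration of the reference tier, the following is a minimal
//! sketch of scalar dequantization for a Q8_0-style block (32 signed 8-bit
//! quants sharing one scale). The names `Q8Block` and `dequantize_q8_0` are
//! hypothetical and are not this crate's API. The scale is held as `f32` to
//! keep the example std-only, whereas GGUF stores it as `f16`.
//!
//! ```rust
//! /// Hypothetical block layout for illustration only:
//! /// one shared scale plus 32 signed 8-bit quantized values.
//! /// Assumption: `f32` scale (GGUF's Q8_0 stores the scale as `f16`).
//! struct Q8Block {
//!     scale: f32,
//!     quants: [i8; 32],
//! }
//!
//! /// Scalar reference dequantization: each output is `scale * quant`.
//! fn dequantize_q8_0(blocks: &[Q8Block]) -> Vec<f32> {
//!     let mut out = Vec::with_capacity(blocks.len() * 32);
//!     for block in blocks {
//!         // Widen each i8 to f32 losslessly, then apply the block scale.
//!         out.extend(block.quants.iter().map(|&q| block.scale * f32::from(q)));
//!     }
//!     out
//! }
//! ```
//!
//! A plain scalar loop like this is slow, but it is easy to audit, so it can
//! serve as the baseline that the SIMD tiers are checked against.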
//!
//! ## Supported Formats (planned)
//!
//! | Category | Types |
//! |----------|-------|
//! | Legacy | Q4_0, Q4_1, Q5_0, Q5_1, Q8_0, Q8_1 |
//! | K-Quants | Q2_K, Q3_K, Q4_K, Q5_K, Q6_K |
//! | I-Quants | IQ1_S, IQ1_M, IQ2_XXS, IQ2_XS, IQ2_S, IQ3_XXS, IQ3_S, IQ4_XS, IQ4_NL |
//! | 1-Bit | Q1_0_G128 (from OxiBonsai) |
//! | Float | F16, BF16, F32 |
pub use ;
pub use ;
pub use LoraAdapter;
pub use ;
pub use QuantKernel;
pub use ;