//! AVX-512 accelerated quantization kernels (x86_64 only, `simd-avx512` feature).
//!
//! All kernels in this module require the `avx512f` CPU feature and are
//! guarded by `#[target_feature(enable = "avx512f")]` on their inner
//! functions. The [`crate::dispatch::KernelDispatcher`] checks for AVX-512
//! support at runtime before constructing any of these kernels.
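//!
//! The guard pattern described above can be sketched roughly as follows. The
//! `sum_f32` and `sum_f32_avx512` functions are hypothetical and exist only
//! for illustration; they are not part of this crate, and the real dispatcher
//! may be structured differently. The sketch assumes a toolchain on which the
//! AVX-512 intrinsics in `std::arch` are available.
//!
//! ```rust
//! /// Hypothetical safe entry point: probes the CPU, then picks a code path.
//! fn sum_f32(xs: &[f32]) -> f32 {
//!     #[cfg(target_arch = "x86_64")]
//!     {
//!         if is_x86_feature_detected!("avx512f") {
//!             // SAFETY: avx512f support was verified at runtime just above.
//!             return unsafe { sum_f32_avx512(xs) };
//!         }
//!     }
//!     // Scalar fallback for CPUs (or targets) without AVX-512.
//!     xs.iter().sum()
//! }
//!
//! /// Hypothetical inner kernel: only sound to call when avx512f is present.
//! #[cfg(target_arch = "x86_64")]
//! #[target_feature(enable = "avx512f")]
//! unsafe fn sum_f32_avx512(xs: &[f32]) -> f32 {
//!     use std::arch::x86_64::*;
//!     let mut acc = _mm512_setzero_ps();
//!     let mut chunks = xs.chunks_exact(16);
//!     for c in &mut chunks {
//!         // Each chunk holds exactly 16 f32 values, so a 512-bit load is in bounds.
//!         acc = _mm512_add_ps(acc, unsafe { _mm512_loadu_ps(c.as_ptr()) });
//!     }
//!     // Horizontal sum of the 16 lanes, plus the scalar tail.
//!     _mm512_reduce_add_ps(acc) + chunks.remainder().iter().sum::<f32>()
//! }
//! ```
//!
//! Keeping the feature check in a safe wrapper means callers never touch the
//! `unsafe` inner function directly.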
//!
//! ## Kernels
//!
//! | Struct | Format | Block size (weights) | Block bytes | Throughput vs AVX2 |
//! |--------|--------|-----------|-------------|-------------------|
//! | [`Q4_0Avx512`] | Q4_0 | 32 | 18 | ~2× |
//! | [`Q4_1Avx512`] | Q4_1 | 32 | 20 | ~2× |
//! | [`Q8_0Avx512`] | Q8_0 | 32 | 34 | ~2× |
//! | [`Q8_1Avx512`] | Q8_1 | 32 | 36 | ~2× |
//! | [`Q2_KAvx512`] | Q2_K | 256 | 84 | ~2× |
//! | [`Q3_KAvx512`] | Q3_K | 256 | 110 | ~2× |
//! | [`Q4_KAvx512`] | Q4_K | 256 | 144 | ~2× |
//! | [`Q5_KAvx512`] | Q5_K | 256 | 176 | ~2× |
//! | [`Q5_1Avx512`] | Q5_1 | 32 | 24 | ~2× |
//! | [`Q6_KAvx512`] | Q6_K | 256 | 210 | ~2× |
//! | [`Q1_0G128Avx512`] | Q1_0_G128 | 128 | 18 | ~2× |
//! | [`Tq1_0Avx512`] | TQ1_0 | 256 | 54 | ~2× |
//! | [`Tq2_0Avx512`] | TQ2_0 | 256 | 66 | ~2× |
//! | [`Q5_0Avx512`] | Q5_0 | 32 | 22 | ~2× |
//! | [`Q8_KAvx512`] | Q8_K | 256 | 292 | ~2× |
//! | [`Iq2XxsAvx512`] | IQ2_XXS | 256 | 66 | ~2× |
//! | [`Iq2XsAvx512`] | IQ2_XS | 256 | 74 | ~2× |
//! | [`Iq3SAvx512`] | IQ3_S | 256 | 110 | ~2× |
//! | [`Iq4XsAvx512`] | IQ4_XS | 256 | 136 | ~2× |
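//!
//! The two size columns together give the storage cost of a format. As a
//! rough sketch, the helpers below (hypothetical names, not part of this
//! crate) derive the effective bits per weight and the buffer size for a
//! tensor, assuming the weight count must be a whole number of blocks.
//!
//! ```rust
//! /// Hypothetical helper: effective bits per weight, scales included.
//! fn bits_per_weight(block_size: usize, block_bytes: usize) -> f64 {
//!     (block_bytes * 8) as f64 / block_size as f64
//! }
//!
//! /// Hypothetical helper: bytes needed for `n` weights, or `None` when `n`
//! /// is not a multiple of the block size (an assumption of this sketch).
//! fn buffer_bytes(n: usize, block_size: usize, block_bytes: usize) -> Option<usize> {
//!     (n % block_size == 0).then(|| n / block_size * block_bytes)
//! }
//! ```
//!
//! With the table's values, Q4_0 (32 weights in 18 bytes) and Q4_K (256
//! weights in 144 bytes) both come to 4.5 bits per weight.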
pub use Iq2XsAvx512;
pub use Iq2XxsAvx512;
pub use Iq3SAvx512;
pub use Iq4XsAvx512;
pub use Q1_0G128Avx512;
pub use Q2_KAvx512;
pub use Q3_KAvx512;
pub use Q4_0Avx512;
pub use Q4_1Avx512;
pub use Q4_KAvx512;
pub use Q5_0Avx512;
pub use Q5_1Avx512;
pub use Q5_KAvx512;
pub use Q6_KAvx512;
pub use Q8_0Avx512;
pub use Q8_1Avx512;
pub use Q8_KAvx512;
pub use Tq1_0Avx512;
pub use Tq2_0Avx512;