// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.
//! Block-transposed SIMD kernels for multi-vector distance computation.
//!
//! This module provides a SIMD-accelerated implementation that uses a block-transposed
//! memory layout for **query** vectors (rather than documents); documents remain
//! in row-major format.
//!
//! # Memory Layout
//!
//! - **Query**: Block-transposed (`GROUP` vectors per block, dimensions contiguous
//! within each block). The block size is determined by the kernel's `A_PANEL`.
//! - **Document**: Row-major (standard [`MatRef`](crate::multi_vector::MatRef) format).
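//!
//! As a rough illustration of the query layout (a sketch only: `GROUP = 2`,
//! `dim = 2`, and the exact element order are assumptions made for this example,
//! not values taken from the kernels), four query vectors `q0..q3` would be
//! stored as two blocks:
//!
//! ```text
//! row-major:         q0[0] q0[1] | q1[0] q1[1] | q2[0] q2[1] | q3[0] q3[1]
//! block-transposed:  q0[0] q1[0]   q0[1] q1[1] | q2[0] q3[0]   q2[1] q3[1]
//! ```
//!
//! Under those assumptions, element `d` of query `r` would sit at offset
//! `(r / GROUP) * GROUP * dim + d * GROUP + (r % GROUP)`, so a single contiguous
//! load at a fixed `d` would fetch that dimension for every query in the block.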
pub
pub
// ── Tile budget ──────────────────────────────────────────────────
/// Cache budgets fed to the tile planner.
///
/// `Default` returns the production budgets derived from hardcoded L1/L2
/// cache-size estimates and fixed fractions.
// ── Kernel trait ─────────────────────────────────────────────────
/// SIMD micro-kernel for the [`tiled_reduce`](tiled_reduce::tiled_reduce) loop.
///
/// The kernel only sees already-converted data: storage-layout to
/// kernel-layout conversion is handled at tile boundaries by
/// [`ConvertTo`](layouts::ConvertTo), so implementors can assume their input
/// pointers reference `<Self::Left as Layout>::Element` /
/// `<Self::Right as Layout>::Element` directly.
///
/// # Safety
///
/// Implementors must respect the per-method `# Safety` contracts on
/// [`full_panel`](Self::full_panel) and [`partial_panel`](Self::partial_panel).
unsafe