//! Layout / arch / epilogue / activation tags shared across kernel
//! families.
//!
//! These are pure descriptor enums that don't carry generic parameters;
//! they appear in plan descriptors, in [`crate::KernelSku`] (TBD) /
//! `GemmSku`, and in selector preference fields.
/// Layout SKU. Describes the row/column orientation of A, B, C, and D
/// for matrix-multiply-shaped kernels.
///
/// **Intentionally NOT `#[non_exhaustive]`** — the GEMM layout space
/// is essentially closed in practice (row-major / column-major
/// permutations of A, B, C/D); the two wired variants cover the
/// dispatch space `baracuda-cutlass` selects against. New variants
/// would be a deliberate breaking change with a major-version bump.
/// Compute capability bucket the selected kernel was compiled for.
///
/// **Intentionally NOT `#[non_exhaustive]`** — the cutlass GEMM
/// dispatchers exhaustively match on this enum to pick per-arch
/// kernel SKUs; adding a new arch (Blackwell `Sm100a` is tracked in
/// the ROADMAP) deserves to surface as a build break across every
/// match site so each can decide whether to JIT-forward or add a
/// dedicated variant. New variants are a deliberate
/// breaking-change event.
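/**
A minimal sketch of the build-break behaviour described above. `ArchSketch`,
its two variants, and `kernel_suffix` are hypothetical stand-ins for
illustration only; they are not this crate's real enum or dispatcher.

```rust
// Hypothetical stand-in for the arch enum; the variant names are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArchSketch {
    Sm80,
    Sm90,
}

// No wildcard arm: adding a variant to `ArchSketch` makes this match
// non-exhaustive, so the compiler rejects it until the new arch is handled.
fn kernel_suffix(arch: ArchSketch) -> &'static str {
    match arch {
        ArchSketch::Sm80 => "sm80",
        ArchSketch::Sm90 => "sm90",
    }
}
```

Had the enum been `#[non_exhaustive]`, downstream crates would be forced to
write a `_ =>` arm, and a newly added arch would fall into it silently.
*/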
/// Epilogue applied after the matrix-multiply accumulation.
///
/// The four `Bias*` variants share one kernel family: all of them fuse
/// the bias add into the output epilogue, and `BiasRelu`, `BiasGelu`,
/// and `BiasSilu` additionally apply the named activation function
/// before the store. Those three therefore deliver the full
/// `y = activation(W·x + b)` transformer-Linear pipeline in a single
/// kernel pass, with no extra memory traffic versus plain `Bias`.
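/**
A scalar CPU sketch of what a fused bias-plus-activation epilogue computes
per output element. `ActivationSketch`, `activate`, and `fused_epilogue` are
hypothetical names, and the tanh form of GELU is an assumption; the real
kernels may use a different formulation.

```rust
// Hypothetical stand-in for the three activations named above.
#[derive(Clone, Copy, Debug)]
enum ActivationSketch {
    Relu,
    Gelu,
    Silu,
}

fn activate(kind: ActivationSketch, v: f32) -> f32 {
    match kind {
        ActivationSketch::Relu => v.max(0.0),
        // tanh approximation of GELU (assumed; an erf form is also common).
        ActivationSketch::Gelu => {
            0.5 * v * (1.0 + (0.797_884_56 * (v + 0.044_715 * v * v * v)).tanh())
        }
        // SiLU: v * sigmoid(v).
        ActivationSketch::Silu => v / (1.0 + (-v).exp()),
    }
}

// Adds the bias to each accumulator and applies the activation in the same
// pass, so the intermediate `acc + bias` value is never stored.
fn fused_epilogue(acc: &[f32], bias: &[f32], kind: ActivationSketch) -> Vec<f32> {
    acc.iter()
        .zip(bias)
        .map(|(a, b)| activate(kind, a + b))
        .collect()
}
```

For example, `fused_epilogue(&[-1.0, 2.0], &[0.5, 0.5], ActivationSketch::Relu)`
yields `[0.0, 2.5]`.
*/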
// EpilogueKind is intentionally NOT `#[non_exhaustive]` — the cutlass
// GEMM dispatchers exhaustively match `(LayoutSku, EpilogueKind)` to
// pick per-fused-epilogue kernel SKUs. Adding a new epilogue (e.g.
// `BiasTanh`, `BiasSigmoid`) deserves to surface as a build break
// across every match site so each branch can choose to wire it or
// reject it. New variants are a deliberate breaking-change event.
/// Activation functions applied by the `Bias<Activation>`
/// [`EpilogueKind`] variants (`BiasRelu`, `BiasGelu`, `BiasSilu`).
/// Surfaced for telemetry and selector logic; kernel selection itself
/// is driven by the [`EpilogueKind`] variant.
///
/// **Intentionally NOT `#[non_exhaustive]`** — paired with
/// [`EpilogueKind`] which is also left exhaustive. Adding a new
/// activation requires shipping a matching `Bias<Activation>` epilogue
/// kernel, which is a deliberate breaking-change event.
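/**
A sketch of how the pairing described above could be expressed. Only the
`Bias`, `BiasRelu`, `BiasGelu`, and `BiasSilu` names come from the docs in
this file; `EpilogueSketch`, `ActivationSketch`, and `activation_of` are
hypothetical, and the real enums may carry further variants.

```rust
// Hypothetical stand-ins for the two paired enums.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ActivationSketch {
    Relu,
    Gelu,
    Silu,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EpilogueSketch {
    Bias,
    BiasRelu,
    BiasGelu,
    BiasSilu,
}

// Exhaustive on purpose: a new `Bias<Activation>` epilogue cannot compile
// until it is mapped to an activation here.
fn activation_of(epilogue: EpilogueSketch) -> Option<ActivationSketch> {
    match epilogue {
        EpilogueSketch::Bias => None,
        EpilogueSketch::BiasRelu => Some(ActivationSketch::Relu),
        EpilogueSketch::BiasGelu => Some(ActivationSketch::Gelu),
        EpilogueSketch::BiasSilu => Some(ActivationSketch::Silu),
    }
}
```

A mapping like this keeps the activation available for telemetry while the
epilogue variant stays the single source of truth for dispatch.
*/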