Binary elementwise plan.
Phase 3 trailblazer surface for the baracuda-kernels elementwise op
family (category C from the comprehensive plan). Mirrors the shape
of crate::IntGemmPlan (descriptor + args + select/can_implement/
run/sku/precision_guarantee) but for arbitrary-rank tensors with no
GEMM-style accumulator / epilogue chain.
Today only the Add op on f32 over fully-contiguous tensors of
matching shape is wired — this is the Phase 3 trailblazer SKU. Other
binary ops (BinaryKind::Sub, Mul, Div, …) and other dtypes
(f16, bf16, f64, integer family) join in fanout sessions; the
Add instantiation in baracuda-kernels-sys is the template
pattern they follow.
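As a rough illustration of the gating described above, the sketch below shows how a can_implement-style check might accept only the Add-on-f32 SKU and reject everything else. The BinaryKind variant names come from the text; DType, SketchDescriptor, and the free function can_implement are hypothetical stand-ins and are not the crate's actual types or signatures.

```rust
/// Binary op kinds named in the module docs.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BinaryKind {
    Add,
    Sub,
    Mul,
    Div,
}

/// Hypothetical dtype enum, for illustration only.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DType {
    F16,
    Bf16,
    F32,
    F64,
}

/// Hypothetical descriptor; the real BinaryDescriptor may carry more fields.
struct SketchDescriptor {
    kind: BinaryKind,
    dtype: DType,
}

/// Accept only the single SKU the docs describe as wired today (Add on f32).
/// Assumption: unsupported combinations are rejected up front rather than
/// failing at launch time.
fn can_implement(desc: &SketchDescriptor) -> bool {
    matches!((desc.kind, desc.dtype), (BinaryKind::Add, DType::F32))
}
```

Under these assumptions, adding a new op or dtype in a later session would amount to adding another arm to the match once the corresponding kernel instantiation exists.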
Broadcasting is supported: operands with stride[d] = 0 on a
broadcast axis route through a strided kernel path that handles
arbitrary per-axis strides (broadcast, transposed views, arbitrary
strided slices). The dispatcher chooses between the contiguous and
strided paths at run time, based on is_contiguous() for all three
operands.
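The stride-0 broadcast convention and the contiguous/strided dispatch can be sketched on the host as follows. This is a CPU-side illustration only, not the kernel code: the function names and signatures are hypothetical, strides are in elements and assumed non-negative, and the output is always allocated as a contiguous row-major buffer (so only the two inputs are checked here).

```rust
/// Row-major contiguity check: the innermost stride is 1 and every outer
/// stride equals the product of the inner extents. Axes of extent 1 are
/// ignored, since their stride never affects addressing.
fn is_contiguous(shape: &[usize], strides: &[usize]) -> bool {
    let mut expected = 1usize;
    for (&dim, &stride) in shape.iter().zip(strides).rev() {
        if dim != 1 && stride != expected {
            return false;
        }
        expected *= dim;
    }
    true
}

/// Map a row-major linear index to an element offset using per-axis strides.
/// A stride of 0 pins the offset along that axis, which is how a broadcast
/// operand reuses the same elements.
fn offset(mut linear: usize, shape: &[usize], strides: &[usize]) -> usize {
    let mut off = 0;
    for (&dim, &stride) in shape.iter().zip(strides).rev() {
        off += (linear % dim) * stride;
        linear /= dim;
    }
    off
}

/// Elementwise f32 add over a common output shape, choosing the fast path
/// when both inputs are contiguous and the strided path otherwise.
fn add_f32(
    shape: &[usize],
    a: &[f32],
    a_strides: &[usize],
    b: &[f32],
    b_strides: &[usize],
) -> Vec<f32> {
    let n: usize = shape.iter().product();
    if is_contiguous(shape, a_strides) && is_contiguous(shape, b_strides) {
        // Contiguous path: a flat walk over both buffers.
        return a[..n].iter().zip(&b[..n]).map(|(x, y)| x + y).collect();
    }
    // Strided path: recompute each operand's offset from its own strides.
    (0..n)
        .map(|i| a[offset(i, shape, a_strides)] + b[offset(i, shape, b_strides)])
        .collect()
}
```

For example, adding a length-3 row to a 2x3 matrix uses shape [2, 3] with strides [3, 1] for the matrix and [0, 1] for the row. The row fails the contiguity check, so the call takes the strided path.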
Structs
- BinaryArgs - Args bundle for a binary elementwise launch.
- BinaryDescriptor - Descriptor for a binary elementwise op.
- BinaryPlan - Binary elementwise plan.