pub struct BatchedGemmArgs<'a, T>
where
    T: Element,
{
    pub a: MatrixRef<'a, T>,
    pub stride_a: i64,
    pub b: MatrixRef<'a, T>,
    pub stride_b: i64,
    pub c: Option<MatrixRef<'a, T>>,
    pub stride_c: i64,
    pub d: MatrixMut<'a, T>,
    pub stride_d: i64,
    pub alpha: <T as Element>::Scalar,
    pub beta: <T as Element>::Scalar,
}
Per-launch arguments for a
BatchedGemmPlan::run call.
The stride_* fields are measured in elements, not bytes, matching CUTLASS’s
GemmBatched API. Pass 0 as a stride to reuse the same matrix across all
batches (broadcast).
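
The stride arithmetic described above can be sketched as follows. This is a minimal illustration only: `batch_offset` and `batch_byte_offset` are hypothetical helpers, not part of this crate, and the element size passed to the second one is supplied by the caller.

```rust
/// Element offset of batch `batch` relative to the batch-0 base pointer.
/// Hypothetical helper for illustration; not part of the crate.
/// A stride of 0 maps every batch to offset 0, i.e. the same matrix is
/// reused for all batches (broadcast).
pub fn batch_offset(batch: i64, stride: i64) -> i64 {
    batch * stride
}

/// The same offset expressed in bytes, to make the unit difference explicit:
/// strides count elements, so the byte offset scales with the element size.
pub fn batch_byte_offset(batch: i64, stride: i64, elem_size: usize) -> i64 {
    batch_offset(batch, stride) * elem_size as i64
}
```

For densely packed M×K matrices, a stride of M * K elements places consecutive batches back to back. With 2-byte elements such as f16, that same stride corresponds to 2 * M * K bytes per batch.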
Fields

a: MatrixRef<'a, T>
Left input: base pointer for batch 0.

stride_a: i64
Element offset between consecutive A batches.

b: MatrixRef<'a, T>
Right input: base pointer for batch 0.

stride_b: i64
Element offset between consecutive B batches.

c: Option<MatrixRef<'a, T>>
Optional accumulation source.

stride_c: i64
Element offset between consecutive C batches. Ignored when c is None.

d: MatrixMut<'a, T>
Output: base pointer for batch 0.

stride_d: i64
Element offset between consecutive D batches.

alpha: <T as Element>::Scalar
α multiplier (shared across batches). Scalar type matches T::Scalar: f32 for f16/bf16/f32/F32Strict, f64 for f64.

beta: <T as Element>::Scalar
β multiplier (shared across batches). Forced to 0 internally when c is None.
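
To illustrate how the fields interact, here is a naive CPU reference, assuming the conventional GEMM epilogue D_i = α · A_i · B_i + β · C_i for each batch i. It is a sketch under stated assumptions and not the crate's implementation: `batched_gemm_ref` is a hypothetical function, plain f32 slices stand in for MatrixRef and MatrixMut, strides are taken as usize for indexing, and row-major dense layout is assumed.

```rust
/// Naive reference for a strided batched GEMM over row-major f32 buffers.
/// Hypothetical helper for illustration; not part of the crate.
/// A is m x k, B is k x n, C and D are m x n. Strides are in elements.
#[allow(clippy::too_many_arguments)]
pub fn batched_gemm_ref(
    m: usize,
    n: usize,
    k: usize,
    batches: usize,
    a: &[f32],
    stride_a: usize,
    b: &[f32],
    stride_b: usize,
    c: Option<(&[f32], usize)>, // (buffer, stride_c); the stride is unused when None
    d: &mut [f32],
    stride_d: usize,
    alpha: f32,
    beta: f32,
) {
    // Mirror the documented rule: without a C operand, beta is treated as 0.
    let beta = if c.is_some() { beta } else { 0.0 };

    for batch in 0..batches {
        // A stride of 0 makes every batch read the same matrix (broadcast).
        let a_i = &a[batch * stride_a..];
        let b_i = &b[batch * stride_b..];

        for row in 0..m {
            for col in 0..n {
                // Dot product of one row of A_i with one column of B_i.
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += a_i[row * k + p] * b_i[p * n + col];
                }
                let c_val = match c {
                    Some((c_buf, stride_c)) => c_buf[batch * stride_c + row * n + col],
                    None => 0.0,
                };
                d[batch * stride_d + row * n + col] = alpha * acc + beta * c_val;
            }
        }
    }
}
```

In this sketch, passing None for c with a nonzero beta gives the same result as beta = 0, which matches the behaviour documented for the beta field.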