pub struct RepeatBackwardPlan<T: Element, const N: usize> { /* private fields */ }
Repeat backward plan.
Adjoint of crate::RepeatPlan:
dx[c_in] = Σ_k dy[c_in + k · input_shape], where k ranges over every multi-index with 0 ≤ k[d] < repeats[d] and the product is elementwise.
Every dy cell that maps back to c_in under the forward pass's modulo contributes. One thread
per dx cell sweeps the prod(repeats[d]) contributing cells.
f16 / bf16 accumulate in f32 internally; f32 / f64 accumulate in
their native dtype.
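The sum above can be sketched as a CPU reference. This is a minimal illustration, not the crate's kernel: the function name `repeat_backward_ref` is hypothetical, a contiguous row-major layout is assumed for both dy and dx, and the element type is fixed to f32. Each iteration of the outer loop plays the role of the one thread assigned to a dx cell.

```rust
/// Hypothetical CPU reference for the repeat backward pass.
/// Assumes contiguous row-major storage; not part of the crate's API.
fn repeat_backward_ref(dy: &[f32], input_shape: &[usize], repeats: &[usize]) -> Vec<f32> {
    let rank = input_shape.len();
    assert_eq!(rank, repeats.len());

    // The forward output (and therefore dy) has extent input_shape[d] * repeats[d].
    let out_shape: Vec<usize> = input_shape
        .iter()
        .zip(repeats)
        .map(|(s, r)| s * r)
        .collect();
    assert_eq!(dy.len(), out_shape.iter().product::<usize>());

    let in_len: usize = input_shape.iter().product();
    let total_reps: usize = repeats.iter().product();
    let mut dx = vec![0.0f32; in_len];

    for (flat_in, slot) in dx.iter_mut().enumerate() {
        // Decode the flat dx index into the coordinate c_in.
        let mut c_in = vec![0usize; rank];
        let mut rem = flat_in;
        for d in (0..rank).rev() {
            c_in[d] = rem % input_shape[d];
            rem /= input_shape[d];
        }

        // Sweep all prod(repeats[d]) contributing dy cells in a fixed order.
        let mut acc = 0.0f32;
        for flat_k in 0..total_reps {
            // Decode the flat repeat index into the multi-index k.
            let mut k = vec![0usize; rank];
            let mut rem_k = flat_k;
            for d in (0..rank).rev() {
                k[d] = rem_k % repeats[d];
                rem_k /= repeats[d];
            }

            // Flat row-major index of dy[c_in + k * input_shape].
            let mut flat_out = 0usize;
            for d in 0..rank {
                flat_out = flat_out * out_shape[d] + c_in[d] + k[d] * input_shape[d];
            }
            acc += dy[flat_out];
        }
        *slot = acc;
    }
    dx
}
```

For example, with input_shape = [2], repeats = [3] and dy = [1, 2, 3, 4, 5, 6], the sketch returns dx = [9, 12]: dx[0] gathers dy[0], dy[2], dy[4] and dx[1] gathers dy[1], dy[3], dy[5].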
When to use: backward pass for RepeatPlan.
Dtypes: {f32, f64, f16, bf16}.
Shape limits: rank in [1, 8]; repeats[d] ≥ 1.
Workspace: none.
Precision guarantee: deterministic (no atomics; one thread per output cell, deterministic iteration order). Conservatively reported as not bit-stable because summation order affects the result in floating-point arithmetic and a future refactor might reorder the inner loop.
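To illustrate why summation order matters, the following sketch sums the same three f32 values in two different orders. The helper `sum_in_order` is hypothetical and only demonstrates the non-associativity of floating-point addition; it says nothing about the order the kernel actually uses.

```rust
/// Hypothetical helper: left-to-right f32 accumulation in the given order.
fn sum_in_order(values: &[f32]) -> f32 {
    values.iter().fold(0.0f32, |acc, v| acc + v)
}

/// Returns the same three values summed in two different orders.
fn reorder_demo() -> (f32, f32) {
    // 1.0 is below the f32 spacing near 1.0e8, so it is absorbed
    // when it is added before the large terms cancel.
    let absorbed = sum_in_order(&[1.0e8, 1.0, -1.0e8]);
    // Cancelling the large terms first preserves the small one.
    let preserved = sum_in_order(&[1.0e8, -1.0e8, 1.0]);
    (absorbed, preserved)
}
```

Here `reorder_demo()` returns (0.0, 1.0): mathematically equal sums that differ in f32. A fixed iteration order makes a result reproducible from run to run, but changing that order can change the bits.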
Implementations
impl<T: Element, const N: usize> RepeatBackwardPlan<T, N>
pub fn select( _stream: &Stream, desc: &RepeatBackwardDescriptor<N>, _pref: PlanPreference, ) -> Result<Self>
Pick a kernel for desc.
pub fn can_implement(&self, args: &RepeatBackwardArgs<'_, T, N>) -> Result<()>
Validate args.
pub fn workspace_size(&self) -> usize
Workspace size in bytes. Always 0.
pub fn precision_guarantee(&self) -> PrecisionGuarantee
Numerical guarantees.