This crate provides several implementations of the reduce algorithm that can run on multiple GPU backends using CubeCL.
A reduction is a tensor operation mapping a rank R tensor to a rank R - 1 tensor by agglomerating all elements along a given axis with some binary operator. This is often also called folding.
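To make the definition concrete, here is a minimal CPU-only sketch in plain Rust of reducing a rank 2 tensor to a rank 1 tensor along one axis. It does not use CubeCL or this crate's API; the function name reduce_axis, its parameters, and the row-major layout are assumptions made for illustration only.

```rust
/// Hypothetical helper (not part of this crate): reduce a row-major rank 2
/// tensor of shape [rows, cols] along `axis`, folding the elements with the
/// binary operator `op`, starting from the value `init`.
pub fn reduce_axis<F>(
    data: &[f32],
    rows: usize,
    cols: usize,
    axis: usize,
    init: f32,
    op: F,
) -> Vec<f32>
where
    F: Fn(f32, f32) -> f32,
{
    assert_eq!(data.len(), rows * cols, "data must hold rows * cols elements");
    assert!(axis < 2, "a rank 2 tensor only has axes 0 and 1");

    // The output keeps every axis except the reduced one, so its rank is 1.
    let out_len = if axis == 0 { cols } else { rows };
    let mut out = vec![init; out_len];

    for r in 0..rows {
        for c in 0..cols {
            // Index of the output element this input element is folded into.
            let o = if axis == 0 { c } else { r };
            out[o] = op(out[o], data[r * cols + c]);
        }
    }
    out
}
```

For the 2 x 3 tensor [[1, 2, 3], [4, 5, 6]], summing along axis 1 folds each row into one value, and taking the maximum along axis 0 folds each column into one value. A GPU implementation would distribute this work across threads rather than loop sequentially.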
The main entrypoint of this crate is the reduce function, which automatically performs a reduction for a given instruction implementing the ReduceInstruction trait and a given Strategy.
It also provides implementations of the ReduceInstruction trait for common operations in the instructions module.
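The general idea of a reduction that is generic over an instruction can be sketched in plain Rust as follows. Everything here is hypothetical: ToyInstruction, ToySum, ToyProd, ToyMax, and toy_reduce are illustrative stand-ins that run on the CPU, and they do not reflect the actual signatures of ReduceInstruction or reduce in this crate.

```rust
/// Hypothetical stand-in for an instruction trait. This is NOT the crate's
/// ReduceInstruction; it only illustrates the pattern.
pub trait ToyInstruction {
    /// Starting value of the accumulator (the identity of the operator).
    fn init() -> f32;
    /// Fold one element into the accumulator.
    fn combine(acc: f32, item: f32) -> f32;
}

/// Hypothetical instruction: sum of all elements.
pub struct ToySum;
/// Hypothetical instruction: product of all elements.
pub struct ToyProd;
/// Hypothetical instruction: maximum of all elements.
pub struct ToyMax;

impl ToyInstruction for ToySum {
    fn init() -> f32 { 0.0 }
    fn combine(acc: f32, item: f32) -> f32 { acc + item }
}

impl ToyInstruction for ToyProd {
    fn init() -> f32 { 1.0 }
    fn combine(acc: f32, item: f32) -> f32 { acc * item }
}

impl ToyInstruction for ToyMax {
    fn init() -> f32 { f32::NEG_INFINITY }
    fn combine(acc: f32, item: f32) -> f32 { acc.max(item) }
}

/// Hypothetical generic reduction: the instruction is chosen through the
/// type parameter `I`, so one function serves every operation.
pub fn toy_reduce<I: ToyInstruction>(input: &[f32]) -> f32 {
    input.iter().fold(I::init(), |acc, &x| I::combine(acc, x))
}
```

With this shape, toy_reduce::<ToySum>(&[1.0, 2.0, 3.0, 4.0]) folds the slice with addition, and swapping the type parameter for ToyProd or ToyMax changes the operation without touching the reduction loop.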
Finally, it provides many reusable primitives to perform different general reduction algorithms in the primitives module.
Re-exports§
pub use instructions::Reduce;
pub use instructions::ReduceInstruction;
Modules§
Structs§
Enums§
Functions§
- reduce
- Reduce the given axis of the input tensor using the instruction Inst and write the result into output.