Crate cubek_reduce

This crate provides different implementations of the reduce algorithm, which can run on multiple GPU backends using CubeCL.

A reduction is a tensor operation that maps a rank R tensor to a rank R - 1 tensor by combining all elements along a given axis with some binary operator. This is often also called folding.
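
To make the definition concrete, here is a minimal CPU sketch in plain Rust. It is not this crate's GPU implementation and uses none of its API: the helper `reduce_axis`, its parameters, and the row-major layout are assumptions made purely for illustration.

```rust
/// Hypothetical CPU reference (not part of cubek_reduce): fold `axis` of a
/// row-major tensor with the binary operator `op`, starting from `identity`.
/// Returns the reduced data together with its rank R - 1 shape.
fn reduce_axis<T: Copy>(
    data: &[T],
    shape: &[usize],
    axis: usize,
    identity: T,
    op: impl Fn(T, T) -> T,
) -> (Vec<T>, Vec<usize>) {
    assert!(axis < shape.len(), "axis out of range");
    assert_eq!(data.len(), shape.iter().product::<usize>());

    // View the tensor as [outer, len, inner], where `len` is the reduced axis.
    let outer: usize = shape[..axis].iter().product();
    let len = shape[axis];
    let inner: usize = shape[axis + 1..].iter().product();

    let mut out = vec![identity; outer * inner];
    for o in 0..outer {
        for k in 0..len {
            for i in 0..inner {
                let acc = &mut out[o * inner + i];
                *acc = op(*acc, data[(o * len + k) * inner + i]);
            }
        }
    }

    // Dropping the reduced axis yields the output shape.
    let mut out_shape = shape.to_vec();
    out_shape.remove(axis);
    (out, out_shape)
}
```

For example, summing a 2 x 3 tensor `[1, 2, 3, 4, 5, 6]` along axis 1 with `reduce_axis(&data, &[2, 3], 1, 0, |a, b| a + b)` gives `[6, 15]` with shape `[2]`, while reducing along axis 0 gives `[5, 7, 9]` with shape `[3]`.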

The main entrypoint of this crate is the reduce function, which performs a reduction for a given instruction implementing the ReduceInstruction trait and a given ReduceStrategy. The crate also implements the ReduceInstruction trait for common operations in the [instructions] module. Finally, it provides many reusable primitives for building general reduction algorithms in the [primitives] module.

Re-exports§

pub use crate::launch::ReduceStrategy;
pub use components::args::init_tensors;
pub use components::instructions::ReduceFamily;
pub use components::instructions::ReduceInstruction;
pub use components::precision::ReducePrecision;
pub use launch::ReduceDtypes;
pub use launch::reduce_kernel;
pub use routines::shared_sum::shared_sum;
pub use components::config::*;

Modules§

components
launch
routines

Enums§

ReduceError
This error should be caught and properly handled.

Functions§

reduce
Reduce the given axis of the input tensor using the instruction Inst and write the result into output.