Crate cubek_reduce

This crate provides several implementations of the reduce algorithm that can run on multiple GPU backends using CubeCL.

A reduction is a tensor operation that maps a rank R tensor to a rank R - 1 tensor by aggregating all elements along a given axis with a binary operator. This is often also called folding.
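To make the definition concrete, here is a minimal CPU-only sketch of reducing a rank 2 tensor to a rank 1 tensor. It uses only the Rust standard library and none of this crate's API; the function name `reduce_axis` and its parameters are hypothetical and chosen for illustration.

```rust
/// Reduce a row-major rank 2 tensor along `axis` (0 or 1) using the binary
/// operator `op`, starting each fold from `init`.
/// Hypothetical helper for illustration; not part of cubek_reduce.
fn reduce_axis<T: Copy>(
    data: &[T],
    rows: usize,
    cols: usize,
    axis: usize,
    init: T,
    op: impl Fn(T, T) -> T,
) -> Vec<T> {
    assert_eq!(data.len(), rows * cols, "shape does not match data length");
    assert!(axis < 2, "a rank 2 tensor only has axes 0 and 1");

    // The output has one element per index of the axis that is kept.
    let (kept, reduced) = if axis == 0 { (cols, rows) } else { (rows, cols) };

    (0..kept)
        .map(|k| {
            // Fold every element along the reduced axis into one value.
            (0..reduced).fold(init, |acc, r| {
                let (i, j) = if axis == 0 { (r, k) } else { (k, r) };
                op(acc, data[i * cols + j])
            })
        })
        .collect()
}

fn main() {
    // 2 x 3 tensor: [[1, 2, 3], [4, 5, 6]]
    let t = [1, 2, 3, 4, 5, 6];

    // Sum along axis 0 (collapse the rows): [5, 7, 9]
    let col_sums = reduce_axis(&t, 2, 3, 0, 0, |a, b| a + b);
    // Max along axis 1 (collapse the columns): [3, 6]
    let row_max = reduce_axis(&t, 2, 3, 1, i32::MIN, |a, b| a.max(b));

    println!("{:?} {:?}", col_sums, row_max);
}
```

The GPU implementations in this crate compute the same kind of result, but how they distribute the work is determined by the chosen strategy rather than by a sequential fold like the one above.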

The main entrypoint is the reduce function, which performs a reduction for a given instruction implementing the ReduceInstruction trait and a given ReduceStrategy. The crate also provides implementations of the ReduceInstruction trait for common operations in the instructions module. Finally, it provides many reusable primitives for building general reduction algorithms in the primitives module.
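The separation between what is reduced (the instruction) and how it is reduced (the strategy) can be sketched on the CPU as follows. Everything in this block is hypothetical: the `Instruction` trait, the `Strategy` enum, and `reduce_all` are illustrative stand-ins and do not reflect the actual signatures of ReduceInstruction, ReduceStrategy, or reduce.

```rust
/// Hypothetical stand-in for an instruction: a binary operator plus its
/// identity element. Not the crate's ReduceInstruction trait.
trait Instruction {
    type Item: Copy;
    fn identity() -> Self::Item;
    fn combine(a: Self::Item, b: Self::Item) -> Self::Item;
}

struct Sum;
impl Instruction for Sum {
    type Item = f32;
    fn identity() -> f32 { 0.0 }
    fn combine(a: f32, b: f32) -> f32 { a + b }
}

struct Max;
impl Instruction for Max {
    type Item = f32;
    fn identity() -> f32 { f32::NEG_INFINITY }
    fn combine(a: f32, b: f32) -> f32 { a.max(b) }
}

/// Hypothetical stand-in for a strategy: how the work is split up.
/// Not the crate's ReduceStrategy type.
#[derive(Clone, Copy)]
enum Strategy {
    /// Fold all elements one after the other.
    Sequential,
    /// Fold fixed-size chunks independently, then combine the partial results.
    Chunked { chunk: usize },
}

/// Reduce a whole slice with instruction `I` according to `strategy`.
fn reduce_all<I: Instruction>(data: &[I::Item], strategy: Strategy) -> I::Item {
    match strategy {
        Strategy::Sequential => data
            .iter()
            .fold(I::identity(), |acc, &x| I::combine(acc, x)),
        Strategy::Chunked { chunk } => data
            .chunks(chunk.max(1)) // guard against a chunk size of zero
            .map(|c| c.iter().fold(I::identity(), |acc, &x| I::combine(acc, x)))
            .fold(I::identity(), I::combine),
    }
}

fn main() {
    let data = [1.0_f32, 2.0, 3.0, 4.0];
    let total = reduce_all::<Sum>(&data, Strategy::Sequential);
    let largest = reduce_all::<Max>(&data, Strategy::Chunked { chunk: 3 });
    println!("{total} {largest}");
}
```

Because the instruction only supplies the operator and its identity, the same instruction can be paired with any strategy. For the operators shown here, both strategies give the same result on this input.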

Re-exports§

pub use crate::launch::ReduceStrategy;
pub use components::args::init_tensors;
pub use components::instructions::ReduceFamily;
pub use components::instructions::ReduceInstruction;
pub use components::precision::ReducePrecision;
pub use launch::ReduceDtypes;
pub use launch::reduce_kernel;
pub use routines::shared_sum::shared_sum;
pub use components::config::*;

Modules§

components
launch
routines

Enums§

ReduceError: This error should be caught and properly handled.

Functions§

reduce: Reduce the given axis of the input tensor using the instruction Inst and write the result into output.