Module reduce

etensor_core::backends::cpu

High-performance CPU reduction kernels.

Reductions are bandwidth-bound: each one reads N elements and writes a single scalar. A single-threaded loop with LLVM auto-vectorization already saturates the memory bus, so Rayon is not used.
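As a rough illustration of a single-threaded loop that lends itself to auto-vectorization, the sketch below sums a slice with several independent accumulators. Under strict IEEE semantics floating-point addition is not associative, so a single-accumulator float loop is typically left scalar; splitting the sum into lanes gives the compiler freedom to use SIMD. The function name `sum_lanes`, the `f32` element type, and the lane count of 8 are assumptions made for this example and are not taken from the crate.

```rust
/// Hypothetical helper (not part of this module's API): sums a slice with
/// eight independent accumulators that the compiler may map onto SIMD lanes.
pub fn sum_lanes(data: &[f32]) -> f32 {
    const LANES: usize = 8; // assumed lane count, chosen for illustration

    let mut acc = [0.0f32; LANES];
    let chunks = data.chunks_exact(LANES);
    // Elements left over when the length is not a multiple of LANES.
    let tail = chunks.remainder();

    // Main loop: each accumulator only ever sees its own lane, so the
    // additions in one iteration do not depend on each other.
    for chunk in chunks {
        for i in 0..LANES {
            acc[i] += chunk[i];
        }
    }

    // Combine the lanes, then fold in the leftover elements.
    let mut total: f32 = acc.iter().sum();
    for &x in tail {
        total += x;
    }
    total
}
```

Because the additions are reordered, the result can differ in the last bits from a strictly left-to-right sum. Whether the loop is actually vectorized depends on the compiler version, optimization level, and target features.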

Functions

max_all: Executes a global max reduction, finding the largest single value in the tensor.
mean_all: Executes a global mean reduction, calculating the average of all elements.
sum_all: Executes a global sum reduction, collapsing the entire tensor into a single scalar value.
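The semantics of the three reductions can be sketched on a plain slice as follows. This is a minimal illustration only: the real functions operate on the crate's tensor type, whose signatures are not shown on this page, so the `_sketch` names, the `&[f32]` input, and the `Option` return for empty input are all assumptions.

```rust
/// Hypothetical stand-in for a global sum: collapses all elements into one scalar.
pub fn sum_all_sketch(data: &[f32]) -> f32 {
    data.iter().sum()
}

/// Hypothetical stand-in for a global mean: sum divided by element count.
/// Returns `None` for empty input (an assumption; the crate may behave differently).
pub fn mean_all_sketch(data: &[f32]) -> Option<f32> {
    if data.is_empty() {
        None
    } else {
        Some(sum_all_sketch(data) / data.len() as f32)
    }
}

/// Hypothetical stand-in for a global max: the largest single value.
/// `f32::max` returns the non-NaN operand when exactly one operand is NaN.
pub fn max_all_sketch(data: &[f32]) -> Option<f32> {
    data.iter().copied().reduce(f32::max)
}
```

For the input `[1.0, 4.0, 2.5, 0.5]`, these return `8.0`, `Some(2.0)`, and `Some(4.0)` respectively.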