Crate strength_reduce

Expand description

strength_reduce implements integer division and modulo via “arithmetic strength reduction”

This results in much better performance when computing repeated divisions or modulos.

Example:

use strength_reduce::StrengthReducedU64;
 
let mut my_array: Vec<u64> = (0..500).collect();
let divisor = 3;
let modulo = 14;

// slow naive division and modulo
for element in &mut my_array {
    *element = (*element / divisor) % modulo;
}

// fast strength-reduced division and modulo
let reduced_divisor = StrengthReducedU64::new(divisor);
let reduced_modulo = StrengthReducedU64::new(modulo);
for element in &mut my_array {
    *element = (*element / reduced_divisor) % reduced_modulo;
}

The intended use case for StrengthReducedU## is for use in hot loops like the one in the example above: A division is repeated hundreds of times in a loop, but the divisor remains unchanged. In these cases, strength-reduced division and modulo are 5x-10x faster than naive division and modulo.

Benchmarking suggests that for u8, u16, and u32, on a x64 windows PC, using StrengthReducedU## is always faster than naive division or modulo, even when not used inside a loop. For u64, it’s slower if it’s only used a few times, due to nontrivial setup costs, with a break-even point around 10-20.

For divisors that are known at compile-time, the compiler is already capable of performing arithmetic strength reduction. But if the divisor is only known at runtime, the compiler cannot optimize away the division. strength_reduce is designed for situations where the divisor is not known until runtime.

strength_reduce is #![no_std]

The optimizations that this library provides are inherently dependent on architecture, compiler, and platform, so test before you use.