Crate strength_reduce
source ·Expand description
strength_reduce
implements integer division and modulo via “arithmetic strength reduction”
This results in much better performance when computing repeated divisions or modulos.
Example:
use strength_reduce::StrengthReducedU64;
let mut my_array: Vec<u64> = (0..500).collect();
let divisor = 3;
let modulo = 14;
// slow naive division and modulo
for element in &mut my_array {
*element = (*element / divisor) % modulo;
}
// fast strength-reduced division and modulo
let reduced_divisor = StrengthReducedU64::new(divisor);
let reduced_modulo = StrengthReducedU64::new(modulo);
for element in &mut my_array {
*element = (*element / reduced_divisor) % reduced_modulo;
}
The intended use case for StrengthReducedU## is for use in hot loops like the one in the example above: A division is repeated hundreds of times in a loop, but the divisor remains unchanged. In these cases, strength-reduced division and modulo are 5x-10x faster than naive division and modulo.
Benchmarking suggests that for u8, u16, and u32, on a x64 windows PC, using StrengthReducedU## is always faster than naive division or modulo, even when not used inside a loop. For u64, it’s slower if it’s only used a few times, due to nontrivial setup costs, with a break-even point around 10-20.
For divisors that are known at compile-time, the compiler is already capable of performing arithmetic strength reduction.
But if the divisor is only known at runtime, the compiler cannot optimize away the division. strength_reduce
is designed
for situations where the divisor is not known until runtime.
strength_reduce
is #![no_std]
The optimizations that this library provides are inherently dependent on architecture, compiler, and platform, so test before you use.