ferray-ufunc
SIMD-accelerated universal functions for the ferray scientific computing library.
What's in this crate
- 40+ elementwise operations: sin, cos, exp, log, sqrt, abs, floor, ceil, etc.
- Binary operations: add, sub, mul, div, pow with broadcasting
- CORE-MATH correctly-rounded transcendentals (< 0.5 ULP from mathematical truth)
- exp_fast(): Even/Odd Remez decomposition, ~30% faster than CORE-MATH at ≤1 ULP accuracy
- Portable SIMD via pulp (SSE2/AVX2/AVX-512/NEON) on stable Rust
- Scalar fallback with FERRAY_FORCE_SCALAR=1 environment variable
- SIMD paths for f32, f64, i32, i64 on all contiguous inner loops
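As a rough illustration of how an environment-variable scalar fallback can be wired up, the sketch below chooses between a plain loop and a chunked loop at call time. It is not ferray's dispatch code: the helper names (force_scalar_from, force_scalar, add_scalar, add_chunked, add) are hypothetical, treating only the value "1" as enabled is an assumption, and the chunked loop is a stand-in for a real SIMD kernel.

```rust
use std::env;

/// Hypothetical helper: decides whether the scalar path is forced.
/// Treating only the exact value "1" as "on" is an assumption of this sketch.
fn force_scalar_from(value: Option<&str>) -> bool {
    matches!(value, Some("1"))
}

/// Reads FERRAY_FORCE_SCALAR from the process environment.
fn force_scalar() -> bool {
    force_scalar_from(env::var("FERRAY_FORCE_SCALAR").ok().as_deref())
}

/// Reference path: one element at a time.
fn add_scalar(a: &[f64], b: &[f64], out: &mut [f64]) {
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = x + y;
    }
}

/// Stand-in for a SIMD kernel: fixed-width chunks of 4 that the compiler
/// may auto-vectorize, followed by a scalar tail for the remainder.
fn add_chunked(a: &[f64], b: &[f64], out: &mut [f64]) {
    let n = out.len();
    let main = n - n % 4;
    for ((o, x), y) in out[..main]
        .chunks_exact_mut(4)
        .zip(a[..main].chunks_exact(4))
        .zip(b[..main].chunks_exact(4))
    {
        for j in 0..4 {
            o[j] = x[j] + y[j];
        }
    }
    for i in main..n {
        out[i] = a[i] + b[i];
    }
}

/// Elementwise add over equal-length slices, honoring the fallback switch.
fn add(a: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let mut out = vec![0.0; a.len()];
    if force_scalar() {
        add_scalar(a, b, &mut out);
    } else {
        add_chunked(a, b, &mut out);
    }
    out
}
```

Both paths produce the same values, so a switch of this kind is mainly useful for debugging and for comparing SIMD results against a scalar reference.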
Performance
By default, the crate uses CORE-MATH for correctly rounded transcendentals (≤0.5 ULP). For throughput-sensitive workloads, exp_fast() returns faithfully rounded results (≤1 ULP) with roughly 30% higher throughput, using a table-free Even/Odd Remez decomposition that auto-vectorizes cleanly.
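To make the even/odd idea concrete, here is a minimal sketch of a table-free exponential that splits exp(r) into an even part and an odd part, each a polynomial in r². It is not the crate's exp_fast(): the function names exp_even_odd and horner are hypothetical, the coefficients are plain Taylor coefficients rather than Remez-optimized ones, and the simple single-constant range reduction does not guarantee ≤1 ULP.

```rust
use std::f64::consts::LN_2;

/// Evaluates c[0] + c[1]*t + c[2]*t^2 + ... with Horner's scheme.
fn horner(c: &[f64], t: f64) -> f64 {
    c.iter().rev().fold(0.0, |acc, &ci| acc * t + ci)
}

/// Hypothetical sketch of an even/odd exponential (Taylor, not Remez).
/// Intended for moderate inputs; overflow and NaN handling are omitted.
fn exp_even_odd(x: f64) -> f64 {
    // Even part: cosh(r) = sum of r^(2n) / (2n)!
    const EVEN: [f64; 7] = [
        1.0,
        1.0 / 2.0,
        1.0 / 24.0,
        1.0 / 720.0,
        1.0 / 40320.0,
        1.0 / 3628800.0,
        1.0 / 479001600.0,
    ];
    // Odd part divided by r: sinh(r) / r = sum of r^(2n) / (2n + 1)!
    const ODD: [f64; 7] = [
        1.0,
        1.0 / 6.0,
        1.0 / 120.0,
        1.0 / 5040.0,
        1.0 / 362880.0,
        1.0 / 39916800.0,
        1.0 / 6227020800.0,
    ];

    // Range reduction: x = k * ln2 + r, with |r| <= ln2 / 2.
    let k = (x / LN_2).round();
    let r = x - k * LN_2;
    let r2 = r * r;

    // exp(r) = cosh(r) + sinh(r) = EVEN(r^2) + r * ODD(r^2).
    // The two polynomials are independent, so they can be evaluated in parallel.
    let even = horner(&EVEN, r2);
    let odd = horner(&ODD, r2);

    // Reconstruct: exp(x) = 2^k * exp(r).
    (even + r * odd) * 2.0_f64.powi(k as i32)
}
```

Because both halves are short, branch-free polynomials in the same variable with no table lookups, a loop over a slice that calls a function of this shape is a good candidate for auto-vectorization.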
Usage
use ;
use *;
let a = linspace?;
let b = sin?;
// Correctly rounded (≤0.5 ULP)
let c = exp?;
// Fast mode (≤1 ULP, ~30% faster)
let c_fast = exp_fast?;
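The ULP figures quoted above can be spot-checked by counting how many representable f64 values lie between two results. The helpers below (ordered_bits and ulp_distance, both hypothetical names that are not part of ferray) are a minimal sketch of that measurement. They assume finite inputs and compare two f64 values with each other, so measuring error against the exact mathematical result would additionally need a higher-precision reference.

```rust
/// Maps an f64 to an integer whose ordering matches the float ordering,
/// so that adjacent floats map to adjacent integers.
fn ordered_bits(x: f64) -> i64 {
    let bits = x.to_bits() as i64;
    if bits < 0 {
        // Negative floats: flip so larger magnitude means smaller integer.
        i64::MIN - bits
    } else {
        bits
    }
}

/// Number of representable f64 steps between `a` and `b` (finite inputs only).
fn ulp_distance(a: f64, b: f64) -> u64 {
    ordered_bits(a).abs_diff(ordered_bits(b))
}
```

For example, ulp_distance(1.0, 1.0 + f64::EPSILON) is 1, and 0.0 and -0.0 are treated as the same point.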
This crate is re-exported through the main ferray crate.
License
MIT OR Apache-2.0