ferray-ufunc 0.2.1

Universal functions and SIMD-accelerated elementwise operations for ferray
Documentation

ferray-ufunc

SIMD-accelerated universal functions for the ferray scientific computing library.

What's in this crate

  • 40+ elementwise operations: sin, cos, exp, log, sqrt, abs, floor, ceil, etc.
  • Binary operations: add, sub, mul, div, pow with broadcasting
  • CORE-MATH correctly-rounded transcendentals (< 0.5 ULP from mathematical truth)
  • exp_fast() — Even/Odd Remez decomposition, ~30% faster than CORE-MATH at ≤1 ULP accuracy
  • Portable SIMD via pulp (SSE2/AVX2/AVX-512/NEON) on stable Rust
  • Scalar fallback with FERRAY_FORCE_SCALAR=1 environment variable
  • SIMD paths for f32, f64, i32, i64 on all contiguous inner loops

Performance

Uses CORE-MATH for correctly-rounded transcendentals by default (≤0.5 ULP). For throughput-sensitive workloads, exp_fast() provides faithfully-rounded results (≤1 ULP) with ~30% better throughput via a table-free Even/Odd Remez decomposition that auto-vectorizes cleanly.

Usage

use ferray_ufunc::{sin, exp, exp_fast, add};
use ferray_core::prelude::*;

let a = Array1::<f64>::linspace(0.0, 6.28, 1000)?;
let b = sin(&a)?;

// Correctly rounded (≤0.5 ULP)
let c = exp(&b)?;

// Fast mode (≤1 ULP, ~30% faster)
let c_fast = exp_fast(&b)?;

This crate is re-exported through the main ferray crate.

License

MIT OR Apache-2.0