simd_vector
SIMD vector types for x86-64 in pure stable Rust.
- Vec4: 4×f32, SSE4.1 + FMA3
- Vec8: 8×f32, AVX2 + FMA3
Features
- Arithmetic: +, -, *, / (vec×vec, vec×f32, f32×vec)
- splat, abs, neg, sqrt, floor, dot, mul_add (FMA)
- Sum, Index, From/Into array, Clone, Copy, Debug, PartialEq
- Transcendentals: sin, cos, exp, ported from SLEEF
Transcendental accuracy
sin and cos use SLEEF's xsinf_u1 / xcosf_u1 algorithms (ULP < 1 variants). exp uses SLEEF's xexpf.
| Function | Algorithm | ULP error | Range |
|---|---|---|---|
| sin | Cody-Waite + Payne-Hanek + double-float polynomial | ≤ 1.0 | all finite f32 |
| cos | Cody-Waite + Payne-Hanek + double-float polynomial | ≤ 1.0 | all finite f32 |
| exp | ln(2) range reduction + degree-6 polynomial + ldexp | ≤ 1.0 | all finite f32 |
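
The three steps listed for exp (ln(2) range reduction, polynomial, ldexp) can be illustrated with a scalar sketch. This is not the crate's code: `exp_sketch` is a hypothetical helper, the coefficients are plain Taylor terms rather than SLEEF's tuned ones, and overflow, underflow, NaN, and Inf are ignored, so it should not be expected to meet the ≤ 1.0 ULP bound in the table.

```rust
/// Scalar illustration of exp(x) = 2^k * exp(r), where r = x - k*ln(2).
/// Hypothetical helper, not part of the crate; only valid for moderate |x|
/// (k must stay within the normal exponent range -126..=127).
fn exp_sketch(x: f32) -> f32 {
    // ln(2) split into a high part with trailing zero bits and a small low
    // part, so that k * LN2_HI is exact for small integer k.
    const LN2_HI: f32 = 0.693_145_751_953_125;
    const LN2_LO: f32 = 1.428_606_8e-6;

    // Step 1: range reduction. k is the nearest integer to x / ln(2),
    // and r = x - k*ln(2) ends up in roughly [-0.35, 0.35].
    let k = (x * std::f32::consts::LOG2_E).round();
    let r = k.mul_add(-LN2_HI, x);
    let r = k.mul_add(-LN2_LO, r);

    // Step 2: degree-6 Taylor polynomial for exp(r), Horner form with FMA.
    let mut p = 1.0 / 720.0_f32;
    p = p.mul_add(r, 1.0 / 120.0);
    p = p.mul_add(r, 1.0 / 24.0);
    p = p.mul_add(r, 1.0 / 6.0);
    p = p.mul_add(r, 0.5);
    p = p.mul_add(r, 1.0);
    p = p.mul_add(r, 1.0);

    // Step 3: ldexp. Build 2^k directly from its biased exponent bits.
    let scale = f32::from_bits(((k as i32 + 127) as u32) << 23);
    p * scale
}
```

Splitting ln(2) into two constants keeps the subtraction in step 1 nearly exact, which is the same idea as the Cody-Waite reduction used for sin and cos.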
Key implementation details:
- Range reduction for sin/cos: Cody-Waite (3-constant) for |x| < 125, Payne-Hanek table-based (rempif) for larger arguments
- Double-float arithmetic: FMA-based error-free transformations for high precision in the polynomial evaluation
- Edge cases: NaN/Inf propagation, sin(-0) = -0, exp(-inf) = 0, exp(inf) = inf
- Polynomial coefficients, constants (PI_A2f, PI_B2f, PI_C2f, L2Uf, L2Lf, etc.), and the 416-entry Sleef_rempitabsp table are taken directly from SLEEF
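
To make the "FMA-based error-free transformations" item concrete, here is a minimal scalar sketch of the two standard building blocks of double-float arithmetic. The names `two_prod` and `two_sum` are illustrative and are not taken from the crate or from SLEEF; the crate applies the same idea per lane on SIMD registers.

```rust
/// Error-free product: returns (hi, lo) such that hi + lo equals a * b
/// exactly, barring overflow and underflow. The FMA computes a*b - hi
/// with a single rounding, which recovers the rounding error of `hi`.
fn two_prod(a: f32, b: f32) -> (f32, f32) {
    let hi = a * b;
    let lo = a.mul_add(b, -hi);
    (hi, lo)
}

/// Error-free sum (Knuth's TwoSum): returns (s, err) such that
/// s + err equals a + b exactly, barring overflow.
fn two_sum(a: f32, b: f32) -> (f32, f32) {
    let s = a + b;
    let bb = s - a;
    let err = (a - (s - bb)) + (b - bb);
    (s, err)
}
```

A double-float value is such a (hi, lo) pair carried through the polynomial evaluation, so the low word keeps the bits that a single f32 would round away.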
Required CPU features
SSE4.1, AVX2, FMA3. Enabled globally via .cargo/config.toml:
[build]
rustflags = ["-C", "target-feature=+sse4.1,+avx2,+fma"]
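
Because the features are enabled for the whole build, the resulting binary assumes the CPU supports them. A program that uses the crate could check this at startup to fail with a clear message rather than an illegal-instruction fault. The sketch below is an assumption about how a caller might do that; both function names are hypothetical and nothing here is part of the crate.

```rust
/// True if this binary was compiled with all three target features enabled
/// (evaluated at compile time from the active rustc flags).
fn compiled_with_required_features() -> bool {
    cfg!(all(
        target_arch = "x86_64",
        target_feature = "sse4.1",
        target_feature = "avx2",
        target_feature = "fma"
    ))
}

/// True if the CPU running this binary reports all three features.
#[cfg(target_arch = "x86_64")]
fn cpu_has_required_features() -> bool {
    is_x86_feature_detected!("sse4.1")
        && is_x86_feature_detected!("avx2")
        && is_x86_feature_detected!("fma")
}

/// On other architectures the features can never be present.
#[cfg(not(target_arch = "x86_64"))]
fn cpu_has_required_features() -> bool {
    false
}
```

Calling `cpu_has_required_features()` early in `main` and exiting with an error when it returns false is one way to use this.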
Usage
use simd_vector::{Vec4, Vec8};
let a = Vec4::from([1.0, 2.0, 3.0, 4.0]);
let b = Vec4::from([5.0, 6.0, 7.0, 8.0]);
let c = a + b; // [6.0, 8.0, 10.0, 12.0]
let d = a.dot(b); // 70.0
let s = a.sin(); // per-lane sin
let e = Vec8::splat(1.0).exp(); // [e, e, e, e, e, e, e, e]
Tests
189 tests covering all operations, edge cases (NaN, Inf, -0.0, subnormals), and ULP sweep verification over the full range.
cargo test
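
For readers unfamiliar with ULP sweeps, the idea can be sketched as follows: map each f32 to an integer so that adjacent floats differ by one, then compare a function's output against an f64 reference rounded to f32. The helper names are hypothetical and this is not the crate's test code. It also measures whole-ULP distance from the rounded reference, which is only a proxy for the fractional ULP error quoted in the accuracy table, and it is only meaningful for finite, non-NaN values.

```rust
/// Maps an f32 onto a signed integer line where adjacent floats differ by 1
/// and +0.0 and -0.0 both map to 0.
fn ordered_key(x: f32) -> i64 {
    let bits = x.to_bits();
    let magnitude = (bits & 0x7fff_ffff) as i64;
    if bits & 0x8000_0000 != 0 { -magnitude } else { magnitude }
}

/// Number of representable f32 steps between a and b.
fn ulp_distance(a: f32, b: f32) -> u64 {
    (ordered_key(a) - ordered_key(b)).unsigned_abs()
}

/// Worst ULP distance between `f` and an f64 `reference` rounded to f32,
/// sampled at `steps + 1` evenly spaced points in [lo, hi].
fn max_ulp_error(
    f: impl Fn(f32) -> f32,
    reference: impl Fn(f64) -> f64,
    lo: f32,
    hi: f32,
    steps: u32,
) -> u64 {
    let mut worst: u64 = 0;
    for i in 0..=steps {
        let t = i as f32 / steps as f32;
        let x = lo + (hi - lo) * t;
        let expected = reference(x as f64) as f32;
        worst = worst.max(ulp_distance(f(x), expected));
    }
    worst
}
```

A sweep for a lane-wise sin would pass the vector function (applied to one lane) as `f` and `f64::sin` as `reference`, then assert on the returned maximum.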
Coverage (via cargo llvm-cov):
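| Filename | Regions | Missed regions | Region cover | Functions | Missed functions | Function cover | Lines | Missed lines | Line cover |
|---|---|---|---|---|---|---|---|---|---|
| lib.rs | 142 | 0 | 100.00% | 6 | 0 | 100.00% | 61 | 0 | 100.00% |
| vec4.rs | 892 | 0 | 100.00% | 48 | 0 | 100.00% | 437 | 0 | 100.00% |
| vec8.rs | 915 | 0 | 100.00% | 48 | 0 | 100.00% | 441 | 0 | 100.00% |
| TOTAL | 1949 | 0 | 100.00% | 102 | 0 | 100.00% | 939 | 0 | 100.00% |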
License
MIT