§simdly
🚀 A high-performance Rust library that leverages SIMD (Single Instruction, Multiple Data) instructions for fast vectorized computations. It provides efficient implementations of mathematical operations using modern CPU features.
§Features
- SIMD Optimized: Leverages AVX2 (256-bit) and NEON (128-bit) instructions for vector operations
- Memory Efficient: Supports both aligned and unaligned memory access patterns
- Generic Traits: Provides consistent interfaces across different SIMD implementations
- Safe Abstractions: Wraps unsafe SIMD operations in safe, ergonomic APIs
- Cross-Platform: Supports both x86/x86_64 and ARM/AArch64 architectures
- Performance: Optimized for high-throughput numerical computations
§Architecture Support
Currently supports:
- x86/x86_64 with AVX2 (256-bit vectors)
- ARM/AArch64 with NEON (128-bit vectors)
Future support planned for:
- SSE (128-bit vectors for older x86 processors)
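Whether these instruction sets are available on a given machine can be checked at runtime with the standard library's feature-detection macros. The sketch below is illustrative only and independent of simdly's own detection logic:
// Illustrative runtime checks using std; simdly selects the instruction set for you.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx2() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}
#[cfg(target_arch = "aarch64")]
fn has_neon() -> bool {
    // NEON is mandatory on AArch64, so this is effectively always true.
    std::arch::is_aarch64_feature_detected!("neon")
}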
§Usage
The library provides traits for SIMD operations that automatically detect and use the best available instruction set on the target CPU.
§High-Level SIMD Usage
use simdly::simd::SimdMath;
// Vectorized mathematical operations - works on both AVX2 and NEON
let angles = vec![0.0, std::f32::consts::PI / 4.0, std::f32::consts::PI / 2.0];
let cosines = angles.cos(); // SIMD accelerated
// 2D distance calculations
let x_coords = vec![3.0, 5.0, 8.0, 7.0];
let y_coords = vec![4.0, 12.0, 15.0, 24.0];
let distances = x_coords.hypot(y_coords); // [5.0, 13.0, 17.0, 25.0]
// Power calculations
let bases = vec![2.0, 3.0, 4.0, 5.0];
let exponents = vec![2.0, 2.0, 2.0, 2.0];
let powers = bases.pow(exponents); // [4.0, 9.0, 16.0, 25.0]
§Parallel SIMD Operations
For maximum performance on large datasets, use the parallel SIMD methods that automatically select between single-threaded and multi-threaded implementations based on array size:
use simdly::simd::SimdMath;
// Large dataset - automatically uses parallel SIMD
let large_data: Vec<f32> = (0..1_000_000).map(|i| i as f32 * 0.001).collect();
let results = large_data.par_cos(); // Multi-threaded SIMD
// Small dataset - automatically uses regular SIMD
let small_data = vec![1.0, 2.0, 3.0, 4.0];
let results = small_data.par_sin(); // Single-threaded SIMD
// Works with all math functions
let sqrt_results = large_data.par_sqrt();
let exp_results = large_data.par_exp();
let abs_results = large_data.par_abs();
§Performance Considerations
- Memory Alignment: Use aligned memory when possible for optimal performance
- Batch Processing: Process data in chunks that match SIMD vector sizes
- CPU Features: Enable appropriate target features during compilation
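For the last point, target features are typically enabled through rustc flags in Cargo's build configuration. A minimal sketch using standard Cargo/rustc settings, not specific to simdly:
# .cargo/config.toml
# target-cpu=native enables every SIMD feature of the build machine (e.g. AVX2).
# Alternatively, enable a single feature with: rustflags = ["-C", "target-feature=+avx2"]
[build]
rustflags = ["-C", "target-cpu=native"]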
Modules§
- simd
- SIMD (Single Instruction, Multiple Data) operations and platform-specific implementations.
Constants§
- PARALLEL_SIMD_THRESHOLD - Minimum array size where parallel SIMD operations become beneficial.
- SIMD_THRESHOLD - Threshold below which scalar operations outperform SIMD.
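These thresholds back the automatic strategy selection described above. As a rough sketch of how they could be consulted manually (assuming both constants are usize values exported at the crate root; the par_* methods already apply such checks internally):
use simdly::simd::SimdMath;
use simdly::PARALLEL_SIMD_THRESHOLD; // assumed crate-root export, per the Constants list above
// Hypothetical manual dispatch; normally par_cos() chooses the strategy automatically.
let data: Vec<f32> = (0..100_000).map(|i| i as f32 * 0.01).collect();
let results = if data.len() >= PARALLEL_SIMD_THRESHOLD {
    data.par_cos() // multi-threaded SIMD for large inputs
} else {
    data.cos() // single-threaded SIMD for smaller inputs
};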