SIMD-optimized element-wise operations
This module provides SIMD-accelerated implementations of common element-wise tensor operations, built on SciRS2’s SIMD capabilities.
§Performance Features
- Vectorized operations using AVX2/AVX-512 when available
- Aligned memory access for optimal performance
- Cache-friendly memory access patterns
- Automatic fallback to scalar code for small tensors, where SIMD setup overhead would outweigh the gain
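The size-based fallback listed above can be illustrated with a minimal sketch that uses only the Rust standard library. It is not this module's implementation and does not call SciRS2: the names `SIMD_THRESHOLD`, `LANES`, `add_scalar`, `add_chunked`, and `add` are hypothetical, and the cutoff of 64 elements is an assumed value chosen for illustration. The chunked path relies on the compiler auto-vectorizing a fixed-width inner loop rather than on explicit intrinsics.

```rust
/// Hypothetical cutoff below which the plain scalar loop is used.
const SIMD_THRESHOLD: usize = 64;
/// Hypothetical lane count: eight f32 values fill one 256-bit AVX2 register.
const LANES: usize = 8;

/// Plain scalar loop, used for small inputs and leftover tail elements.
fn add_scalar(a: &[f32], b: &[f32], out: &mut [f32]) {
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = x + y;
    }
}

/// Processes fixed-width chunks so the compiler can vectorize the inner loop.
fn add_chunked(a: &[f32], b: &[f32], out: &mut [f32]) {
    // Split each slice into a body whose length is a multiple of LANES
    // and a short tail that is handled by the scalar loop.
    let split = a.len() - a.len() % LANES;
    let (a_body, a_tail) = a.split_at(split);
    let (b_body, b_tail) = b.split_at(split);
    let (out_body, out_tail) = out.split_at_mut(split);

    for ((o, x), y) in out_body
        .chunks_exact_mut(LANES)
        .zip(a_body.chunks_exact(LANES))
        .zip(b_body.chunks_exact(LANES))
    {
        for i in 0..LANES {
            o[i] = x[i] + y[i];
        }
    }
    add_scalar(a_tail, b_tail, out_tail);
}

/// Element-wise addition that picks a path based on input length.
pub fn add(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let mut out = vec![0.0_f32; a.len()];
    if a.len() < SIMD_THRESHOLD {
        add_scalar(a, b, &mut out);
    } else {
        add_chunked(a, b, &mut out);
    }
    out
}
```

Calling `add(&a, &b)` on two equal-length slices returns a new `Vec<f32>`; inputs shorter than the assumed threshold never enter the chunked path. A real implementation would likely choose the cutoff by benchmarking and select the instruction set at runtime.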
§Usage
These functions are used internally by CpuExecutor to accelerate element-wise operations for large tensors.