simdly
🚀 A high-performance Rust library that leverages SIMD (Single Instruction, Multiple Data) instructions for fast vectorized computations. This library provides efficient implementations of mathematical operations using modern CPU features.
✨ Features
- 🚀 SIMD Optimized: Leverages AVX2 instructions for 256-bit vector operations
- 💾 Memory Efficient: Supports both aligned and unaligned memory access patterns
- 🔧 Generic Traits: Provides consistent interfaces across different SIMD implementations
- 🛡️ Safe Abstractions: Wraps unsafe SIMD operations in safe, ergonomic APIs
- ⚡ Performance: Optimized for high-throughput numerical computations
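To illustrate what "safe abstractions over unsafe SIMD" can look like, here is a minimal sketch that uses only `std::arch` intrinsics. It is not simdly's implementation: the names `add8` and `add8_avx2` are hypothetical, and the sketch assumes a runtime AVX2 check with a scalar fallback.

```rust
/// AVX2 path. Hypothetical helper, not part of simdly's API.
///
/// # Safety
/// The caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add8_avx2(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    use std::arch::x86_64::*;
    let mut out = [0.0f32; 8];
    unsafe {
        // Unaligned loads: plain arrays carry no 32-byte alignment guarantee.
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
    }
    out
}

/// Safe entry point: adds eight f32 lanes, using AVX2 when the CPU has it.
pub fn add8(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            return unsafe { add8_avx2(a, b) };
        }
    }
    // Portable scalar fallback for CPUs or targets without AVX2.
    let mut out = [0.0f32; 8];
    for i in 0..8 {
        out[i] = a[i] + b[i];
    }
    out
}
```

Callers only ever see `add8`, so the `unsafe` block stays confined to one audited place.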
🏗️ Architecture Support
Currently Supported
- x86/x86_64 with AVX2 (256-bit vectors)
Planned Support
- SSE (128-bit vectors for older x86 processors)
- ARM NEON (128-bit vectors for ARM/AArch64)
📦 Installation
Add simdly to your Cargo.toml:
[dependencies]
simdly = "0.1.3"
For optimal performance, enable AVX2 support in .cargo/config.toml:
[build]
rustflags = ["-C", "target-feature=+avx2"]
🚀 Quick Start
use F32x8;
use ;
Working with Partial Data
use F32x8;
use ;
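// Handle arrays smaller than 8 elements
let data = [1.0f32, 2.0, 3.0]; // Only 3 elements
let vec = F32x8::from_slice(&data);
let mut output = [0.0f32; 8];
// The partial-store method name below is an assumption; check the API documentation
unsafe { vec.store_at_partial(output.as_mut_ptr()); }
// Only first 3 elements are written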
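One common way to handle fewer than eight elements is to stage them in a zero-padded buffer, run the full-width operation, and copy back only the valid lanes. The sketch below shows that idea with `std::arch` directly. Whether simdly does it this way internally is not stated here, and `scale_partial`, `scale8`, and `scale8_avx2` are hypothetical names.

```rust
/// AVX2 path: multiplies all eight lanes by `factor`. Hypothetical helper.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn scale8_avx2(v: &[f32; 8], factor: f32) -> [f32; 8] {
    use std::arch::x86_64::*;
    let mut out = [0.0f32; 8];
    unsafe {
        let x = _mm256_loadu_ps(v.as_ptr());
        let f = _mm256_set1_ps(factor); // broadcast factor to all lanes
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_mul_ps(x, f));
    }
    out
}

/// Full-width multiply with a scalar fallback.
fn scale8(v: &[f32; 8], factor: f32) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            return unsafe { scale8_avx2(v, factor) };
        }
    }
    let mut out = [0.0f32; 8];
    for i in 0..8 {
        out[i] = v[i] * factor;
    }
    out
}

/// Scales up to eight values and writes only `input.len()` lanes of `out`.
/// Returns the number of lanes written; the remaining lanes are untouched.
pub fn scale_partial(input: &[f32], factor: f32, out: &mut [f32; 8]) -> usize {
    let n = input.len().min(8);
    // Zero-padded staging buffer, so the 256-bit load never reads past `input`.
    let mut padded = [0.0f32; 8];
    padded[..n].copy_from_slice(&input[..n]);
    let result = scale8(&padded, factor);
    out[..n].copy_from_slice(&result[..n]);
    n
}
```

The padding costs one small copy but keeps every memory access in bounds, which is the main hazard when a vector is wider than the data.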
📊 Performance
simdly can provide significant performance improvements for numerical computations:
- Up to 8x higher throughput than scalar code for f32 operations, since one AVX2 256-bit vector holds eight f32 lanes (a theoretical upper bound)
- Memory bandwidth optimization through aligned memory access
- Cache-friendly processing patterns
Compilation Flags
For maximum performance, compile with:
RUSTFLAGS="-C target-feature=+avx2"
Or add to your Cargo.toml:
[profile.release]
lto = "fat"
codegen-units = 1
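To check whether the AVX2 flag actually took effect, you can compare what the compiler was told with what the CPU reports. This is a small sketch using only the standard library; the function names are hypothetical and not part of simdly.

```rust
/// True when the crate was compiled with AVX2 enabled
/// (for example via `-C target-feature=+avx2`).
pub fn avx2_compiled_in() -> bool {
    cfg!(target_feature = "avx2")
}

/// True when the CPU running this binary reports AVX2 support.
pub fn avx2_at_runtime() -> bool {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        is_x86_feature_detected!("avx2")
    }
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    {
        false
    }
}
```

A binary compiled with AVX2 enabled should only be run on CPUs where `avx2_at_runtime()` is true; otherwise it may fault on the first AVX2 instruction.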
🔧 Usage Examples
Processing Large Arrays
use F32x8;
use ;
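Large arrays are typically processed eight floats at a time, with a scalar pass over whatever is left at the end. The following sketch shows that pattern as a sum reduction built on `std::arch`. It does not use simdly's API, and `sum_f32` and `sum_avx2` are hypothetical names.

```rust
/// AVX2 path: accumulates full 8-lane chunks, then adds the scalar tail.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    let chunks = data.chunks_exact(8);
    let tail = chunks.remainder(); // fewer than 8 leftover elements
    let mut lanes = [0.0f32; 8];
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for chunk in chunks {
            // Each chunk holds exactly 8 floats, so the load stays in bounds.
            acc = _mm256_add_ps(acc, _mm256_loadu_ps(chunk.as_ptr()));
        }
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }
    // Horizontal sum of the 8 partial sums, plus the leftover elements.
    lanes.iter().sum::<f32>() + tail.iter().sum::<f32>()
}

/// Sums a slice of any length, using AVX2 when the CPU supports it.
pub fn sum_f32(data: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            return unsafe { sum_avx2(data) };
        }
    }
    data.iter().sum()
}
```

Because the vector path adds in a different order than a left-to-right scalar loop, results can differ in the last bits for general floating-point input.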
Memory-Aligned Operations
use F32x8;
use ;
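use std::alloc::{alloc, dealloc, Layout};
// Allocate 32-byte aligned memory for optimal performance
let layout = Layout::from_size_align(8 * std::mem::size_of::<f32>(), 32).unwrap();
let aligned_ptr = unsafe { alloc(layout) as *mut f32 };
// Verify allocation and alignment
assert!(!aligned_ptr.is_null() && aligned_ptr as usize % 32 == 0);
// Use aligned operations for best performance
let data = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
// The aligned-store method name below is an assumption; check the API documentation
unsafe { F32x8::from_slice(&data).store_aligned_at(aligned_ptr); }
// Clean up
unsafe { dealloc(aligned_ptr as *mut u8, layout) };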
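When the buffer size is known at compile time, a `#[repr(align(32))]` wrapper is an alternative to manual allocation that needs no explicit cleanup. The sketch below uses that wrapper with the aligned `std::arch` load and store intrinsics. The type `Aligned8` and the function names are hypothetical, not part of simdly.

```rust
/// Eight f32 values guaranteed to start on a 32-byte boundary.
#[repr(C, align(32))]
pub struct Aligned8(pub [f32; 8]);

/// AVX2 path using aligned load/store, which require 32-byte alignment.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sqrt8_avx2(src: &Aligned8, dst: &mut Aligned8) {
    use std::arch::x86_64::*;
    unsafe {
        // Sound because `Aligned8` enforces the alignment at the type level.
        let v = _mm256_load_ps(src.0.as_ptr());
        _mm256_store_ps(dst.0.as_mut_ptr(), _mm256_sqrt_ps(v));
    }
}

/// Element-wise square root of eight aligned floats.
pub fn sqrt8(src: &Aligned8) -> Aligned8 {
    let mut dst = Aligned8([0.0; 8]);
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            unsafe { sqrt8_avx2(src, &mut dst) };
            return dst;
        }
    }
    // Scalar fallback.
    for i in 0..8 {
        dst.0[i] = src.0[i].sqrt();
    }
    dst
}
```

For heap buffers whose length is only known at run time, the `Layout`-based allocation shown above remains the more flexible option.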
📚 Documentation
- 📖 API Documentation - Complete API reference
- 🚀 Getting Started Guide - Detailed usage examples and tutorials
- ⚡ Performance Tips - Optimization strategies and best practices
🛠️ Development
Prerequisites
- Rust 1.77 or later
- x86/x86_64 processor with AVX2 support
- Linux, macOS, or Windows
Building
cargo build --release
Testing
cargo test
Benchmarking
cargo bench
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Areas for Contribution
- Additional SIMD instruction set support (SSE, ARM NEON)
- Mathematical operations implementation
- Performance optimizations
- Documentation improvements
- Testing and benchmarks
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Built with Rust's excellent SIMD intrinsics
- Inspired by high-performance computing libraries
- Thanks to the Rust community for their valuable feedback
📈 Roadmap
- SSE support for older x86 processors
- ARM NEON support for ARM/AArch64
- Additional mathematical operations
- Automatic SIMD instruction set detection
- WebAssembly SIMD support
Made with ❤️ and ⚡ by Mahdi Tantaoui