avx-arrow
Zero-copy columnar format with scientific extensions, SIMD acceleration, and native compression.
🚀 Features
- 11 Primitive Arrays: Int8-64, UInt8-64, Float32/64, Boolean, UTF-8
- 4 Scientific Arrays: Quaternions, Complex64, Tensor4D, Spinors
- 25+ Compute Operations: Aggregations, filters, comparisons, sorting, arithmetic (SIMD)
- SIMD Acceleration: AVX2-optimized operations, up to 35x faster than scalar code (multiply on 100 elements)
- Native Compression: RLE, Delta, Dictionary, Bit-Packing (up to 125x with RLE on repeated values)
- Zero External Dependencies: Only byteorder required
- avxDB Native: Direct integration with avxDB
- Production Ready: 80+ tests passing, proven benchmarks
🎯 Unique in the World
avx-arrow is the only columnar format with:
- Native Scientific Types: QuaternionArray (SLERP), ComplexArray (FFT), Tensor4D (GR), Spinors (QM)
- Native Compression: 125x RLE, 16x Bit-Packing, 4x Delta - zero external dependencies
- AVX2 SIMD: 35x speedup for compute operations
📦 Installation
[dependencies]
avx-arrow = "0.1"
🔥 Quick Start
use ;
use Int64Array;
use *;
// Create schema
let schema = new;
// Create arrays
let ids = from;
let values = from;
// Compute operations
let sum = sum_f64;
let mean = mean_f64.unwrap;
let filtered = filter_f64?;
println!;
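For readers who want to see what these three operations compute, here is a minimal scalar sketch in plain Rust. It is not the avx-arrow implementation: the function names match the ones used above, but the slice-based signatures, the `Option` return for the mean, and the `String` error type are assumptions made for illustration.

```rust
/// Sum of all values (assumed signature, plain slices instead of arrays).
fn sum_f64(values: &[f64]) -> f64 {
    values.iter().sum()
}

/// Arithmetic mean, or `None` when the slice is empty.
fn mean_f64(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(sum_f64(values) / values.len() as f64)
    }
}

/// Keep the values whose mask entry is `true`.
/// Returns an error when the mask and the values differ in length.
fn filter_f64(values: &[f64], mask: &[bool]) -> Result<Vec<f64>, String> {
    if values.len() != mask.len() {
        return Err(format!(
            "length mismatch: {} values, {} mask entries",
            values.len(),
            mask.len()
        ));
    }
    Ok(values
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect())
}
```

Under these assumptions, `mean_f64` can only fail on empty input and `filter_f64` only on a length mismatch, which is one plausible reading of the `.unwrap` and `?` in the Quick Start.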
🧪 Scientific Computing
use *;
// Quaternion arrays for spacecraft orientation
let q1 = from_axis_angle;
let q2 = from_axis_angle;
let array1 = new;
let array2 = new;
// SLERP interpolation for smooth rotation
let interpolated = array1.slerp.unwrap;
// Complex arrays for FFT
let signal = new;
let magnitudes = signal.magnitude;
let phases = signal.phase;
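The math behind SLERP and the complex magnitude/phase calls can be sketched in standalone Rust. This is an illustration under stated assumptions, not the crate's code: the `Quaternion` struct layout, the `scale` helper, the 0.9995 fallback threshold, and the free functions `magnitude` and `phase` on `f64` pairs are all hypothetical.

```rust
/// Quaternion stored as (w, x, y, z); rotations use unit length.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Quaternion {
    w: f64,
    x: f64,
    y: f64,
    z: f64,
}

impl Quaternion {
    /// Rotation of `angle` radians about `axis`.
    /// The axis must be non-zero; it is normalized here.
    fn from_axis_angle(axis: [f64; 3], angle: f64) -> Self {
        let norm = (axis[0] * axis[0] + axis[1] * axis[1] + axis[2] * axis[2]).sqrt();
        let (s, c) = (angle / 2.0).sin_cos();
        Quaternion {
            w: c,
            x: s * axis[0] / norm,
            y: s * axis[1] / norm,
            z: s * axis[2] / norm,
        }
    }

    fn dot(self, o: Self) -> f64 {
        self.w * o.w + self.x * o.x + self.y * o.y + self.z * o.z
    }

    fn scale(self, k: f64) -> Self {
        Quaternion { w: k * self.w, x: k * self.x, y: k * self.y, z: k * self.z }
    }

    /// Spherical linear interpolation from `self` (t = 0) to `other` (t = 1).
    fn slerp(self, other: Self, t: f64) -> Self {
        let mut cos_theta = self.dot(other);
        let mut end = other;
        // q and -q describe the same rotation; flip to follow the shorter arc.
        if cos_theta < 0.0 {
            cos_theta = -cos_theta;
            end = other.scale(-1.0);
        }
        // Nearly parallel inputs: use linear weights to avoid dividing by sin(theta) ~ 0.
        let (a, b) = if cos_theta > 0.9995 {
            (1.0 - t, t)
        } else {
            let theta = cos_theta.acos();
            let sin_theta = theta.sin();
            (((1.0 - t) * theta).sin() / sin_theta, (t * theta).sin() / sin_theta)
        };
        let q = Quaternion {
            w: a * self.w + b * end.w,
            x: a * self.x + b * end.x,
            y: a * self.y + b * end.y,
            z: a * self.z + b * end.z,
        };
        // Renormalize to stay on the unit sphere.
        q.scale(1.0 / q.dot(q).sqrt())
    }
}

/// Magnitude |re + i*im| of a complex sample.
fn magnitude(re: f64, im: f64) -> f64 {
    re.hypot(im)
}

/// Phase angle of a complex sample, in radians, in (-pi, pi].
fn phase(re: f64, im: f64) -> f64 {
    im.atan2(re)
}
```

Interpolating halfway between no rotation and a 90 degree rotation about the z axis should give a 45 degree rotation about z, which makes a convenient sanity check for any SLERP implementation.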
🗜️ Native Compression (Zero External Dependencies!)
use *;
// RLE: 125x compression for repeated values
let data = vec!;
let encoded = encode.unwrap;
// 10000 bytes -> 80 bytes!
// Delta: 4x compression for timestamps
let timestamps: = .map.collect;
let encoded = encode_i64.unwrap;
// Bit-Packing: 16x compression for small integers
let small_ints: = .map.collect;
let bit_width = detect_bit_width; // 4 bits
let packed = pack.unwrap;
// Dictionary: Optimal for low cardinality
let mut encoder = new;
for i in 0..10000
let = encoder.finish;
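To make the RLE and Delta ideas concrete, here is a small standalone sketch in plain Rust. It does not reproduce avx-arrow's encoders or byte layout: the names `rle_encode`, `rle_decode`, `delta_encode`, and `delta_decode`, and the `(value, run_length)` representation, are assumptions chosen for illustration.

```rust
/// Run-length encode: collapse each run of equal values into (value, run_length).
fn rle_encode(data: &[i64]) -> Vec<(i64, usize)> {
    let mut runs: Vec<(i64, usize)> = Vec::new();
    for &v in data {
        // Extend the current run when the value repeats.
        if let Some((last, count)) = runs.last_mut() {
            if *last == v {
                *count += 1;
                continue;
            }
        }
        // Otherwise start a new run.
        runs.push((v, 1));
    }
    runs
}

/// Expand (value, run_length) pairs back into the original sequence.
fn rle_decode(runs: &[(i64, usize)]) -> Vec<i64> {
    runs.iter()
        .flat_map(|&(v, n)| std::iter::repeat(v).take(n))
        .collect()
}

/// Delta encode: the first output is the first value,
/// every later output is the difference from the previous value.
fn delta_encode(data: &[i64]) -> Vec<i64> {
    let mut prev = 0i64;
    data.iter()
        .map(|&v| {
            let d = v.wrapping_sub(prev);
            prev = v;
            d
        })
        .collect()
}

/// Rebuild the original values with a running sum of the deltas.
fn delta_decode(deltas: &[i64]) -> Vec<i64> {
    let mut acc = 0i64;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d);
            acc
        })
        .collect()
}
```

A constant column collapses to a single run, which is where RLE gets its large ratios. Delta encoding on its own does not shrink anything; it turns evenly spaced timestamps into small, repetitive numbers that a later stage can store in fewer bits. How avx-arrow stores those deltas is not shown here.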
Compression Benchmarks
| Codec | Best For | Compression Ratio | Example |
|---|---|---|---|
| RLE | Repeated values | 125x | [1,1,1,...] |
| Bit-Pack | Small integers (0-15) | 16x | Flags, counters |
| Delta | Sequential data | 4x | Timestamps, IDs |
| Dictionary | Low cardinality | 1-10x | Categories, enums |
All codecs are 100% native Rust - no external dependencies!
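The Bit-Pack row can be illustrated with a short standalone sketch. The names `detect_bit_width` and `pack` come from the example above, but the `u32` inputs, the signatures, the least-significant-bit-first layout, and the `unpack` helper are assumptions, not the crate's actual format.

```rust
/// Smallest bit width that can represent every value (at least 1).
fn detect_bit_width(values: &[u32]) -> u32 {
    let max = values.iter().copied().max().unwrap_or(0);
    (32 - max.leading_zeros()).max(1)
}

/// Pack `values` into bytes using `bit_width` bits each, least significant bit first.
/// Every value is assumed to fit in `bit_width` bits.
fn pack(values: &[u32], bit_width: u32) -> Vec<u8> {
    let total_bits = values.len() * bit_width as usize;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &v in values {
        for b in 0..bit_width {
            if (v >> b) & 1 == 1 {
                out[bit_pos / 8] |= 1u8 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    out
}

/// Inverse of `pack`: read `count` values of `bit_width` bits each.
fn unpack(packed: &[u8], bit_width: u32, count: usize) -> Vec<u32> {
    let mut out = Vec::with_capacity(count);
    let mut bit_pos = 0usize;
    for _ in 0..count {
        let mut v = 0u32;
        for b in 0..bit_width {
            let bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1;
            v |= (bit as u32) << b;
            bit_pos += 1;
        }
        out.push(v);
    }
    out
}
```

Values in 0-15 need 4 bits each, so 1000 of them pack into 500 bytes. If the unpacked values were stored as 64-bit integers, that would be 64 / 4 = 16x, which matches the ratio in the table; whether the benchmark used 64-bit inputs is an inference, not something the README states.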
⚡ SIMD Performance
avx-arrow uses AVX2 intrinsics for hardware-accelerated operations with proven speedups:
use *;
let data = from;
// Automatically uses SIMD when AVX2 is available
let sum = sum_f64; // 4.24x faster than scalar
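The "automatically uses SIMD when AVX2 is available" behaviour is typically built from runtime feature detection plus a scalar fallback. The sketch below shows that pattern with the standard library only; `sum_scalar`, `sum_avx2`, and `sum_dispatch` are hypothetical names, and this is not the kernel avx-arrow ships.

```rust
/// Scalar fallback: plain iterator sum.
fn sum_scalar(data: &[f64]) -> f64 {
    data.iter().sum()
}

/// AVX2 path: accumulate four f64 lanes at a time.
/// Safety: the caller must ensure the CPU supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[f64]) -> f64 {
    use std::arch::x86_64::*;

    let chunks = data.chunks_exact(4);
    let tail = chunks.remainder();
    let mut acc = _mm256_setzero_pd();
    for chunk in chunks {
        // Unaligned load of 4 doubles, then lane-wise add into the accumulator.
        let v = _mm256_loadu_pd(chunk.as_ptr());
        acc = _mm256_add_pd(acc, v);
    }
    // Spill the 4 lanes and finish with the leftover elements.
    let mut lanes = [0.0f64; 4];
    _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
    lanes.iter().sum::<f64>() + tail.iter().sum::<f64>()
}

/// Pick the AVX2 path at runtime when available, otherwise use the scalar path.
fn sum_dispatch(data: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe to call: AVX2 support was just verified.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}
```

Because the SIMD path adds values in a different order than the scalar loop, results on non-integer data can differ in the last few bits. Any speedup from a sketch like this depends on the CPU and data size and should be measured rather than assumed.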
📊 Benchmarks (100 to 1M elements)
Basic Operations:
| Operation | Size | Scalar | SIMD | Speedup |
|---|---|---|---|---|
| Sum | 100K | 61.4μs | 14.5μs | 4.24x |
| Add | 10K | 38.8μs | 4.4μs | 8.81x |
| Multiply | 100 | 856ns | 24.4ns | 35x |
| Subtract | 1K | 4.64μs | 611ns | 7.59x |
| Divide | 10K | 76.3μs | 34.7μs | 2.20x |
| Sqrt | 1M | 8.67ms | 4.98ms | 1.74x |
| FMA | 10K | 54.9μs | 9.43μs | 5.82x |
Complex Pipelines (3 operations):
| Size | Scalar | SIMD | Speedup |
|---|---|---|---|
| 10K | 99.3μs | 24.7μs | 4.02x |
| 100K | 1.03ms | 586μs | 1.75x |
| 1M | 12.0ms | 10.8ms | 1.11x |
Memory Throughput:
| Elements | Scalar | SIMD | Speedup |
|---|---|---|---|
| 100K | 61.4μs | 14.5μs | 4.24x |
| 1M | 721μs | 292μs | 2.47x |
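Note: Benchmarks were run on an Intel CPU with AVX2. SIMD gains are largest on small to medium datasets (100 to 100K elements) and shrink as the data grows (1.11x for the 1M-element pipeline), likely because memory bandwidth becomes the limit. For 10M+ elements, consider parallel processing.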
🎓 Examples
See the examples/ directory:
- basic.rs - Arrays and RecordBatch
- scientific.rs - Quaternions, Complex, Tensors
- compression.rs - Native compression codecs (125x!)
- ipc.rs - Serialization (coming soon)
Run with:
🧬 Use Cases
- Aerospace: Spacecraft orientation tracking with quaternions
- Signal Processing: FFT analysis with complex arrays
- Physics: Relativistic simulations with tensors
- Quantum Computing: State vectors with spinors
- Data Analytics: High-performance columnar analytics
🛠️ Cargo Features
[dependencies.avx-arrow]
version = "0.1"
features = ["scientific", "compression", "ipc"]
- scientific (default): Scientific array types
- compression: Compression support
- ipc: Arrow IPC format
- avxdb: avxDB integration
📈 Roadmap
- [x] Primitive arrays (Int8-64, UInt8-64, Float32/64)
- [x] Scientific arrays (Quaternion, Complex, Tensor4D, Spinor)
- [x] Compute kernels (sum, mean, filter, sort, arithmetic)
- [x] SIMD acceleration (AVX2 with sub, div, sqrt, fma)
- [x] Native compression (RLE, Delta, Dictionary, Bit-Packing) 🆕
- [x] Comprehensive benchmarks (35x compute, 125x compression)
- [ ] Arrow IPC format compatibility
- [ ] GPU acceleration (CUDA/ROCm)
- [ ] Distributed computing support
- [ ] AVX-512 support for next-gen CPUs
🤝 Contributing
Contributions welcome! Please open an issue or PR.
📄 License
Dual licensed under MIT OR Apache-2.0.
🌟 Credits
Built with ❤️ by avilaops for the Brazilian scientific computing community.
Status: v0.2.0 - 80+ tests passing ✅ | 35x SIMD ✅ | 125x Compression ✅