§numr
High-performance numerical computing for Rust with multi-backend GPU acceleration.
numr provides n-dimensional arrays (tensors), linear algebra, FFT, and automatic differentiation, all with the same API across the CPU, CUDA, and WebGPU backends.
§Why numr?
- Multi-backend: Same code runs on CPU, CUDA, and WebGPU
- No vendor lock-in: Native kernels, not cuBLAS/MKL wrappers
- Pure Rust: No Python runtime, no FFI overhead, single binary deployment
- Autograd included: Reverse-mode automatic differentiation built-in
- Sparse tensors: CSR, CSC, COO formats with GPU support
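To illustrate what the CSR format mentioned above stores, here is a minimal plain-Rust sketch of CSR sparse matrix-vector multiplication. This is independent of numr's actual sparse types; the struct and field names are illustrative only:

```rust
// CSR (Compressed Sparse Row): non-zero values, their column indices,
// and per-row offsets into those arrays (row_ptr has rows + 1 entries).
struct Csr {
    values: Vec<f64>,
    col_indices: Vec<usize>,
    row_ptr: Vec<usize>,
}

impl Csr {
    // y = A * x, touching only the stored non-zeros.
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        let rows = self.row_ptr.len() - 1;
        let mut y = vec![0.0; rows];
        for r in 0..rows {
            for i in self.row_ptr[r]..self.row_ptr[r + 1] {
                y[r] += self.values[i] * x[self.col_indices[i]];
            }
        }
        y
    }
}

fn main() {
    // The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form.
    let m = Csr {
        values: vec![1.0, 2.0, 3.0],
        col_indices: vec![0, 2, 1],
        row_ptr: vec![0, 2, 3],
    };
    let y = m.matvec(&[1.0, 1.0, 1.0]);
    println!("{:?}", y); // [3.0, 3.0]
}
```

CSC is the column-major analogue, and COO stores explicit (row, column, value) triples; all three trade insertion flexibility against traversal speed.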
§Features
- Tensors: N-dimensional arrays with broadcasting, slicing, views
- Linear algebra: Matmul, LU, QR, SVD, Cholesky, eigendecomposition
- FFT: Fast Fourier transforms (1D, 2D, ND)
- Element-wise ops: Full set of math functions
- Reductions: Sum, mean, max, min, argmax, argmin along axes
- Multiple dtypes: f64, f32, f16, bf16, fp8, integers, bool
§Quick Start
use numr::prelude::*;
let device = CpuDevice;
let a = Tensor::<CpuRuntime>::from_slice(&[1.0, 2.0, 3.0, 4.0], &[2, 2], &device);
let b = Tensor::<CpuRuntime>::from_slice(&[5.0, 6.0, 7.0, 8.0], &[2, 2], &device);
let c = &a + &b;
let d = a.matmul(&b)?;
§Feature Flags
- cpu (default): CPU backend
- cuda: NVIDIA CUDA backend
- wgpu: Cross-platform GPU via WebGPU
- rayon (default): Multi-threaded CPU operations
- f16: Half-precision floats (F16, BF16)
- sparse: Sparse tensor formats (CSR, CSC, COO)
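Assuming the crate is published under the name numr, enabling the optional backends might look like the following Cargo.toml fragment (feature names are taken from the list above; the version is a placeholder):

```toml
[dependencies]
numr = { version = "*", features = ["cuda", "f16", "sparse"] }
```

The cpu and rayon features are on by default, so they only need to be named explicitly if default features are disabled.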
Modules§
- algorithm
- Algorithm contracts for runtime backends
- autograd
- Automatic differentiation (autograd)
- dtype
- Data type system for numr tensors.
- error
- Error types for numr
- ops
- Tensor operations
- prelude
- Prelude module for convenient imports
- runtime
- Runtime backends for tensor computation
- sparse
- Sparse tensor support for numr
- tensor
- Tensor types and operations
Macros§
- dispatch_dtype
- Macro for runtime dtype dispatch to typed operations.
Type Aliases§
- DefaultRuntime
- Default runtime based on enabled features