Crate numr

§numr

High-performance numerical computing for Rust with multi-backend GPU acceleration.

numr provides n-dimensional arrays (tensors), linear algebra, FFT, and automatic differentiation, all behind a single API that works across the CPU, CUDA, and WebGPU backends.

§Why numr?

  • Multi-backend: Same code runs on CPU, CUDA, and WebGPU
  • No vendor lock-in: Native kernels, not cuBLAS/MKL wrappers
  • Pure Rust: No Python runtime, no FFI overhead, single binary deployment
  • Autograd included: Reverse-mode automatic differentiation built-in
  • Sparse tensors: CSR, CSC, COO formats with GPU support
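
As a rough illustration of the sparse formats named above, the sketch below builds a CSR (compressed sparse row) matrix from COO triplets and multiplies it by a dense vector in plain Rust. The `Csr` struct, `from_coo`, and `matvec` are hypothetical names chosen for this example; they are not numr's sparse API, which may be organized quite differently.

```rust
/// Hypothetical CSR (compressed sparse row) matrix; not a numr type.
struct Csr {
    rows: usize,
    cols: usize,
    /// `row_ptr[i]..row_ptr[i + 1]` indexes the stored entries of row `i`.
    row_ptr: Vec<usize>,
    /// Column index of each stored value.
    col_idx: Vec<usize>,
    /// Stored values, grouped row by row.
    values: Vec<f64>,
}

impl Csr {
    /// Builds a CSR matrix from COO triplets `(row, col, value)`.
    /// Duplicate coordinates are stored separately, so they add up in `matvec`.
    fn from_coo(rows: usize, cols: usize, triplets: &[(usize, usize, f64)]) -> Csr {
        // Count the entries in each row, then prefix-sum the counts into row_ptr.
        let mut row_ptr = vec![0usize; rows + 1];
        for &(r, c, _) in triplets {
            assert!(r < rows && c < cols, "index out of bounds");
            row_ptr[r + 1] += 1;
        }
        for i in 0..rows {
            row_ptr[i + 1] += row_ptr[i];
        }
        // Scatter each triplet into the next free slot of its row.
        let mut next = row_ptr.clone();
        let mut col_idx = vec![0usize; triplets.len()];
        let mut values = vec![0.0f64; triplets.len()];
        for &(r, c, v) in triplets {
            let slot = next[r];
            col_idx[slot] = c;
            values[slot] = v;
            next[r] += 1;
        }
        Csr { rows, cols, row_ptr, col_idx, values }
    }

    /// Computes `y = A * x` for a dense vector `x`.
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        assert_eq!(x.len(), self.cols, "dimension mismatch");
        (0..self.rows)
            .map(|i| {
                (self.row_ptr[i]..self.row_ptr[i + 1])
                    .map(|k| self.values[k] * x[self.col_idx[k]])
                    .sum::<f64>()
            })
            .collect()
    }
}
```

COO is the natural format for assembling a matrix entry by entry, while CSR keeps each row contiguous, which is why row-oriented kernels such as the matrix-vector product above usually work on CSR.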

§Features

  • Tensors: N-dimensional arrays with broadcasting, slicing, views
  • Linear algebra: Matmul, LU, QR, SVD, Cholesky, eigendecomposition
  • FFT: Fast Fourier transforms (1D, 2D, ND)
  • Element-wise ops: Full set of math functions
  • Reductions: Sum, mean, max, min, argmax, argmin along axes
  • Multiple dtypes: f64, f32, f16, bf16, fp8, integers, bool
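
To make the broadcasting mentioned above concrete, here is a minimal sketch of how a broadcast output shape can be computed. It assumes NumPy-style rules (shapes aligned at the trailing dimension, with size-1 dimensions stretched); whether numr follows exactly these rules is an assumption, and `broadcast_shape` is a hypothetical helper, not a numr function.

```rust
/// Computes the broadcast shape of `a` and `b` under NumPy-style rules
/// (an assumption for this sketch). Returns `None` if the shapes are incompatible.
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let ndim = a.len().max(b.len());
    let mut out = vec![0usize; ndim];
    for i in 0..ndim {
        // Walk from the trailing dimension; missing leading dimensions count as 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[ndim - 1 - i] = match (da, db) {
            (x, y) if x == y => x, // equal sizes pass through
            (1, y) => y,           // size-1 dimension stretches to match
            (x, 1) => x,
            _ => return None,      // mismatched sizes cannot broadcast
        };
    }
    Some(out)
}
```

Under these rules a `[2, 3]` tensor combines with a `[3]` tensor to give `[2, 3]`, and `[4, 1]` with `[1, 5]` gives `[4, 5]`, while `[2, 3]` with `[4]` is rejected.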

§Quick Start

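use numr::prelude::*;

// Create two 2x2 tensors on the CPU device from a flat slice plus a shape.
let device = CpuDevice;
let a = Tensor::<CpuRuntime>::from_slice(&[1.0, 2.0, 3.0, 4.0], &[2, 2], &device);
let b = Tensor::<CpuRuntime>::from_slice(&[5.0, 6.0, 7.0, 8.0], &[2, 2], &device);

// Element-wise addition on borrowed tensors; `a` and `b` remain usable.
let c = &a + &b;
// Matrix multiplication returns a `Result`, so `?` requires an enclosing
// function that returns a compatible `Result`.
let d = a.matmul(&b)?;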

§Feature Flags

  • cpu (default): CPU backend
  • cuda: NVIDIA CUDA backend
  • wgpu: Cross-platform GPU via WebGPU
  • rayon (default): Multi-threaded CPU operations
  • f16: Half-precision floats (F16, BF16)
  • sparse: Sparse tensor formats (CSR, CSC, COO)

Modules§

algorithm
Algorithm contracts for runtime backends
autograd
Automatic differentiation (autograd)
dtype
Data type system for numr tensors.
error
Error types for numr
ops
Tensor operations
prelude
Prelude module for convenient imports
runtime
Runtime backends for tensor computation
sparse
Sparse tensor support for numr
tensor
Tensor types and operations

Macros§

dispatch_dtype
Macro for runtime dtype dispatch to typed operations.
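
For readers unfamiliar with the pattern, the sketch below shows one common way a runtime dtype tag can be dispatched to statically typed code with `macro_rules!`. The `DType` enum, the `dispatch_dtype_sketch!` macro, and its `T => expr` syntax are all hypothetical; the actual `dispatch_dtype` macro in numr may take different arguments and cover more types.

```rust
/// Hypothetical runtime dtype tag; numr's real dtype type lives in its `dtype` module.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DType {
    F32,
    F64,
    I32,
}

/// Sketch of a dispatch macro: in each match arm, the alias `$T` is bound to
/// the concrete Rust type for that dtype, and `$body` is evaluated with it.
macro_rules! dispatch_dtype_sketch {
    ($dtype:expr, $T:ident => $body:expr) => {
        match $dtype {
            DType::F32 => {
                type $T = f32;
                $body
            }
            DType::F64 => {
                type $T = f64;
                $body
            }
            DType::I32 => {
                type $T = i32;
                $body
            }
        }
    };
}

/// Example use: the size in bytes of one element of the given dtype.
fn element_size(dtype: DType) -> usize {
    dispatch_dtype_sketch!(dtype, T => std::mem::size_of::<T>())
}
```

The body is expanded once per arm, so each arm is compiled against a concrete type and the only runtime cost is the `match` on the tag.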

Type Aliases§

DefaultRuntime
Default runtime based on enabled features
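
A feature-dependent alias of this kind is typically written with `cfg` attributes. The sketch below is only an illustration: the marker structs are stand-ins, and the priority shown (CUDA when its feature is enabled, otherwise CPU, with `wgpu` ignored) is an assumption rather than numr's documented selection order.

```rust
/// Hypothetical marker types standing in for backend runtimes; not numr's types.
#[allow(dead_code)]
struct CpuRuntime;
#[allow(dead_code)]
struct CudaRuntime;

// Assumed priority for this sketch: CUDA if its Cargo feature is on, else CPU.
#[cfg(feature = "cuda")]
type DefaultRuntime = CudaRuntime;
#[cfg(not(feature = "cuda"))]
type DefaultRuntime = CpuRuntime;

/// Returns the type name that the alias resolved to at compile time.
fn default_runtime_name() -> &'static str {
    std::any::type_name::<DefaultRuntime>()
}
```

Because the alias is resolved at compile time, code written against `DefaultRuntime` picks up a different backend when the crate is rebuilt with different features, without any source changes.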