Numina - Backend-Agnostic Array Library for Rust
A safe, efficient array library with ndarray-compatible operations, designed as the foundation for high-performance computing backends in Rust.
Features
- Safe & Ergonomic: Memory-safe array operations with Rust's guarantees
- Type Safe: Runtime shape and data type validation
- Backend Agnostic: The NdArray trait enables multiple backends (CPU, GPU, remote)
- Extensible Types: Support for custom data types (BFloat16, quantized types)
- Zero Dependencies: Pure Rust implementation
Quick Start
use ;
// Create arrays
let a = from_slice?;
let b = from_slice?;
// Operations work on any NdArray backend
let c = add?; // Element-wise addition
let d = matmul?; // Matrix multiplication
let total = sum?; // Sum all elements
let row_sums = sum?; // Sum along axis
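To make the difference between summing all elements and summing along an axis concrete, here is a minimal standalone sketch for a row-major 2-D buffer. The function name `sum_axis` and its signature are hypothetical and are not part of Numina's API; the sketch only illustrates the semantics, assuming axis 1 produces one sum per row.

```rust
/// Hypothetical helper (not Numina's API): sum a row-major `rows x cols`
/// buffer along one axis. Axis 0 collapses rows (one sum per column),
/// axis 1 collapses columns (one sum per row).
fn sum_axis(data: &[f32], rows: usize, cols: usize, axis: usize) -> Result<Vec<f32>, String> {
    if data.len() != rows * cols {
        return Err(format!(
            "shape mismatch: {} elements for a {}x{} array",
            data.len(),
            rows,
            cols
        ));
    }
    match axis {
        // One result per column: walk down each column with stride `cols`.
        0 => Ok((0..cols)
            .map(|c| (0..rows).map(|r| data[r * cols + c]).sum::<f32>())
            .collect()),
        // One result per row: each row is a contiguous slice.
        1 => Ok((0..rows)
            .map(|r| data[r * cols..(r + 1) * cols].iter().sum::<f32>())
            .collect()),
        _ => Err(format!("axis {} is out of range for a 2-D array", axis)),
    }
}
```

For a 2x3 array holding 1 through 6, axis 1 yields the row sums and axis 0 yields the column sums; an invalid axis or a mismatched shape is reported as an error rather than a panic.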
Core Types
- Array<T>: Typed N-dimensional arrays for CPU operations
- CpuBytesArray: Byte-based N-dimensional arrays for CPU operations
- NdArray: Backend-agnostic trait for all array operations
- Shape: Multi-dimensional array dimensions
- DType: Data types (f32, f64, i8-i64, u8-u64, bool, custom types)
Design Philosophy: Numina provides the low-level backend infrastructure. High-level tensor APIs (like Tensor types) are provided by dependent crates (for example, laminax-types) which build upon Numina's NdArray trait.
DType ID Table
Stable dtype IDs are serialized in Lamina IR and the Laminax runtime. IDs are explicit and frozen.
FP8 formats follow the E4M3FN (finite-only, max 448) and E5M2 (Inf/NaN, max 57344) conventions with round-to-nearest-even (RTNE) rounding.
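The two FP8 conventions can be checked with a small decoder. The sketch below is not Numina's implementation; the function names are hypothetical, and it assumes the usual bit layouts (1 sign, 4 exponent, 3 mantissa bits with bias 7 for E4M3FN; 1 sign, 5 exponent, 2 mantissa bits with bias 15 for E5M2). It covers decoding only, so RTNE rounding, which applies when encoding, is not shown.

```rust
/// Hypothetical decoder for FP8 E4M3FN: no infinities, and only the
/// all-ones exponent with all-ones mantissa encodes NaN.
fn decode_e4m3fn(bits: u8) -> f32 {
    let sign = if bits & 0x80 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((bits >> 3) & 0x0F) as i32;
    let man_bits = bits & 0x07;
    if exp == 0x0F && man_bits == 0x07 {
        return f32::NAN;
    }
    let man = man_bits as f32;
    let magnitude = if exp == 0 {
        // Subnormal: no implicit leading 1, fixed exponent of 1 - bias.
        (man / 8.0) * 2.0f32.powi(-6)
    } else {
        (1.0 + man / 8.0) * 2.0f32.powi(exp - 7)
    };
    sign * magnitude
}

/// Hypothetical decoder for FP8 E5M2: IEEE-like, with Inf and NaN
/// in the all-ones exponent.
fn decode_e5m2(bits: u8) -> f32 {
    let sign = if bits & 0x80 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((bits >> 2) & 0x1F) as i32;
    let man_bits = bits & 0x03;
    if exp == 0x1F {
        return if man_bits == 0 { sign * f32::INFINITY } else { f32::NAN };
    }
    let man = man_bits as f32;
    let magnitude = if exp == 0 {
        (man / 4.0) * 2.0f32.powi(-14)
    } else {
        (1.0 + man / 4.0) * 2.0f32.powi(exp - 15)
    };
    sign * magnitude
}
```

Under these layouts the largest finite E4M3FN pattern is `0x7E` (1.75 × 2^8 = 448) and the largest finite E5M2 pattern is `0x7B` (1.75 × 2^15 = 57344), which matches the maxima quoted above.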
| Name | DType | ID | Bytes | Storage Bits | Align |
|---|---|---|---|---|---|
| float16 | F16 | 1 | 2 | 16 | 2 |
| float32 | F32 | 2 | 4 | 32 | 4 |
| float64 | F64 | 3 | 8 | 64 | 8 |
| bfloat16 | BF16 | 4 | 2 | 16 | 2 |
| bfloat8 | BF8 | 5 | 1 | 8 | 1 |
| float8_e4m3fn | F8E4M3FN | 6 | 1 | 8 | 1 |
| float8_e5m2 | F8E5M2 | 7 | 1 | 8 | 1 |
| complex32 | Complex32 | 50 | 4 | 32 | 2 |
| complex64 | Complex64 | 51 | 8 | 64 | 4 |
| complex128 | Complex128 | 52 | 16 | 128 | 8 |
| int8 | I8 | 10 | 1 | 8 | 1 |
| int16 | I16 | 11 | 2 | 16 | 2 |
| int32 | I32 | 12 | 4 | 32 | 4 |
| int64 | I64 | 13 | 8 | 64 | 8 |
| uint8 | U8 | 20 | 1 | 8 | 1 |
| uint16 | U16 | 21 | 2 | 16 | 2 |
| uint32 | U32 | 22 | 4 | 32 | 4 |
| uint64 | U64 | 23 | 8 | 64 | 8 |
| bool | Bool | 30 | 1 | 8 | 1 |
| quantized_i4 | QI4 | 40 | 1 | 4 | 1 |
| quantized_u8 | QU8 | 41 | 1 | 8 | 1 |
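One way to keep IDs explicit and frozen is an enum with hard-coded discriminants and a checked conversion from the serialized byte. The sketch below is an assumption about how such a mapping could look, not Numina's actual `DType` definition: the type name `DTypeId` and both methods are hypothetical, while the ID and storage-bit values are copied from the table.

```rust
/// Hypothetical mirror of the dtype ID table. Discriminants are written
/// out explicitly so reordering variants can never change a serialized ID.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum DTypeId {
    F16 = 1, F32 = 2, F64 = 3, BF16 = 4, BF8 = 5, F8E4M3FN = 6, F8E5M2 = 7,
    I8 = 10, I16 = 11, I32 = 12, I64 = 13,
    U8 = 20, U16 = 21, U32 = 22, U64 = 23,
    Bool = 30,
    QI4 = 40, QU8 = 41,
    Complex32 = 50, Complex64 = 51, Complex128 = 52,
}

impl DTypeId {
    /// Checked conversion from a serialized ID; unknown IDs yield `None`.
    fn from_id(id: u8) -> Option<Self> {
        use DTypeId::*;
        Some(match id {
            1 => F16, 2 => F32, 3 => F64, 4 => BF16, 5 => BF8,
            6 => F8E4M3FN, 7 => F8E5M2,
            10 => I8, 11 => I16, 12 => I32, 13 => I64,
            20 => U8, 21 => U16, 22 => U32, 23 => U64,
            30 => Bool,
            40 => QI4, 41 => QU8,
            50 => Complex32, 51 => Complex64, 52 => Complex128,
            _ => return None,
        })
    }

    /// Storage width in bits, as listed in the "Storage Bits" column.
    fn storage_bits(self) -> u32 {
        use DTypeId::*;
        match self {
            QI4 => 4,
            BF8 | F8E4M3FN | F8E5M2 | I8 | U8 | Bool | QU8 => 8,
            F16 | BF16 | I16 | U16 => 16,
            F32 | I32 | U32 | Complex32 => 32,
            F64 | I64 | U64 | Complex64 => 64,
            Complex128 => 128,
        }
    }
}
```

Gaps in the numbering (for example 8, 9, or 14 through 19) decode to `None`, so a reader of serialized data can reject unknown IDs instead of misinterpreting them.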
Custom Data Types
use ;
// Brain Float 16
let bf16 = from_f32;
assert_eq!;
// 8-bit quantized
let scale = 0.01;
let q8 = quantize;
assert!;
// 4-bit quantized (2 values per byte)
let q4 = pack;
assert_eq!; // 87.5% memory savings!
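The three custom types above rest on simple bit-level ideas, which the following standalone sketch illustrates. None of these functions are Numina's API; the names are hypothetical, the BF16 conversion assumes round-to-nearest-even, the 8-bit scheme assumes a plain scale with a zero point of 0, and the 4-bit packing assumes the first value goes in the low nibble.

```rust
/// Hypothetical: f32 -> bfloat16 bits, keeping the top 16 bits of the
/// f32 pattern with round-to-nearest-even on the discarded half.
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Keep NaN a NaN by forcing a mantissa bit in the kept half.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Adding 0x7FFF plus the lowest kept bit rounds ties to even.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    ((bits + rounding_bias) >> 16) as u16
}

/// Hypothetical: bfloat16 bits -> f32 (exact, the low 16 bits are zero).
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}

/// Hypothetical 8-bit quantization: q = round(x / scale), clamped to u8.
fn quantize_u8(x: f32, scale: f32) -> u8 {
    (x / scale).round().clamp(0.0, 255.0) as u8
}

/// Inverse of `quantize_u8`, up to a rounding error of at most scale / 2
/// for inputs inside the representable range.
fn dequantize_u8(q: u8, scale: f32) -> f32 {
    q as f32 * scale
}

/// Hypothetical 4-bit packing: two signed values in [-8, 7] per byte,
/// `lo` in the low nibble and `hi` in the high nibble.
fn pack_i4(lo: i8, hi: i8) -> u8 {
    ((hi as u8 & 0x0F) << 4) | (lo as u8 & 0x0F)
}

/// Unpack both nibbles, sign-extending each with an arithmetic shift.
fn unpack_i4(byte: u8) -> (i8, i8) {
    let lo = ((byte << 4) as i8) >> 4;
    let hi = (byte as i8) >> 4;
    (lo, hi)
}
```

The 87.5% figure follows from the storage widths alone: a 4-bit value takes 4/32 of the space of an f32, so the saving is 1 - 4/32 = 0.875.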
Multiple Backends
use ;
// Different backend implementations
let typed_array = from_slice?;
let bytes = .iter.flat_map.collect;
let byte_array = new;
// Same operations work on all backends
let sum1 = add?;
let sum2 = add?;
// Cross-backend operations are fully supported
assert_eq!;
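The idea of one operation working across a typed backend and a byte-based backend can be sketched with a small trait. This is an illustration only, not Numina's `NdArray` trait: the trait `F32Source`, the structs `TypedArray` and `BytesArray`, and the free function `add` are all hypothetical, and the byte layout is assumed to be little-endian f32.

```rust
/// Hypothetical minimal backend trait: anything that can expose its
/// elements as f32 values.
trait F32Source {
    fn to_f32_vec(&self) -> Vec<f32>;
}

/// Typed backend: stores f32 values directly.
struct TypedArray {
    data: Vec<f32>,
}

/// Byte backend: stores the same values as raw little-endian bytes.
struct BytesArray {
    bytes: Vec<u8>,
}

impl BytesArray {
    fn from_f32s(values: &[f32]) -> Self {
        BytesArray {
            bytes: values.iter().flat_map(|v| v.to_le_bytes()).collect(),
        }
    }
}

impl F32Source for TypedArray {
    fn to_f32_vec(&self) -> Vec<f32> {
        self.data.clone()
    }
}

impl F32Source for BytesArray {
    fn to_f32_vec(&self) -> Vec<f32> {
        // Reassemble each 4-byte chunk into one f32.
        self.bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect()
    }
}

/// Element-wise addition written once against the trait, so the two
/// operands may come from the same backend or from different ones.
fn add(a: &dyn F32Source, b: &dyn F32Source) -> Result<Vec<f32>, String> {
    let (x, y) = (a.to_f32_vec(), b.to_f32_vec());
    if x.len() != y.len() {
        return Err(format!("length mismatch: {} vs {}", x.len(), y.len()));
    }
    Ok(x.iter().zip(y.iter()).map(|(p, q)| p + q).collect())
}
```

Because `add` depends only on the trait, a new backend needs just one `impl` block to take part in existing operations. A real backend trait would also carry shape and dtype information, which this sketch leaves out.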
Architecture
src/
├── array/ # NdArray trait and CPU implementations
├── dtype/ # Data type system and custom types
├── ops.rs # Mathematical operations
├── reductions.rs # Reduction operations
├── sorting.rs # Sorting and searching
└── lib.rs # Library interface
Status
Implemented:
- Array operations (add, mul, matmul, reductions)
- Multiple backends via NdArray trait (Array, CpuBytesArray)
- Custom data types (BFloat16, QuantizedU8, QuantizedI4)
- Shape manipulation (reshape, transpose)
- Sorting and searching operations
- 49 tests passing
Planned:
- Broadcasting, advanced indexing, linear algebra
- File I/O, statistics
- Memory-mapped arrays
- More custom data types (FP8, FP4, NF4)
Integration
Numina serves as one of the core libraries for Laminax, enabling high-performance GPU/CPU computing.