numeris 0.5.12

Pure-Rust numerical algorithms library — high performance with SIMD support while also supporting no-std for embedded and WASM targets.
Documentation
//! # numeris
//!
//! Pure-Rust numerical algorithms library — high performance with SIMD support
//! (NEON, SSE2, AVX, AVX-512) while also supporting no-std for embedded and WASM
//! targets. Similar in scope to SciPy.
//!
//! **Documentation & examples:** <https://numeris-rs.dev/>
//!
//! ## Quick start
//!
//! ```
//! use numeris::{Matrix, Vector};
//!
//! // Solve a linear system Ax = b
//! let a = Matrix::new([
//!     [2.0_f64, 1.0, -1.0],
//!     [-3.0, -1.0, 2.0],
//!     [-2.0, 1.0, 2.0],
//! ]);
//! let b = Vector::from_array([8.0, -11.0, -3.0]);
//! let x = a.solve(&b).unwrap(); // x = [2, 3, -1]
//! ```
//!
//! ## Modules
//!
//! - [`matrix`] — Fixed-size `Matrix<T, M, N>` with const-generic dimensions.
//!   Stack-allocated `[[T; M]; N]` column-major storage (matches LAPACK conventions).
//!   `Matrix::new()` accepts row-major input and transposes internally.
//!   Includes arithmetic, indexing, norms, block operations, and iteration.
//!   [`Vector<T, N>`] is a type alias for an N×1 column matrix (matching
//!   nalgebra convention).
//!
//! - [`dynmatrix`] — Heap-allocated `DynMatrix<T>` with runtime dimensions
//!   (requires `alloc` feature, included with `std`). `Vec<T>` column-major storage
//!   (`col * nrows + row`). `from_rows()` accepts row-major data (transposes
//!   internally); `from_slice()` accepts column-major data directly.
//!   Implements [`MatrixRef`] / [`MatrixMut`], so all linalg free functions work
//!   automatically. [`DynVector<T>`] newtype for single-index vector access.
//!   Includes `DynLu`, `DynCholesky`, `DynQr`, `DynSvd`, `DynSymmetricEigen`,
//!   `DynSchur` wrapper structs.
//!
//! - [`linalg`] — LU (partial pivoting), Cholesky (A = LL^H), QR (Householder),
//!   SVD (Householder bidiagonalization + Golub-Kahan implicit-shift QR),
//!   symmetric/Hermitian eigendecomposition (Householder tridiagonalization +
//!   implicit QR with Wilkinson shift), and real Schur decomposition (Hessenberg
//!   reduction + Francis double-shift QR). Each provides `solve()`, `inverse()`,
//!   `det()`, `eigenvalues()`, etc. Free functions operate on
//!   `&mut impl MatrixMut<T>` for in-place use; wrapper structs offer a
//!   higher-level API. Convenience methods on both `Matrix` and `DynMatrix`.
//!
//! - [`ode`] — Fixed-step RK4 and 7 adaptive Runge-Kutta solvers (RKF45,
//!   RKTS54, RKV65, RKV87, RKV98, RKV98NoInterp, RKV98Efficient). PI step-size
//!   controller with dense output / interpolation. Supports both vector and
//!   matrix state (e.g., state transition matrix propagation). RODAS4 L-stable
//!   Rosenbrock method for stiff systems (vector state, user-supplied or
//!   finite-difference Jacobians). Requires `ode` feature.
//!
//! - [`optim`] — Optimization: scalar root finding ([`optim::brent`],
//!   [`optim::newton_1d`]), BFGS quasi-Newton minimization ([`optim::minimize_bfgs`]),
//!   Gauss-Newton ([`optim::least_squares_gn`]) and Levenberg-Marquardt
//!   ([`optim::least_squares_lm`]) nonlinear least squares. Finite-difference
//!   Jacobian and gradient utilities. Fixed-size routines are no-alloc;
//!   dynamic-dimension counterparts ([`optim::minimize_bfgs_dyn`],
//!   [`optim::least_squares_gn_dyn`], [`optim::least_squares_lm_dyn`],
//!   [`optim::finite_difference_gradient_dyn`],
//!   [`optim::finite_difference_jacobian_dyn`]) operate on
//!   [`DynVector`] / [`DynMatrix`] and require the `alloc` feature.
//!   Requires `optim` feature.
//!
//! - [`control`] — Digital IIR filters: [`control::Biquad`] second-order section and
//!   [`control::BiquadCascade`] for cascaded filters. Design functions for Butterworth
//!   and Chebyshev Type I lowpass/highpass. [`control::Pid`] discrete-time PID controller
//!   with anti-windup. [`control::lead_compensator`] and [`control::lag_compensator`] for
//!   compensator design via bilinear transform. PID tuning via [`control::FopdtModel`]
//!   (Ziegler-Nichols, Cohen-Coon, SIMC) and [`control::ziegler_nichols_ultimate`].
//!   No `complex` feature dependency. Requires `control` feature.
//!
//! - [`estimate`] — State estimation: [`estimate::Ekf`] (Extended Kalman Filter),
//!   [`estimate::Ukf`] (Unscented Kalman Filter), [`estimate::SrUkf`] (Square-Root UKF),
//!   [`estimate::Ckf`] (Cubature Kalman Filter), [`estimate::rts_smooth`] (RTS smoother),
//!   and [`estimate::BatchLsq`] (batch least-squares). Closure-based dynamics and
//!   measurement models, Joseph-form covariance update, Merwe-scaled sigma points.
//!   EKF and BatchLsq are fully no-std; sigma-point filters and RTS require `alloc`.
//!   Requires `estimate` feature.
//!
//! - [`interp`] — Interpolation: [`interp::LinearInterp`], [`interp::HermiteInterp`],
//!   [`interp::LagrangeInterp`] (barycentric), [`interp::CubicSpline`] (natural BCs),
//!   and [`interp::BilinearInterp`] (2D rectangular grid).
//!   Fixed-size (const N, stack-allocated, no-std) and dynamic variants (`Dyn*`, requires
//!   `alloc`). Out-of-bounds evaluations extrapolate. Requires `interp` feature.
//!
//! - [`imageproc`] — 2D image processing on `DynMatrix` buffers. Full
//!   toolkit: convolution ([`imageproc::convolve2d`],
//!   [`imageproc::convolve2d_separable`]), blurs / sharpening / gradients /
//!   Laplacian / LoG, order-statistic filters (quickselect and
//!   Huang sliding-histogram for u16), morphology (Van Herk max/min plus
//!   opening / closing / top-hat / black-hat / morphology gradient),
//!   integral image and local mean / variance / stddev, thresholding
//!   (binary / Otsu / adaptive), Canny edge detection, Harris and
//!   Shi-Tomasi corners, difference of Gaussians and Gaussian pyramid,
//!   connected-components labeling (SAUF union-find, 4- or 8-connectivity,
//!   with per-component area / bbox / centroid / second moments, and
//!   optional `DynMatrix<u32>` or row-major `Vec<u32>` labels image),
//!   geometric utilities (flip, rotate 90°/180°/270°, pad, crop,
//!   resize bilinear and nearest). See [`imageproc`] module for details.
//!   Requires `imageproc` feature (implies `alloc`).
//!
//! - [`special`] — Special functions: [`special::gamma`], [`special::lgamma`],
//!   [`special::digamma`], [`special::beta`] / [`special::lbeta`],
//!   regularized incomplete gamma ([`special::gamma_inc`] / [`special::gamma_inc_upper`]),
//!   regularized incomplete beta ([`special::betainc`]),
//!   and error functions ([`special::erf`] / [`special::erfc`]).
//!   Generic over `FloatScalar` (f32/f64), fully no-std. Requires `special` feature.
//!
//! - [`quad`] — Numerical quadrature (integration): [`quad::gauss_legendre`] (N-point
//!   Gauss-Legendre, N=1..10,15,20), [`quad::adaptive_simpson`] (automatic subdivision),
//!   [`quad::trapezoid`] and [`quad::simpson`] (composite rules). All no-alloc.
//!   Requires `quad` feature.
//!
//! - [`stats`] — Statistical distributions with [`stats::ContinuousDistribution`] and
//!   [`stats::DiscreteDistribution`] traits. Continuous: [`stats::Normal`],
//!   [`stats::Uniform`], [`stats::Exponential`], [`stats::Gamma`], [`stats::Beta`],
//!   [`stats::ChiSquared`], [`stats::StudentT`]. Discrete: [`stats::Bernoulli`],
//!   [`stats::Binomial`], [`stats::Poisson`]. Built-in [`stats::Rng`] (xoshiro256++)
//!   with `sample()` / `sample_array()` on every distribution.
//!   Requires `stats` feature (implies `special`).
//!
//! - [`quaternion`] — Unit quaternion for 3D rotations. Scalar-first `[w, x, y, z]`.
//!   Construct from axis-angle, Euler angles, or rotation matrices. Supports
//!   Hamilton product, vector rotation, SLERP, and conversion back to matrices.
//!
//! - [`traits`] — Element trait hierarchy:
//!   - [`Scalar`] — all matrix elements (`Copy + PartialEq + Debug + Zero + One + Num`)
//!   - [`FloatScalar`] — real floats (`Scalar + Float`), used by quaternions
//!   - [`LinalgScalar`] — real floats and complex numbers, used by decompositions and norms
//!   - [`MatrixRef`] / [`MatrixMut`] — generic read/write access for algorithms
//!
//! ## Complex matrices
//!
//! Enable the `complex` feature to use decompositions with `Complex<f32>` /
//! `Complex<f64>`. Cholesky generalizes to Hermitian (A = LL^H), QR uses
//! complex Householder reflections, and norms return real values. Zero overhead
//! for real-only code paths.
//!
//! ## Cargo features
//!
//! | Feature   | Default  | Description |
//! |-----------|----------|-------------|
//! | `std`     | yes      | Implies `alloc`. Hardware FPU via system libm |
//! | `alloc`   | via std  | `DynMatrix` / `DynVector` (heap-allocated, runtime-sized) |
//! | `ode`     | yes      | ODE integration (RK4, adaptive solvers) |
//! | `optim`   | no       | Optimization (root finding, BFGS, Gauss-Newton, LM) |
//! | `control` | no       | Digital IIR filters, PID, lead/lag compensators, PID tuning |
//! | `estimate`| no       | State estimation (EKF, UKF). Implies `alloc` |
//! | `interp`  | no       | Interpolation (linear, Hermite, Lagrange, cubic spline, bilinear 2D) |
//! | `imageproc` | no     | 2D image processing: filters, morphology, integral image, thresholding, Canny, corners, DoG, pyramid, geometric. Implies `alloc` |
//! | `quad`    | no       | Numerical quadrature (Gauss-Legendre, adaptive Simpson, composite rules) |
//! | `special` | no       | Special functions (gamma, beta, erf, incomplete gamma/beta) |
//! | `stats`   | no       | Statistical distributions (Normal, Gamma, etc.) with sampling. Implies `special` |
//! | `libm`    | baseline | Pure-Rust software float fallback |
//! | `complex` | no       | `Complex<f32>` / `Complex<f64>` support via `num-complex` |
//! | `nalgebra`| no       | Conversions between numeris and nalgebra types |
//! | `serde`   | no       | Serialize/deserialize all types via serde |
//! | `rayon`   | no       | Multi-threaded parallelism on runtime-sized paths (e.g. dynamic finite-difference Jacobians, most `imageproc` filters). Implies `std` |
//! | `all`     | no       | All features |
//!
//! ## Parallelism
//!
//! SIMD is always-on and parallelizes *within* a core; the optional `rayon`
//! feature parallelizes *across* cores, and the two compose. It is opt-in
//! (rayon needs `std`, so it is never part of the no-std baseline) and purely
//! additive — builds without it are unchanged. Only heap-backed, runtime-sized
//! paths with independent, disjoint output columns are parallelized: dynamic
//! finite-difference Jacobians (the separate `_par` routines
//! [`optim::finite_difference_jacobian_dyn_par`],
//! [`optim::finite_difference_gradient_dyn_par`]) and most `imageproc` filters
//! (convolution/blur, rank & median, morphology, resize, local statistics).
//! Small fixed-size [`Matrix`] operations and order-sensitive reductions are
//! never parallelized. Each routine gates on total work (so small inputs stay
//! sequential) and writes to disjoint slices, so results are identical
//! regardless of thread count. Speedups run ~2.5–4× on large images.
//!
//! The feature is purely additive — it never changes an existing signature
//! (the parallel optim routines are *new* `_par` functions; the sequential ones
//! keep their `FnMut` bound).

#![cfg_attr(not(feature = "std"), no_std)]
// Coefficient tables (quadrature nodes/weights, Runge-Kutta tableaux, Lanczos
// constants, …) are written at full published precision. The extra digits beyond
// f64 are harmless (the compiler rounds to the nearest representable value), so
// allow the lint crate-wide rather than truncating documented source values.
#![allow(clippy::excessive_precision)]
// Numerical code legitimately favors index-based loops: 2D access (`a[i][j]`,
// `A[k][j]`), parallel indexing of several arrays by the same index, and
// `f(i, j)` element constructors are clearer with explicit indices than with
// zipped iterators. `needless_range_loop` is a known false positive here.
#![allow(clippy::needless_range_loop)]
// Iterative solvers (BFGS, Gauss-Newton, LM, …) keep evaluation counters
// (`f_evals`, `j_evals`) that are *not* loop counters — they increment
// conditionally / multiple times per iteration (line search, etc.), so they
// cannot be replaced by `enumerate()`.
#![allow(clippy::explicit_counter_loop)]
// Low-level numeric kernels (SIMD matmul with strides, Van Herk passes, ODE
// `integrate` entry points) genuinely need many parameters (dimensions, scratch
// buffers, callbacks); factoring them into structs would only obscure the math.
#![allow(clippy::too_many_arguments)]

#[cfg(feature = "alloc")]
#[cfg_attr(all(test, not(feature = "std")), macro_use)]
extern crate alloc;

#[macro_use]
mod macros;
pub mod prelude;

#[cfg(feature = "alloc")]
pub mod dynmatrix;
pub mod linalg;
pub mod matrix;
mod simd;
// The `par` module backs the parallel paths: `imageproc` (always, sequential or
// parallel) and the `optim` `_par` routines (only under `rayon`).
#[cfg(any(
    feature = "imageproc",
    all(feature = "optim", feature = "alloc", feature = "rayon")
))]
mod par;
// Marker trait used in public `imageproc` signatures; an implementation detail
// of the `rayon` feature, hidden from the docs.
#[cfg(feature = "imageproc")]
#[doc(hidden)]
pub use par::MaybeSync;
#[cfg(feature = "control")]
pub mod control;
#[cfg(feature = "estimate")]
pub mod estimate;
#[cfg(feature = "imageproc")]
pub mod imageproc;
#[cfg(feature = "interp")]
pub mod interp;
#[cfg(feature = "nalgebra")]
mod nalgebra_interop;
#[cfg(feature = "ode")]
pub mod ode;
#[cfg(feature = "optim")]
pub mod optim;
#[cfg(feature = "quad")]
pub mod quad;
pub mod quaternion;
#[cfg(feature = "serde")]
mod serde_impl;
#[cfg(feature = "special")]
pub mod special;
#[cfg(feature = "stats")]
pub mod stats;
pub mod traits;

#[cfg(feature = "alloc")]
pub use dynmatrix::{
    DynMatrix, DynMatrixf32, DynMatrixf64, DynMatrixi32, DynMatrixi64, DynMatrixu32, DynMatrixu64,
    DynVector, DynVectorf32, DynVectorf64, DynVectori32, DynVectori64, DynVectoru32, DynVectoru64,
};
#[cfg(all(feature = "alloc", feature = "complex"))]
pub use dynmatrix::{DynMatrixz32, DynMatrixz64, DynVectorz32, DynVectorz64};
pub use matrix::aliases::{
    Matrix1, Matrix1x2, Matrix1x3, Matrix1x4, Matrix1x5, Matrix1x6, Matrix2, Matrix2x1, Matrix2x3,
    Matrix2x4, Matrix2x5, Matrix2x6, Matrix3, Matrix3x1, Matrix3x2, Matrix3x4, Matrix3x5,
    Matrix3x6, Matrix4, Matrix4x1, Matrix4x2, Matrix4x3, Matrix4x5, Matrix4x6, Matrix5, Matrix5x1,
    Matrix5x2, Matrix5x3, Matrix5x4, Matrix5x6, Matrix6, Matrix6x1, Matrix6x2, Matrix6x3,
    Matrix6x4, Matrix6x5, Vector1, Vector2, Vector4, Vector5, Vector6,
};
pub use matrix::vector::{Vector, Vector3};
pub use matrix::Matrix;
pub use quaternion::Quaternion;
pub use traits::{FloatScalar, LinalgScalar, MatrixMut, MatrixRef, Scalar};

#[cfg(feature = "complex")]
pub use num_complex::Complex;