Module faer_ndarray

Structs

FaerArrayView
FaerCholeskyFactor
FaerColView
FaerLblt
$LBL^\top$ decomposition
FaerLdlt
$L D L^\top$ decomposition
FaerLlt
$L L^\top$ decomposition

Enums

FaerLinalgError
FaerSymmetricFactor

Traits

FaerCholesky
FaerEigh
FaerQr
FaerSolve
SolveCore extension trait
FaerSvd

Functions

array1_to_col_matmut
array2_to_matmut
default_rrqr_rank_alpha
factorize_symmetric_with_fallback
Factorize a symmetric system with LLT -> LDLT -> LBLT fallback.
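The fallback order described above can be illustrated with a small, dependency-free sketch. It is not this module's implementation and does not use faer: the `Factor` enum, `try_llt`, `try_ldlt`, and `factorize_with_fallback_sketch` are hypothetical names, matrices are plain `Vec<Vec<f64>>`, and the final pivoted LBL^T stage is only signalled rather than computed.

```rust
/// Outcome of the fallback chain (hypothetical type, not the crate's enum).
#[derive(Debug, PartialEq)]
pub enum Factor {
    /// Lower-triangular L with A = L L^T (A is positive definite).
    Llt(Vec<Vec<f64>>),
    /// Unit lower-triangular L and diagonal D with A = L D L^T.
    Ldlt(Vec<Vec<f64>>, Vec<f64>),
    /// Both unpivoted attempts failed; a pivoted LBL^T would be needed.
    NeedsPivoting,
}

/// Plain Cholesky; returns None as soon as a pivot is not strictly positive.
fn try_llt(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = a.len();
    let mut l = vec![vec![0.0; n]; n];
    for j in 0..n {
        let mut d = a[j][j];
        for k in 0..j {
            d -= l[j][k] * l[j][k];
        }
        if d <= 0.0 {
            return None;
        }
        let djj = d.sqrt();
        l[j][j] = djj;
        for i in (j + 1)..n {
            let mut s = a[i][j];
            for k in 0..j {
                s -= l[i][k] * l[j][k];
            }
            l[i][j] = s / djj;
        }
    }
    Some(l)
}

/// Unpivoted LDL^T; returns None when a pivot is (numerically) zero.
fn try_ldlt(a: &[Vec<f64>]) -> Option<(Vec<Vec<f64>>, Vec<f64>)> {
    let n = a.len();
    let mut l = vec![vec![0.0; n]; n];
    let mut d = vec![0.0; n];
    for j in 0..n {
        let mut dj = a[j][j];
        for k in 0..j {
            dj -= l[j][k] * l[j][k] * d[k];
        }
        if dj.abs() < 1e-12 {
            return None;
        }
        d[j] = dj;
        l[j][j] = 1.0;
        for i in (j + 1)..n {
            let mut s = a[i][j];
            for k in 0..j {
                s -= l[i][k] * l[j][k] * d[k];
            }
            l[i][j] = s / dj;
        }
    }
    Some((l, d))
}

/// Try LLT first, then LDLT, then report that pivoting is required.
pub fn factorize_with_fallback_sketch(a: &[Vec<f64>]) -> Factor {
    if let Some(l) = try_llt(a) {
        return Factor::Llt(l);
    }
    if let Some((l, d)) = try_ldlt(a) {
        return Factor::Ldlt(l, d);
    }
    Factor::NeedsPivoting
}
```

In this sketch a positive definite input stops at the first stage, an indefinite input with nonzero pivots reaches the second, and a matrix such as [[0, 1], [1, 0]] falls through to the last.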
fast_ab
Compute A * B using faer’s SIMD-optimized GEMM. For A of shape (n, p) and B of shape (p, q), this computes the (n, q) result. Uses zero-copy views when possible.
fast_ab_into
Write faer matmul result A*B directly into a pre-allocated ndarray Array2. Avoids the intermediate faer::Mat allocation and mat_to_array copy.
fast_ata
Compute A^T * A using faer’s SIMD-optimized GEMM. For A of shape (n, p), this computes the (p, p) result. Much faster than ndarray’s .t().dot() once n exceeds roughly 100.
fast_ata_into
Compute A^T * A into a pre-allocated output buffer. out must be shaped (p, p) where A is (n, p).
fast_atb
Compute A^T * B using faer’s SIMD-optimized GEMM. For A of shape (n, p) and B of shape (n, q), this computes the (p, q) result. Uses zero-copy views when possible.
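To make the shape convention concrete, here is a naive reference for A^T * B on row-major slices. It is a sketch for checking shapes and values only: `atb_reference` is a hypothetical helper, it does not call faer or ndarray, and it makes no attempt at the SIMD performance described above.

```rust
/// Reference A^T * B for row-major data.
/// `a` holds an (n, p) matrix, `b` holds an (n, q) matrix;
/// the returned vector holds the (p, q) result in row-major order.
pub fn atb_reference(a: &[f64], b: &[f64], n: usize, p: usize, q: usize) -> Vec<f64> {
    assert_eq!(a.len(), n * p, "a must be (n, p)");
    assert_eq!(b.len(), n * q, "b must be (n, q)");
    let mut out = vec![0.0; p * q];
    // Accumulate one outer product a_i^T b_i per shared row i.
    for i in 0..n {
        for j in 0..p {
            let aij = a[i * p + j];
            for k in 0..q {
                out[j * q + k] += aij * b[i * q + k];
            }
        }
    }
    out
}
```

A reference like this can serve as an oracle in unit tests for an optimized routine on small inputs.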
fast_atb_with_parallelism
Compute A^T * B with an explicit faer parallelism policy for callers that are already running independent products in an outer Rayon task.
fast_atv
Compute A^T * v using faer’s SIMD-optimized GEMV. For A of shape (n, p) and v of shape (n,), this computes the (p,) result.
fast_atv_into
Compute A^T * v into a pre-allocated output buffer. out must be length p where A is (n, p) and v is length n.
fast_av
Compute A * v using faer’s SIMD-optimized GEMV. For A of shape (n, p) and v of shape (p,), this computes the (n,) result.
fast_av_into
Compute A * v into a pre-allocated output buffer. out must be length n where A is (n, p) and v is length p.
fast_av_view_into
Compute A * v into a pre-allocated ArrayViewMut1 slice. Like fast_av_into but accepts a writable slice rather than &mut Array1, so callers can write directly into a sub-range of a larger buffer without intermediate allocation.
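The write-into-a-sub-range pattern can be shown without ndarray by using a plain mutable slice in place of ArrayViewMut1. This is an illustrative sketch only: `av_into_slice` is a hypothetical function with an assumed row-major layout, not the signature of `fast_av_view_into`.

```rust
/// Compute A * v into `out`, where `a` is a row-major (n, p) matrix,
/// `v` has length p, and `out` has length n. No allocation is performed.
pub fn av_into_slice(a: &[f64], n: usize, p: usize, v: &[f64], out: &mut [f64]) {
    assert_eq!(a.len(), n * p, "a must be (n, p)");
    assert_eq!(v.len(), p, "v must have length p");
    assert_eq!(out.len(), n, "out must have length n");
    for i in 0..n {
        let row = &a[i * p..(i + 1) * p];
        // Dot product of row i with v.
        out[i] = row.iter().zip(v).map(|(x, y)| x * y).sum::<f64>();
    }
}
```

A caller holding a larger buffer can pass a sub-slice such as `&mut buf[1..3]`, so the result lands in place and the rest of the buffer is untouched.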
fast_joint_hessian_2x2
Compute the 2×2 block joint Hessian in a single streaming pass: [[X_a^T diag(w_aa) X_a, X_a^T diag(w_ab) X_b], [X_b^T diag(w_ab) X_a, X_b^T diag(w_bb) X_b]]
fast_xt_diag_x
Compute A^T * diag(W) * A using streaming chunks to avoid O(n*p) allocation.
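The streaming idea can be sketched as follows: rather than materializing the full (n, p) product diag(W) * A, scale one block of rows at a time into a small reusable buffer and accumulate its contribution to the (p, p) result. This is a plain-Rust illustration under assumed row-major storage; `xt_diag_x_chunked` and its `chunk` parameter are hypothetical and do not reflect the crate's actual chunking strategy.

```rust
/// Accumulate X^T diag(w) X over row blocks of at most `chunk` rows.
/// `x` is a row-major (n, p) matrix and `w` has length n.
/// Scratch memory is chunk * p instead of n * p.
pub fn xt_diag_x_chunked(x: &[f64], w: &[f64], n: usize, p: usize, chunk: usize) -> Vec<f64> {
    assert!(chunk > 0, "chunk must be positive");
    assert_eq!(x.len(), n * p, "x must be (n, p)");
    assert_eq!(w.len(), n, "w must have length n");
    let mut out = vec![0.0; p * p];
    let mut scaled = vec![0.0; chunk * p]; // reused for every block
    let mut start = 0;
    while start < n {
        let rows = chunk.min(n - start);
        // scaled = diag(w_block) * X_block
        for r in 0..rows {
            let wi = w[start + r];
            for j in 0..p {
                scaled[r * p + j] = wi * x[(start + r) * p + j];
            }
        }
        // out += X_block^T * scaled
        for r in 0..rows {
            for j in 0..p {
                let xij = x[(start + r) * p + j];
                for k in 0..p {
                    out[j * p + k] += xij * scaled[r * p + k];
                }
            }
        }
        start += rows;
    }
    out
}
```

Up to floating-point rounding, the result does not depend on the chunk size; only the peak scratch memory does.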
fast_xt_diag_x_with_parallelism
Compute A^T * diag(W) * A with an explicit faer parallelism policy for callers that parallelize multiple independent Hessian blocks externally.
fast_xt_diag_y
Compute A^T * diag(W) * B using streaming chunks.
ldlt_rook
Compute a rook-pivoted LBL^T factorization of a symmetric indefinite matrix.
rrqr_nullspace_basis
Compute an orthonormal basis for null(a^T) using column-pivoted QR on a.