la-stack 0.4.3

Fast, stack-allocated linear algebra for fixed dimensions
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
# la-stack

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.18158926.svg)](https://doi.org/10.5281/zenodo.18158926)
[![Crates.io](https://img.shields.io/crates/v/la-stack.svg)](https://crates.io/crates/la-stack)
[![Downloads](https://img.shields.io/crates/d/la-stack.svg)](https://crates.io/crates/la-stack)
[![License](https://img.shields.io/crates/l/la-stack.svg)](./LICENSE)
[![Docs.rs](https://docs.rs/la-stack/badge.svg)](https://docs.rs/la-stack)
[![CI](https://github.com/acgetchell/la-stack/actions/workflows/ci.yml/badge.svg)](https://github.com/acgetchell/la-stack/actions/workflows/ci.yml)
[![rust-clippy analyze](https://github.com/acgetchell/la-stack/actions/workflows/rust-clippy.yml/badge.svg)](https://github.com/acgetchell/la-stack/actions/workflows/rust-clippy.yml)
[![codecov](https://codecov.io/gh/acgetchell/la-stack/graph/badge.svg?token=4eKXa5QjuZ)](https://codecov.io/gh/acgetchell/la-stack)
[![Audit dependencies](https://github.com/acgetchell/la-stack/actions/workflows/audit.yml/badge.svg)](https://github.com/acgetchell/la-stack/actions/workflows/audit.yml)
[![Codacy Security Scan](https://github.com/acgetchell/la-stack/actions/workflows/codacy.yml/badge.svg)](https://github.com/acgetchell/la-stack/actions/workflows/codacy.yml)

![la-stack](https://raw.githubusercontent.com/acgetchell/la-stack/main/docs/assets/la-stack.jpg)

Fast, stack-allocated linear algebra for fixed dimensions in Rust.

This crate grew from the need to support [`delaunay`](https://crates.io/crates/delaunay) with fast, stack-allocated linear algebra primitives and algorithms
while keeping the API intentionally small and explicit.

## πŸ“ Introduction

`la-stack` provides a handful of const-generic, stack-backed building blocks:

- `Vector<const D: usize>` for fixed-length `f64` vectors backed by `[f64; D]`
- `Matrix<const D: usize>` for fixed-size square `f64` matrices backed by `[[f64; D]; D]`
- `Lu<const D: usize>` for LU factorization with partial pivoting (solve + det)
- `Ldlt<const D: usize>` for LDLT factorization without pivoting (solve + det; symmetric SPD/PSD)

## ✨ Design goals

- βœ… `Copy` types where possible
- βœ… Const-generic dimensions (no dynamic sizes)
- βœ… `const fn` where possible (compile-time evaluation of determinants, dot products, etc.)
- βœ… Explicit algorithms (LU, solve, determinant)
- βœ… Robust geometric predicates via optional exact arithmetic (`det_sign_exact`, `det_errbound`)
- βœ… Exact linear system solve via optional arbitrary-precision arithmetic (`solve_exact`, strict/rounded f64 conversions)
- βœ… No runtime dependencies by default (optional features may add deps)
- βœ… Stack storage only (no heap allocation in core types)
- βœ… `unsafe` forbidden

See [CHANGELOG.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/CHANGELOG.md)
for release history and
[docs/roadmap.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/docs/roadmap.md)
for current release planning.

## 🚫 Anti-goals

- Bare-metal performance: see [`blas-src`]https://crates.io/crates/blas-src, [`lapack-src`]https://crates.io/crates/lapack-src, [`openblas-src`]https://crates.io/crates/openblas-src
- Comprehensive: use [`nalgebra`]https://crates.io/crates/nalgebra if you need a full-featured library
- Large matrices/dimensions with parallelism: use [`faer`]https://crates.io/crates/faer if you need this
- Alternate floating-point scalar families: `la-stack` supports `f64` and optional exact arithmetic, not `f32` / `f16` APIs

## βœ… Use this crate when

- Your matrices and vectors have small, fixed dimensions known at compile time
- Stack allocation and `Copy` value semantics fit your data flow
- You want explicit LU / LDLT / determinant APIs rather than a broad algebra toolkit
- You need exact determinants, exact determinant signs, or exact linear solves
  for fixed-size systems
- Robust predicates matter for geometry-style workloads near degeneracy
- You prefer a default build with no runtime dependencies

## πŸ”’ Scalar types

The scalar model is intentionally limited to `f64` for floating-point work and
exact rationals behind the optional `"exact"` feature. This matches the crate's
focus on small, robustness-sensitive numerical and computational geometry
workloads. When `f64` precision is insufficient (e.g. near-degenerate geometric
configurations), the optional `"exact"` feature provides arbitrary-precision
arithmetic via `BigRational` (see below).

Lower-precision `f32` / `f16` throughput-oriented workloads are outside the
crate's scope; they usually indicate large-matrix or accelerator-oriented use
cases better served by broader linear-algebra libraries.

## πŸš€ Quickstart

Add this to your `Cargo.toml`:

```toml
[dependencies]
la-stack = "0.4.3"
```

### Feature flags

- `default`: no runtime dependencies
- `exact`: `BigRational` exact determinant and solve APIs
- `bench`: Criterion, nalgebra, and faer for internal benchmarks

Solve a 5Γ—5 system via LU:

```rust
use la_stack::prelude::*;

fn main() -> Result<(), LaError> {
    // This system requires pivoting (a[0][0] = 0), so it's a good LU demo.
    // A = J - I: zeros on diagonal, ones elsewhere.
    let a = Matrix::<5>::try_from_rows([
        [0.0, 1.0, 1.0, 1.0, 1.0],
        [1.0, 0.0, 1.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0, 0.0],
    ])?;

    let b = Vector::<5>::try_new([14.0, 13.0, 12.0, 11.0, 10.0])?;

    let lu = a.lu(DEFAULT_SINGULAR_TOL)?;
    let x = lu.solve(b)?.into_array();

    // Floating-point rounding is expected; compare with a tolerance.
    let expected = [1.0, 2.0, 3.0, 4.0, 5.0];
    for (x_i, e_i) in x.iter().zip(expected.iter()) {
        assert!((*x_i - *e_i).abs() <= 1e-12);
    }

    Ok(())
}
```

Compute a determinant for a symmetric SPD matrix via LDLT (no pivoting).

For symmetric positive-definite matrices, `LDL^T` is essentially a square-root-free form of the Cholesky decomposition
(you can recover a Cholesky factor by absorbing `sqrt(D)` into `L`):

```rust
use la_stack::prelude::*;

fn main() -> Result<(), LaError> {
    // This matrix is symmetric positive-definite (A = L*L^T) so LDLT works without pivoting.
    let a = Matrix::<5>::try_from_rows([
        [1.0, 1.0, 0.0, 0.0, 0.0],
        [1.0, 2.0, 1.0, 0.0, 0.0],
        [0.0, 1.0, 2.0, 1.0, 0.0],
        [0.0, 0.0, 1.0, 2.0, 1.0],
        [0.0, 0.0, 0.0, 1.0, 2.0],
    ])?;

    let ldlt = match a.ldlt(DEFAULT_SINGULAR_TOL) {
        Ok(ldlt) => ldlt,
        Err(err @ LaError::Asymmetric { row, col, .. }) => {
            eprintln!("LDLT requires symmetry; first mismatch at ({row}, {col})");
            return Err(err);
        }
        Err(err) => return Err(err),
    };

    let det = ldlt.det()?;
    assert!((det - 1.0).abs() <= 1e-12);

    Ok(())
}
```

> ⚠️ **LDLT invariant:** The input matrix must be **symmetric**. Asymmetric
> inputs passed to
> [`Matrix::ldlt`]https://docs.rs/la-stack/latest/la_stack/struct.Matrix.html#method.ldlt
> return a typed `LaError::Asymmetric` before factorization starts. Use
> [`Matrix::first_asymmetry`]https://docs.rs/la-stack/latest/la_stack/struct.Matrix.html#method.first_asymmetry
> to locate the offending pair, or fall back to `lu()` if your matrices may not
> be symmetric at all. Symmetric inputs with a negative LDLT diagonal return
> `LaError::NotPositiveSemidefinite`; zero or too-small non-negative diagonals
> return `LaError::Singular`.

## ⚑ Compile-time determinants (D ≀ 4)

`det_direct()` is a `const fn` providing closed-form determinants for D=0–4,
using fused multiply-add where applicable. `Matrix::<0>::zero().det_direct()`
returns `Ok(Some(1.0))` (the empty-product convention). For D=1–4, cofactor
expansion bypasses LU factorization entirely. This enables compile-time
evaluation when inputs are known:

```rust
use la_stack::prelude::*;

// Evaluated entirely at compile time β€” no runtime cost.
const DET: Result<Option<f64>, LaError> = match Matrix::<4>::try_from_rows([
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 7.0],
]) {
    Ok(matrix) => matrix.det_direct(),
    Err(err) => Err(err),
};

fn main() -> Result<(), LaError> {
    assert_eq!(DET?, Some(210.0));
    Ok(())
}
```

The public `det()` method automatically dispatches through the closed-form path
for D ≀ 4 and falls back to LU for D β‰₯ 5. Finite inputs return a floating-point
determinant estimate in every dimension; `det()` does not surface
`LaError::Singular`. Tiny nonzero determinants are not flattened by a pivot
tolerance. Use `lu()` directly when you need tolerance-aware singularity
detection or the pivot-column diagnostic from the factorization, and use the
exact determinant APIs when exact singularity classification matters.

## πŸ”¬ Exact arithmetic (`"exact"` feature)

The default build has **zero runtime dependencies**. Enable the optional
`exact` Cargo feature to add exact arithmetic methods using arbitrary-precision
rationals (this pulls in `num-bigint`, `num-rational`, and `num-traits` for
`BigRational`):

```toml
[dependencies]
la-stack = { version = "0.4.3", features = ["exact"] }
```

**Determinants:**

- **`det_exact()`** β€” returns the exact determinant as a `BigRational`
- **`det_exact_f64()`** β€” returns the exact determinant as `f64` only when
  it is exactly representable (or `LaError::Unrepresentable` otherwise)
- **`det_exact_rounded_f64()`** β€” returns the exact determinant rounded to a
  finite `f64`
- **`det_sign_exact()`** β€” returns the provably correct sign (βˆ’1, 0, or +1)

**Linear system solve:**

- **`solve_exact(b)`** β€” solves `Ax = b` exactly, returning `[BigRational; D]`
- **`solve_exact_f64(b)`** β€” solves `Ax = b` exactly, returning `Vector<D>` only when
  every component is exactly representable as `f64`
- **`solve_exact_rounded_f64(b)`** β€” solves `Ax = b` exactly, returning each
  component rounded to finite `f64`

```rust,ignore
use la_stack::prelude::*;

fn main() -> Result<(), LaError> {
    // Exact determinant
    let m = Matrix::<3>::try_from_rows([
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0],
    ])?;
    assert_eq!(m.det_sign_exact()?, 0); // exactly singular

    let det = m.det_exact()?;
    assert_eq!(det, BigRational::from_integer(0.into())); // exact zero
    let det_f64 = m.det_exact_f64()?;
    assert_eq!(det_f64, 0.0);

    // If strict exact-to-f64 conversion would require rounding, opt in
    // explicitly with the rounded API.
    let inexact = Matrix::<2>::try_from_rows([
        [1.0 + f64::EPSILON, 0.0],
        [0.0, 1.0 - f64::EPSILON],
    ])?;
    let rounded_det = match inexact.det_exact_f64() {
        Ok(det) => det,
        Err(err) if err.requires_rounding() => inexact.det_exact_rounded_f64()?,
        Err(err) => return Err(err),
    };
    assert_eq!(rounded_det.to_bits(), 1.0f64.to_bits());

    // If the exact determinant cannot fit in f64, keep the BigRational value.
    let big = f64::MAX / 2.0;
    let huge = Matrix::<3>::try_from_rows([
        [0.0, 0.0, 1.0],
        [big, 0.0, 1.0],
        [0.0, big, 1.0],
    ])?;
    let huge_det = huge.det_exact()?;
    assert_eq!(
        huge.det_exact_f64()
            .err()
            .and_then(|err| err.unrepresentable_reason()),
        Some(UnrepresentableReason::NotFinite)
    );
    println!("exact determinant = {huge_det}");

    // Exact linear system solve
    let a = Matrix::<2>::try_from_rows([[1.0, 2.0], [3.0, 4.0]])?;
    let b = Vector::<2>::try_new([5.0, 11.0])?;
    let x = a.solve_exact_f64(b)?.into_array();
    assert!((x[0] - 1.0).abs() <= f64::EPSILON);
    assert!((x[1] - 2.0).abs() <= f64::EPSILON);

    Ok(())
}
```

With the `exact` feature enabled, `BigInt` and `BigRational` are re-exported
from the crate root and prelude, alongside the most commonly needed
`num-traits` items (`FromPrimitive`, `ToPrimitive`, `Signed`). This lets
consumers construct exact values (`BigRational::from_f64`, `from_i64`), query
sign (`is_positive` / `is_negative`), and convert back to `f64` (`to_f64`)
with a single `use la_stack::prelude::*;` β€” no need to add `num-bigint`,
`num-rational`, or `num-traits` to their own `Cargo.toml`.

For `det_sign_exact()`, D ≀ 4 matrices use a fast f64 filter (error-bounded
`det_direct()`) that resolves the sign without allocating. Only near-degenerate
or large (D β‰₯ 5) matrices fall through to the exact Bareiss algorithm.

### Adaptive precision with `det_errbound()`

`det_errbound()` returns the conservative absolute error bound used by the fast
filter. This method does NOT require the `exact` feature β€” it uses pure f64
arithmetic and is available by default. This enables building custom
adaptive-precision logic for geometric predicates:

```rust,ignore
use la_stack::prelude::*;

fn main() -> Result<(), LaError> {
    let m = Matrix::<3>::identity();
    if let Some(bound) = m.det_errbound()? {
        if let Some(det) = m.det_direct()? {
            if det.abs() > bound {
                // f64 sign is guaranteed correct
                let sign = det.signum() as i8;
            } else {
                // Fall back to exact arithmetic (requires `exact` feature)
                let sign = m.det_sign_exact()?;
            }
        }
    } else {
        // D β‰₯ 5: no fast filter, use exact directly (requires `exact` feature)
        let sign = m.det_sign_exact()?;
    }

    Ok(())
}
```

The error coefficients (`ERR_COEFF_2`, `ERR_COEFF_3`, `ERR_COEFF_4`) are the
dimension-specific constants behind that bound. In plain terms, they answer:
"how many machine-epsilon-sized rounding mistakes can this closed-form
determinant formula accumulate?" To get an absolute error bound, `det_errbound()`
multiplies the coefficient by a size measure of the matrix entries, the
**absolute Leibniz sum**:

```text
p(|A|) = sum over determinant terms of product of absolute values
```

For a 2Γ—2 matrix `[[a, b], [c, d]]`, that scale is `|a*d| + |b*c|`, so:

```text
|det_direct(A) - det_exact(A)| <= ERR_COEFF_2 * (|a*d| + |b*c|)
```

The coefficients are not tolerances and are not meant to be tuned by callers;
they are conservative constants derived from the fixed D ≀ 4 formulas and their
floating-point rounding chains. They are exposed for advanced users who want to
compose the same bound themselves.

## 🧩 API at a glance

| Type | Storage | Purpose | Key methods |
|---|---|---|---|
| `Vector<D>` | `[f64; D]` | Finite fixed-length vector for input and computation | `try_new`, `zero`, `dot`, `norm2_sq` |
| `Matrix<D>` | `[[f64; D]; D]` | Finite square matrix for input and computation | See below |
| `Lu<D>` | `Matrix<D>` + pivot array | Factorization for solves/det | `solve`, `det` |
| `Ldlt<D>` | `Matrix<D>` | Factorization for symmetric SPD/PSD solves/det | `solve`, `det` |

Storage shown above reflects the intentional `f64` scalar model.

`Matrix<D>` key methods: `lu`, `ldlt`, `det`, `det_direct`, `det_errbound`,
`det_exact`ΒΉ, `det_exact_f64`ΒΉ, `det_exact_rounded_f64`ΒΉ, `det_sign_exact`ΒΉ,
`solve_exact`ΒΉ, `solve_exact_f64`ΒΉ, `solve_exact_rounded_f64`ΒΉ.
Matrix and vector constructors validate non-finite inputs at public API
boundaries. After construction, `Matrix<D>` and `Vector<D>` carry that
finite-storage invariant directly, so kernels do not revalidate stored entries.

ΒΉ Requires `features = ["exact"]`.

## πŸ“Š Benchmarks (vs nalgebra/faer)

![LU solve (factor + solve): median time vs dimension](https://raw.githubusercontent.com/acgetchell/la-stack/v0.4.3/docs/assets/bench/vs_linalg_lu_solve_median.svg)

Raw data:
[docs/assets/bench/vs_linalg_lu_solve_median.csv](https://github.com/acgetchell/la-stack/blob/v0.4.3/docs/assets/bench/vs_linalg_lu_solve_median.csv)

Representative benchmark: `lu_solve` factors the matrix and solves one
right-hand side. Median time is lower-is-better, and the β€œla-stack vs
nalgebra/faer” columns show the % time reduction relative to each baseline
(positive = la-stack faster). This is not an aggregate score across all
operations.

For the full per-kernel comparison methodology, input construction, and
release-comparison workflow details, see
[docs/BENCHMARKING.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/docs/BENCHMARKING.md).
For the current release-to-release performance snapshot, see
[docs/PERFORMANCE.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/docs/PERFORMANCE.md).

<!-- BENCH_TABLE:lu_solve:median:new:BEGIN -->

| D | la-stack median (ns) | nalgebra median (ns) | faer median (ns) | la-stack vs nalgebra | la-stack vs faer |
|---:|--------------------:|--------------------:|----------------:|---------------------:|----------------:|
| 2 | 2.044 | 4.542 | 143.958 | +55.0% | +98.6% |
| 3 | 9.596 | 23.599 | 185.466 | +59.3% | +94.8% |
| 4 | 23.338 | 50.717 | 210.976 | +54.0% | +88.9% |
| 5 | 45.368 | 69.065 | 277.564 | +34.3% | +83.7% |
| 8 | 127.861 | 164.412 | 364.864 | +22.2% | +65.0% |
| 16 | 631.997 | 663.822 | 882.674 | +4.8% | +28.4% |
| 32 | 2,745.604 | 2,424.540 | 2,867.431 | -13.2% | +4.2% |
| 64 | 17,543.034 | 14,747.731 | 12,266.271 | -19.0% | -43.0% |

<!-- BENCH_TABLE:lu_solve:median:new:END -->

## πŸ“‹ Examples

The `examples/` directory contains small, runnable programs:

- **`solve_5x5`** β€” solve a 5Γ—5 system via LU with partial pivoting
- **`det_5x5`** β€” determinant of a 5Γ—5 matrix via LU
- **`ldlt_solve_3x3`** β€” solve a 3Γ—3 symmetric positive definite system via LDLT
- **`const_det_4x4`** β€” compile-time 4Γ—4 determinant via `det_direct()`
- **`exact_det_3x3`** β€” exact determinant value of a near-singular 3Γ—3 matrix (requires `exact` feature)
- **`exact_sign_3x3`** β€” exact determinant sign of a near-singular 3Γ—3 matrix (requires `exact` feature)
- **`exact_solve_3x3`** β€” exact solve of a near-singular 3Γ—3 system vs f64 LU (requires `exact` feature)

```bash
just examples
# or individually:
cargo run --example solve_5x5
cargo run --example det_5x5
cargo run --example ldlt_solve_3x3
cargo run --example const_det_4x4
cargo run --features exact --example exact_det_3x3
cargo run --features exact --example exact_sign_3x3
cargo run --features exact --example exact_solve_3x3
```

## 🀝 Contributing

A short contributor workflow:

```bash
cargo install just
just setup        # install/verify dev tools + sync Python deps
just check        # lint/validate (non-mutating)
just fix          # apply auto-fixes (mutating)
just ci           # lint + tests + examples + bench compile
```

The repository uses Rust-native tooling for documentation and config checks:
`rumdl` for Markdown, `dprint` with `pretty_yaml` for YAML, `taplo` for TOML,
and `typos` for spelling. GitHub Actions references are SHA-pinned, restricted
to an explicit allowlist, and kept with readable version comments for review.

CI runs `just ci` on Ubuntu, macOS, and Windows to keep platform coverage
aligned with the local comprehensive validation path.

For coverage commands and report locations, see
[`docs/COVERAGE.md`](https://github.com/acgetchell/la-stack/blob/v0.4.3/docs/COVERAGE.md).
For the full contributor workflow, see
[CONTRIBUTING.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/CONTRIBUTING.md).

## πŸ“ Citation

If you use this library in academic work, please cite it using
[CITATION.cff](https://github.com/acgetchell/la-stack/blob/v0.4.3/CITATION.cff)
(or GitHub's "Cite this repository" feature). Tagged releases are archived on
Zenodo.

## πŸ“š References

For canonical references to the algorithms used by this crate, see
[REFERENCES.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/REFERENCES.md).

## πŸ€– AI Agents

AI coding assistants should read
[AGENTS.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/AGENTS.md)
before proposing or applying changes. See
[CONTRIBUTING.md](https://github.com/acgetchell/la-stack/blob/v0.4.3/CONTRIBUTING.md)
for the repository's AI-assisted development note.

## πŸ“„ License

BSD 3-Clause License. See [LICENSE](./LICENSE).