feral 0.11.1

Sparse symmetric indefinite direct solver in pure Rust, with certified inertia counts.
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
# FERAL

![FERAL banner](https://raw.githubusercontent.com/jkitchin/feral/main/assets/feral-banner.png)

[![CI](https://github.com/jkitchin/feral/actions/workflows/ci.yml/badge.svg)](https://github.com/jkitchin/feral/actions/workflows/ci.yml)
[![crates.io](https://img.shields.io/crates/v/feral.svg)](https://crates.io/crates/feral)
[![PyPI](https://img.shields.io/pypi/v/feral-solver.svg)](https://pypi.org/project/feral-solver/)
[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20162687-blue.svg)](https://doi.org/10.5281/zenodo.20162687)

Feral is a pure-Rust sparse symmetric indefinite linear solver with certified inertia counts for use in interior-point optimization algorithms.

The name is a pun. **Fe** is iron's chemical symbol. Iron rusts. Rust is the language this is written in. A *feral* animal is one that was domesticated but now lives in the wild. 

## Status

Feral is production-ready for its target use: sparse symmetric-indefinite
factorization with **certified inertia counts** for interior-point
optimization. It is published on
[crates.io](https://crates.io/crates/feral) and
[PyPI](https://pypi.org/project/feral-solver/), runs end-to-end on the
**full ~183k-matrix KKT corpus** with no n-size filter (matrices up to
n ≈ 5×10⁵, ~14k above n = 500 and ~10k above n = 1000; residuals at
machine precision on the well-conditioned majority), and passes
**705 tests, 0 failed** (lib + integration, 20 ignored).

Inertia correctness is validated against a multi-source consensus oracle —
feral + canonical Fortran **MUMPS 5.8.2** + **SPRAL/SSIDS** — via
`external_benchmarks/consensus/compute_consensus.py`, which votes across the
three solvers to classify each matrix as Definitive, Borderline, Numerically
Intractable, or Excluded. On Definitive matrices inertia must be exactly
correct, with no tolerance.

It remains 0.x: minor, additive API changes may land before 1.0, but the
core factor / solve / inertia surface is stable.

### Capabilities

- **MC64 symmetric scaling** (`ScalingStrategy::Mc64Symmetric`) with an
  `Auto` strategy that engages MC64 only when its predicates fire.
- **LDLᵀ-aware ordering** (Duff–Pralet symmetric matching + quotient-graph
  compression, a port of MUMPS `ICNTL(12) = 2`) with an `Auto` default that
  resolves to `LdltCompress` only on arrow-KKT-shaped inputs.
- **SSIDS-style delayed pivoting** in the sparse path (rejected pivots
  carried forward to the parent front), plus a **rook-rescue fallback** for
  pivots rejected by the column-relative threshold test.
- **SSIDS-style column renumbering** (`AmalgamationStrategy::Renumber`,
  default-on) — cuts factor time 30–67% on IPM-KKT tail matrices
  (ACOPR30 / CRESC100 / LAKES / NELSON / SWOPF) at ~10% cost on the
  small-CUTEst-Hessian median.
- **Unsymmetric LU basis engine** (`src/lu/`) for general square
  `A = P L U Q` solves with product-form updates, alongside the symmetric
  LDLᵀ path.

### Reference-solver positioning

Per [`dev/research/reference-solver-comparison.md`](dev/research/reference-solver-comparison.md):
on the archetype tail slice, FERAL matches or beats SPRAL/SSIDS on every
matrix where both ran (BATCH 0.14×, HAHN1 0.13×, HAIFAM_0082 0.47×,
ACOPR30_0067 1.11×, CRESC100 1.22×, VESUVIO 1.41×) — and SSIDS links vendor
BLAS while FERAL does not. Versus canonical MUMPS, FERAL matches on most
matrices (BATCH 0.84×, HAIFAM_0000 1.33×) and trails by 5–8× on a
tiny-IPM-KKT class where SSIDS itself trails MUMPS by 4–8×; that gap is
acknowledged and tracked (see "Known limitations" below).

CI runs the same `pre-commit` hook set used locally, so local and CI cannot
drift. Design rationale and development history are in
[`dev/decisions.md`](dev/decisions.md) and the
[Phase 1 retrospective](dev/phase1-retrospective.org).

## What's in the box

- **Dense Bunch-Kaufman kernel** (`src/dense/factor.rs`,
  `src/dense/solve.rs`): scalar unblocked LDL^T with the classical
  `(1+√17)/8` pivot threshold, Knight-Ruiz infinity-norm equilibration,
  and iterative refinement via a best-iterate strategy.
- **Sparse multifrontal solver** (`src/symbolic/`, `src/numeric/`):
  CHOLMOD-style analysis pipeline (AMD → postorder → column counts →
  supernode amalgamation with SSIDS nemin merge rule) feeding a
  postorder multifrontal factorization that wraps the dense BK kernel.
- **Unsymmetric LU basis engine** (`src/lu/`): a general (non-symmetric)
  square `A = P L U Q` factorization with `ftran`/`btran` solves and
  simplex-style product-form column updates, for basis-matrix workloads
  that the symmetric LDLᵀ path does not cover. It auto-routes between a
  dense (`DenseLu`) and a sparse (`SparseLu`) engine via
  `should_use_dense_lu`. Re-exported at the crate root and surfaced in
  the Python bindings as `LuFactor`.
- **External benchmark oracles** (`external_benchmarks/`): native
  Fortran MUMPS 5.8.2 and SPRAL/SSIDS drivers that run on the same KKT
  corpus and produce per-matrix sidecar JSONs. The consensus framework
  (`external_benchmarks/consensus/compute_consensus.py`) votes across
  feral + MUMPS + SSIDS to classify each matrix as Definitive,
  Borderline, Numerically Intractable, or Excluded.

None of the external Fortran oracles are linked into the Rust crate.
`cargo build` works on a machine with nothing but a Rust toolchain.

## Architecture constraints

These are hard rules, recorded in `dev/decisions.md`:

1. Pure Rust on the stable toolchain.
2. Zero non-Rust dependencies in the core solver. No BLAS, no LAPACK, no
   Fortran. The Fortran MUMPS and SSIDS trees in `external_benchmarks/`
   are test infrastructure only, built manually and never linked from
   `cargo`.
3. MIT license.
4. Primary implementation from published papers and BSD-licensed
   references. Canonical references are cited in `references.bib`.
5. Inertia must be exactly correct — no tolerance on inertia counts for
   Definitive matrices in the consensus framework.
6. Correctness before performance, always.

## Building

```sh
cargo build --release
cargo test
cargo clippy -- -D warnings
```

Pre-commit hooks for `cargo fmt` and `cargo clippy` are wired up via
`.pre-commit-config.yaml`. Install once per clone:

```sh
pre-commit install
```

CI runs the exact same hooks via `pre-commit/action@v3.0.1` so local
and CI cannot drift.

## Using the solver

```rust
use feral::{factor, solve_refined, BunchKaufmanParams, SymmetricMatrix};

// Dense path
let mat = SymmetricMatrix::zeros(3);
// ... populate lower triangle with mat.set(i, j, v) ...
let (factors, inertia) = factor(&mat, &BunchKaufmanParams::default())?;
let x = solve_refined(&mat, &factors, &rhs)?;

// Sparse path
use feral::{CscMatrix, NumericParams};
use feral::symbolic::{symbolic_factorize, SupernodeParams};
use feral::numeric::{factorize::factorize_multifrontal, solve::solve_sparse_refined};

let csc = CscMatrix::from_triplets(n, &rows, &cols, &vals)?;
let sym = symbolic_factorize(&csc, &SupernodeParams::default())?;
let num_params = NumericParams::default();
let (sp_factors, sp_inertia) = factorize_multifrontal(&csc, &sym, &num_params)?;
let sp_x = solve_sparse_refined(&csc, &sp_factors, &rhs)?;
```

Both refined solvers use a *best-iterate* iterative refinement strategy:
on rank-deficient matrices where `ZeroPivotAction::ForceAccept` produced
a wrong `A⁻¹`, the refinement guarantees the returned `x` is no worse
than the unrefined solve, even when individual refinement steps would
have amplified the error.

## Python bindings

The `feral-solver` package on PyPI provides Python bindings built with
[maturin](https://www.maturin.rs/) + [pyo3](https://pyo3.rs/). Wheels
are published for CPython 3.10+ on Linux x86_64/aarch64, macOS
universal2, and Windows x86_64 — no Rust toolchain required for users.

```bash
pip install feral-solver           # plain
pip install 'feral-solver[scipy]'  # scipy.sparse adapters
pip install 'feral-solver[jax]'    # JAX interop
```

Quickstart:

```python
import numpy as np
import feral

A = feral.CscMatrix.from_dense(np.array([
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]))

solver = feral.Solver()
status, inertia = solver.factor(A)
assert status == feral.FactorStatus.SUCCESS
x = solver.solve(np.array([1.0, 2.0, 3.0]))
```

For interior-point KKT solves, `feral.ipm.KktSolver` wraps
`feral.Solver` with the Wächter–Biegler 2006 §3.1 perturbation-
escalation loop; symbolic analysis is cached across an entire Newton
run. `feral.from_scipy(...)` / `feral.to_scipy(...)` round-trip with
`scipy.sparse` matrices.

As of 0.11.0 the bindings also expose (all additive — existing code is
unaffected):

- **The unsymmetric LU basis engine** as `feral.LuFactor``ftran`
  (`A x = b`) / `btran` (`Aᵀ y = c`), product-form `update`s, and the
  `P A Q = L U` factors, auto-routing dense/sparse like the Rust core.
- **Factor access**`Solver.factors()` returns the assembled unit-
  lower `L` (CSC, optionally as `scipy.sparse`) and block-diagonal `D`,
  plus `perm`/`scaling` for the `L D Lᵀ = P (S A S) Pᵀ` reconstruction.
- **Symbolic analysis**`feral.analyze(A, ordering=...)` (no numeric
  factorization) and `Solver.symbolic()` return the resolved ordering,
  permutation, elimination tree, supernode count, and nnz estimate.
- **Introspection + tuning knobs** — new `Solver(...)` keyword args
  (`ordering`, `profiling`, `mc64_cache`, …), pivot-magnitude and MC64
  counters, `last_factor_stats()`, `profile_report()`, and
  `scaling_info`.

See [`python/README.md`](python/README.md) for the full API and IPM
usage, and `python/examples/notebooks/` (notebooks `01`–`05`, with
`05` walking through the LU engine, factor access, and introspection)
plus `python/examples/` for an end-to-end Newton step.

Build the bindings from source:

```bash
cd python
pip install maturin
maturin develop --release
pytest tests/
```

## Running the KKT benchmark

```sh
# Core bench (dense + sparse, against rmumps sidecars)
cargo run --release --bin bench

# Emit per-matrix .feral.json sidecars for the consensus framework
FERAL_EMIT_SIDECARS=1 cargo run --release --bin bench
```

The bench reads `data/matrices/kkt/<problem>/<id>.mtx` + `<id>.json`
and reports inertia agreement and residual pass counts along with a
family-grouped failure analysis and dense ∩ sparse cross-comparison.
The KKT matrices are not committed — generate them via ripopt's
`collect_kkt` tool.

### Diagnostic binaries

The `bench` binary lives in the root `feral` package. The ~140
throwaway investigation probes (`diag_*`, `probe_*`, `bench_*`, …)
live in a separate non-default workspace crate, `feral-diagnostics`,
so they are excluded from the default `cargo build` / `cargo test` /
`cargo clippy`. Run one with `-p`:

```sh
cargo run -p feral-diagnostics --bin <name> [-- args...]
cargo build -p feral-diagnostics            # compile them all
```

## Using FERAL inside Ipopt

FERAL ships with everything needed to build [Ipopt
3.14](https://github.com/coin-or/Ipopt) with `linear_solver=feral`
as a selectable option. The vendored Ipopt source lives in
`ref/Ipopt/`; the integration glue (a `LinearSolverInterface`
shim against FERAL's C ABI, plus the autotools patch) lives in
`feral-ipopt-shim/`.

### Quickstart

```sh
# One-shot: builds libferal.a, patches Ipopt, configures, builds,
# and links Ipopt against the static FERAL archive.
make ipopt

# Smoke test: run the bundled hs071 sample NLP with linear_solver=feral.
make hs071
```

Under the hood `make ipopt` delegates to `feral-ipopt-shim/Makefile`,
which:

1. `cargo build --release` to produce `target/release/libferal.a`
2. Copies `IpFeralSolverInterface.{hpp,cpp}` and `feral_capi.h` into
   `ref/Ipopt/src/Algorithm/LinearSolvers/` and applies
   `patches/ipopt-feral.patch`
3. Configures Ipopt with `--disable-shared --enable-static
   --without-hsl --without-spral --without-pardiso --without-asl`
4. `make -j` in `ref/Ipopt/build-feral/`

To rebuild after editing FERAL source: `cargo build --release && make
-C ref/Ipopt/build-feral` (the relink picks up the fresh `.a`).

### Runtime env knobs

FERAL exposes its tuning options through environment variables that
the C ABI reads on `feral_new()`:

| variable              | default | effect                                                                |
|-----------------------|---------|-----------------------------------------------------------------------|
| `FERAL_CASCADE_BREAK` | off     | `on` arms the static-pivot cascade-break perturbation unconditionally |
| `FERAL_AUTO_CB_BETA`  | `0.05`  | warm cascade-break auto-arm threshold (fraction of `n`); `0` disables |
| `FERAL_SCALING`       | auto    | `auto` | `infnorm` | `mc64` | `identity`                              |
| `FERAL_PIVTOL`        | `1e-8`  | Bunch-Kaufman partial-pivot threshold                                 |
| `FERAL_PARALLEL`      | off     | `on` enables the rayon-based parallel multifrontal driver             |
| `FERAL_FACTOR_TRACE`  | off     | `on` streams per-factor wall + delayed-pivot counts to stderr         |
| `FERAL_MC64_TRACE`    | off     | `on` streams per-call MC64 wall to stderr                             |

The defaults are the ones validated in the v0.4.0 Mittelmann sweep
(see `CHANGELOG.md`).

### Mittelmann NLP benchmark

`external_benchmarks/mittelmann_ipopt/` runs Ipopt with both MA57 and
FERAL on the 47-problem [Mittelmann NLP
panel](https://plato.asu.edu/ftp/ampl-nlp.html). The harness, the
aggregator, and the per-problem rescue dictionary are committed; the
`.nl` problem files are **not** (~1.5 GiB total, single file up to
~290 MiB).

To reproduce the benchmark:

1. Fetch the AMPL `.nl` files from Mittelmann's public archive
   (https://plato.asu.edu/ftp/ampl-nlp.html — the problem list is in
   `external_benchmarks/mittelmann_ipopt/run.py::PROBLEMS`).
2. Edit `NL_DIR` and `PROBLEMS` at the top of `run.py` to point at
   your local checkout.
3. Build an Ipopt binary that has both MA57 (HSL) and FERAL linked
   in. The shim Makefile above produces a FERAL-only Ipopt; for the
   dual-solver comparison binary you also need a licensed HSL source
   tree and an Ipopt configure step that links `libcoinhsl`.
4. `python run.py --solvers feral,ma57 --timeout 600 && python
   aggregate.py` produces `REPORT.md` (gitignored, regenerates from
   `results/{ma57,feral}.jsonl`).

See `external_benchmarks/mittelmann_ipopt/README.md` for the per-
problem rescue table and finer-grained invocation modes.

## Running the multi-oracle consensus

Requires `gfortran`, `OpenBLAS`, `METIS`, and the `ref/mumps` and
`ref/spral` source trees.

```sh
# Build the Fortran oracles (one-time)
make -C external_benchmarks/mumps_oracle all
make -C external_benchmarks/ssids_oracle all

# Run them over the corpus (writes .mumps.json and .ssids.json sidecars)
python3 external_benchmarks/mumps_oracle/run_mumps.py data/matrices/kkt --skip-existing
python3 external_benchmarks/ssids_oracle/run_ssids.py data/matrices/kkt --skip-existing

# Emit feral sidecars
FERAL_EMIT_SIDECARS=1 cargo run --release --bin bench

# Compute verdicts
python3 external_benchmarks/consensus/compute_consensus.py data/matrices/kkt
```

See [`dev/plans/phase-1b-consensus-exit.md`](dev/plans/phase-1b-consensus-exit.md)
for the architecture of the consensus framework and
[`dev/phase1-retrospective.org`](dev/phase1-retrospective.org) for the
story of how and why it was built.

## Known limitations

- **Three-way oracle disagreement (ACOPP30_0005).** Feral, MUMPS, and
  SSIDS each report a different inertia on this borderline KKT
  matrix; it is `#[ignore]`'d in `tests/parity.rs` and excluded by
  the consensus framework. The other historically-failing rank-
  deficient panel matrices (FBRAIN3LS_0839, ACOPP14×2, ACOPP30×2,
  CERI651CLS) all now pass the oracle-consensus gate per the
  CLAUDE.md correctness contract. FBRAIN3LS_0839 closed 2026-05-17
  via the F-01 sign-fallback fix
  ([#39]https://github.com/jkitchin/feral/issues/39; see
  [`dev/research/f01-rankdef-underreporting.md`]dev/research/f01-rankdef-underreporting.md
  2026-05-17 addendum).
- **Tiny residual gap on a few panel matrices.** CERI651CLS_0487 and
  three SSI matrices (`SSI_1685`, `SSI_2412`, `SSI_2597`) produce
  feral residuals 1.6×–1600× larger than MUMPS — all still tiny in
  absolute terms (~1e-8 to ~1e-13) but outside the K=10 residual
  gate. Inertia is correct in every case. These are `#[ignore]`'d
  in `tests/parity.rs`.
- **Tiny-IPM-KKT factor-time gap vs MUMPS.** On a class of small KKT
  matrices (BATCH, HAHN1, ACOPR30 at n ≈ 100–600), canonical MUMPS is
  5–8× faster than FERAL — a gap SPRAL/SSIDS also pays. The proper
  investigation is queued (measure MUMPS amalgamation / front-size
  distribution; bucket FERAL `factor_us` by front size; compare front
  counts) but not currently scheduled. See
  [`dev/research/reference-solver-comparison.md`]dev/research/reference-solver-comparison.md.
- **Dense kernel has no delayed pivoting.** The sparse multifrontal
  path implements SSIDS-style delayed pivoting on non-root supernodes
  (`may_delay = true`, `n_delayed_in/out` plumbed through
  `factorize_multifrontal`). The standalone dense `factor` entry
  point still falls back to `ZeroPivotAction::ForceAccept` on
  rank-deficient blocks; the sparse-path root supernode runs under
  `may_delay = false` for the same reason.

## References

The bibliography in `references.bib` is cited throughout the retrospective
and code. Canonical references:

- Bunch & Kaufman 1977 (BK pivoting)
- Duff & Reid 1983 (multifrontal method)
- Amestoy, Davis & Duff 1996 (approximate minimum degree)
- George & Liu 1981 (elimination trees)
- Davis 2006 (CHOLMOD, direct methods textbook)
- Hogg, Reid & Scott 2010 (SSIDS)
- Wächter & Biegler 2006 (IPOPT interior-point method)

## License

FERAL is MIT-licensed. See [`LICENSE`](LICENSE).

Third-party attributions for code incorporated from other open-source
projects (notably the SPRAL Hungarian/MC64 scaling routines used in
`src/scaling/`) are recorded in [`NOTICE`](NOTICE), with the full
upstream license texts in [`LICENSE-THIRD-PARTY`](LICENSE-THIRD-PARTY).