FERAL

Feral is a pure-Rust sparse symmetric indefinite linear solver with certified inertia counts for use in interior-point optimization algorithms.
The name is a pun. Fe is iron's chemical symbol. Iron rusts. Rust is the language this is written in. A feral animal is one that was domesticated but now lives in the wild.
Status
Feral is production-ready for its target use: sparse symmetric-indefinite factorization with certified inertia counts for interior-point optimization. It is published on crates.io and PyPI, runs end-to-end on the full ~183k-matrix KKT corpus with no n-size filter (matrices up to n ≈ 5×10⁵, ~14k above n = 500 and ~10k above n = 1000; residuals at machine precision on the well-conditioned majority), and passes 705 tests, 0 failed (lib + integration, 20 ignored).
Inertia correctness is validated against a multi-source consensus oracle —
feral + canonical Fortran MUMPS 5.8.2 + SPRAL/SSIDS — via
external_benchmarks/consensus/compute_consensus.py, which votes across the
three solvers to classify each matrix as Definitive, Borderline, Numerically
Intractable, or Excluded. On Definitive matrices inertia must be exactly
correct, with no tolerance.
It remains 0.x: minor, additive API changes may land before 1.0, but the core factor / solve / inertia surface is stable.
Capabilities
- MC64 symmetric scaling (
ScalingStrategy::Mc64Symmetric) with anAutostrategy that engages MC64 only when its predicates fire. - LDLᵀ-aware ordering (Duff–Pralet symmetric matching + quotient-graph
compression, a port of MUMPS
ICNTL(12) = 2) with anAutodefault that resolves toLdltCompressonly on arrow-KKT-shaped inputs. - SSIDS-style delayed pivoting in the sparse path (rejected pivots carried forward to the parent front), plus a rook-rescue fallback for pivots rejected by the column-relative threshold test.
- SSIDS-style column renumbering (
AmalgamationStrategy::Renumber, default-on) — cuts factor time 30–67% on IPM-KKT tail matrices (ACOPR30 / CRESC100 / LAKES / NELSON / SWOPF) at ~10% cost on the small-CUTEst-Hessian median. - Unsymmetric LU basis engine (
src/lu/) for general squareA = P L U Qsolves with product-form updates, alongside the symmetric LDLᵀ path.
Reference-solver positioning
Per dev/research/reference-solver-comparison.md:
on the archetype tail slice, FERAL matches or beats SPRAL/SSIDS on every
matrix where both ran (BATCH 0.14×, HAHN1 0.13×, HAIFAM_0082 0.47×,
ACOPR30_0067 1.11×, CRESC100 1.22×, VESUVIO 1.41×) — and SSIDS links vendor
BLAS while FERAL does not. Versus canonical MUMPS, FERAL matches on most
matrices (BATCH 0.84×, HAIFAM_0000 1.33×) and trails by 5–8× on a
tiny-IPM-KKT class where SSIDS itself trails MUMPS by 4–8×; that gap is
acknowledged and tracked (see "Known limitations" below).
CI runs the same pre-commit hook set used locally, so local and CI cannot
drift. Design rationale and development history are in
dev/decisions.md and the
Phase 1 retrospective.
What's in the box
- Dense Bunch-Kaufman kernel (
src/dense/factor.rs,src/dense/solve.rs): scalar unblocked LDL^T with the classical(1+√17)/8pivot threshold, Knight-Ruiz infinity-norm equilibration, and iterative refinement via a best-iterate strategy. - Sparse multifrontal solver (
src/symbolic/,src/numeric/): CHOLMOD-style analysis pipeline (AMD → postorder → column counts → supernode amalgamation with SSIDS nemin merge rule) feeding a postorder multifrontal factorization that wraps the dense BK kernel. - Unsymmetric LU basis engine (
src/lu/): a general (non-symmetric) squareA = P L U Qfactorization withftran/btransolves and simplex-style product-form column updates, for basis-matrix workloads that the symmetric LDLᵀ path does not cover. It auto-routes between a dense (DenseLu) and a sparse (SparseLu) engine viashould_use_dense_lu. Re-exported at the crate root and surfaced in the Python bindings asLuFactor. - External benchmark oracles (
external_benchmarks/): native Fortran MUMPS 5.8.2 and SPRAL/SSIDS drivers that run on the same KKT corpus and produce per-matrix sidecar JSONs. The consensus framework (external_benchmarks/consensus/compute_consensus.py) votes across feral + MUMPS + SSIDS to classify each matrix as Definitive, Borderline, Numerically Intractable, or Excluded.
None of the external Fortran oracles are linked into the Rust crate.
cargo build works on a machine with nothing but a Rust toolchain.
Architecture constraints
These are hard rules, recorded in dev/decisions.md:
- Pure Rust on the stable toolchain.
- Zero non-Rust dependencies in the core solver. No BLAS, no LAPACK, no
Fortran. The Fortran MUMPS and SSIDS trees in
external_benchmarks/are test infrastructure only, built manually and never linked fromcargo. - MIT license.
- Primary implementation from published papers and BSD-licensed
references. Canonical references are cited in
references.bib. - Inertia must be exactly correct — no tolerance on inertia counts for Definitive matrices in the consensus framework.
- Correctness before performance, always.
Building
Pre-commit hooks for cargo fmt and cargo clippy are wired up via
.pre-commit-config.yaml. Install once per clone:
CI runs the exact same hooks via pre-commit/action@v3.0.1 so local
and CI cannot drift.
Using the solver
use ;
// Dense path
let mat = zeros;
// ... populate lower triangle with mat.set(i, j, v) ...
let = factor?;
let x = solve_refined?;
// Sparse path
use ;
use ;
use ;
let csc = from_triplets?;
let sym = symbolic_factorize?;
let num_params = default;
let = factorize_multifrontal?;
let sp_x = solve_sparse_refined?;
Both refined solvers use a best-iterate iterative refinement strategy:
on rank-deficient matrices where ZeroPivotAction::ForceAccept produced
a wrong A⁻¹, the refinement guarantees the returned x is no worse
than the unrefined solve, even when individual refinement steps would
have amplified the error.
Python bindings
The feral-solver package on PyPI provides Python bindings built with
maturin + pyo3. Wheels
are published for CPython 3.10+ on Linux x86_64/aarch64, macOS
universal2, and Windows x86_64 — no Rust toolchain required for users.
Quickstart:
=
=
, =
assert ==
=
For interior-point KKT solves, feral.ipm.KktSolver wraps
feral.Solver with the Wächter–Biegler 2006 §3.1 perturbation-
escalation loop; symbolic analysis is cached across an entire Newton
run. feral.from_scipy(...) / feral.to_scipy(...) round-trip with
scipy.sparse matrices.
As of 0.11.0 the bindings also expose (all additive — existing code is unaffected):
- The unsymmetric LU basis engine as
feral.LuFactor—ftran(A x = b) /btran(Aᵀ y = c), product-formupdates, and theP A Q = L Ufactors, auto-routing dense/sparse like the Rust core. - Factor access —
Solver.factors()returns the assembled unit- lowerL(CSC, optionally asscipy.sparse) and block-diagonalD, plusperm/scalingfor theL D Lᵀ = P (S A S) Pᵀreconstruction. - Symbolic analysis —
feral.analyze(A, ordering=...)(no numeric factorization) andSolver.symbolic()return the resolved ordering, permutation, elimination tree, supernode count, and nnz estimate. - Introspection + tuning knobs — new
Solver(...)keyword args (ordering,profiling,mc64_cache, …), pivot-magnitude and MC64 counters,last_factor_stats(),profile_report(), andscaling_info.
See python/README.md for the full API and IPM
usage, and python/examples/notebooks/ (notebooks 01–05, with
05 walking through the LU engine, factor access, and introspection)
plus python/examples/ for an end-to-end Newton step.
Build the bindings from source:
Running the KKT benchmark
# Core bench (dense + sparse, against rmumps sidecars)
# Emit per-matrix .feral.json sidecars for the consensus framework
FERAL_EMIT_SIDECARS=1
The bench reads data/matrices/kkt/<problem>/<id>.mtx + <id>.json
and reports inertia agreement and residual pass counts along with a
family-grouped failure analysis and dense ∩ sparse cross-comparison.
The KKT matrices are not committed — generate them via ripopt's
collect_kkt tool.
Diagnostic binaries
The bench binary lives in the root feral package. The ~140
throwaway investigation probes (diag_*, probe_*, bench_*, …)
live in a separate non-default workspace crate, feral-diagnostics,
so they are excluded from the default cargo build / cargo test /
cargo clippy. Run one with -p:
Using FERAL inside Ipopt
FERAL ships with everything needed to build Ipopt
3.14 with linear_solver=feral
as a selectable option. The vendored Ipopt source lives in
ref/Ipopt/; the integration glue (a LinearSolverInterface
shim against FERAL's C ABI, plus the autotools patch) lives in
feral-ipopt-shim/.
Quickstart
# One-shot: builds libferal.a, patches Ipopt, configures, builds,
# and links Ipopt against the static FERAL archive.
# Smoke test: run the bundled hs071 sample NLP with linear_solver=feral.
Under the hood make ipopt delegates to feral-ipopt-shim/Makefile,
which:
cargo build --releaseto producetarget/release/libferal.a- Copies
IpFeralSolverInterface.{hpp,cpp}andferal_capi.hintoref/Ipopt/src/Algorithm/LinearSolvers/and appliespatches/ipopt-feral.patch - Configures Ipopt with
--disable-shared --enable-static --without-hsl --without-spral --without-pardiso --without-asl make -jinref/Ipopt/build-feral/
To rebuild after editing FERAL source: cargo build --release && make -C ref/Ipopt/build-feral (the relink picks up the fresh .a).
Runtime env knobs
FERAL exposes its tuning options through environment variables that
the C ABI reads on feral_new():
| variable | default | effect |
|---|---|---|
FERAL_CASCADE_BREAK |
off | on arms the static-pivot cascade-break perturbation unconditionally |
FERAL_AUTO_CB_BETA |
0.05 |
warm cascade-break auto-arm threshold (fraction of n); 0 disables |
FERAL_SCALING |
auto | auto |
FERAL_PIVTOL |
1e-8 |
Bunch-Kaufman partial-pivot threshold |
FERAL_PARALLEL |
off | on enables the rayon-based parallel multifrontal driver |
FERAL_FACTOR_TRACE |
off | on streams per-factor wall + delayed-pivot counts to stderr |
FERAL_MC64_TRACE |
off | on streams per-call MC64 wall to stderr |
The defaults are the ones validated in the v0.4.0 Mittelmann sweep
(see CHANGELOG.md).
Mittelmann NLP benchmark
external_benchmarks/mittelmann_ipopt/ runs Ipopt with both MA57 and
FERAL on the 47-problem Mittelmann NLP
panel. The harness, the
aggregator, and the per-problem rescue dictionary are committed; the
.nl problem files are not (~1.5 GiB total, single file up to
~290 MiB).
To reproduce the benchmark:
- Fetch the AMPL
.nlfiles from Mittelmann's public archive (https://plato.asu.edu/ftp/ampl-nlp.html — the problem list is inexternal_benchmarks/mittelmann_ipopt/run.py::PROBLEMS). - Edit
NL_DIRandPROBLEMSat the top ofrun.pyto point at your local checkout. - Build an Ipopt binary that has both MA57 (HSL) and FERAL linked
in. The shim Makefile above produces a FERAL-only Ipopt; for the
dual-solver comparison binary you also need a licensed HSL source
tree and an Ipopt configure step that links
libcoinhsl. python run.py --solvers feral,ma57 --timeout 600 && python aggregate.pyproducesREPORT.md(gitignored, regenerates fromresults/{ma57,feral}.jsonl).
See external_benchmarks/mittelmann_ipopt/README.md for the per-
problem rescue table and finer-grained invocation modes.
Running the multi-oracle consensus
Requires gfortran, OpenBLAS, METIS, and the ref/mumps and
ref/spral source trees.
# Build the Fortran oracles (one-time)
# Run them over the corpus (writes .mumps.json and .ssids.json sidecars)
# Emit feral sidecars
FERAL_EMIT_SIDECARS=1
# Compute verdicts
See dev/plans/phase-1b-consensus-exit.md
for the architecture of the consensus framework and
dev/phase1-retrospective.org for the
story of how and why it was built.
Known limitations
- Three-way oracle disagreement (ACOPP30_0005). Feral, MUMPS, and
SSIDS each report a different inertia on this borderline KKT
matrix; it is
#[ignore]'d intests/parity.rsand excluded by the consensus framework. The other historically-failing rank- deficient panel matrices (FBRAIN3LS_0839, ACOPP14×2, ACOPP30×2, CERI651CLS) all now pass the oracle-consensus gate per the CLAUDE.md correctness contract. FBRAIN3LS_0839 closed 2026-05-17 via the F-01 sign-fallback fix (#39; seedev/research/f01-rankdef-underreporting.md2026-05-17 addendum). - Tiny residual gap on a few panel matrices. CERI651CLS_0487 and
three SSI matrices (
SSI_1685,SSI_2412,SSI_2597) produce feral residuals 1.6×–1600× larger than MUMPS — all still tiny in absolute terms (~1e-8 to ~1e-13) but outside the K=10 residual gate. Inertia is correct in every case. These are#[ignore]'d intests/parity.rs. - Tiny-IPM-KKT factor-time gap vs MUMPS. On a class of small KKT
matrices (BATCH, HAHN1, ACOPR30 at n ≈ 100–600), canonical MUMPS is
5–8× faster than FERAL — a gap SPRAL/SSIDS also pays. The proper
investigation is queued (measure MUMPS amalgamation / front-size
distribution; bucket FERAL
factor_usby front size; compare front counts) but not currently scheduled. Seedev/research/reference-solver-comparison.md. - Dense kernel has no delayed pivoting. The sparse multifrontal
path implements SSIDS-style delayed pivoting on non-root supernodes
(
may_delay = true,n_delayed_in/outplumbed throughfactorize_multifrontal). The standalone densefactorentry point still falls back toZeroPivotAction::ForceAccepton rank-deficient blocks; the sparse-path root supernode runs undermay_delay = falsefor the same reason.
References
The bibliography in references.bib is cited throughout the retrospective
and code. Canonical references:
- Bunch & Kaufman 1977 (BK pivoting)
- Duff & Reid 1983 (multifrontal method)
- Amestoy, Davis & Duff 1996 (approximate minimum degree)
- George & Liu 1981 (elimination trees)
- Davis 2006 (CHOLMOD, direct methods textbook)
- Hogg, Reid & Scott 2010 (SSIDS)
- Wächter & Biegler 2006 (IPOPT interior-point method)
License
FERAL is MIT-licensed. See LICENSE.
Third-party attributions for code incorporated from other open-source
projects (notably the SPRAL Hungarian/MC64 scaling routines used in
src/scaling/) are recorded in NOTICE, with the full
upstream license texts in LICENSE-THIRD-PARTY.