feral 0.7.0

Sparse symmetric indefinite direct solver in pure Rust, with certified inertia counts.
Documentation

FERAL

FERAL banner

CI DOI

Feral is a pure-Rust sparse symmetric indefinite linear solver with certified inertia counts for use in interior-point optimization algorithms.

The name is a pun. Fe is iron's chemical symbol. Iron rusts. Rust is the language this is written in. A feral animal is one that was domesticated but now lives in the wild.

Status

Feral is research-grade, pre-1.0. Phase 1 closed via a multi-source consensus oracle (feral + canonical Fortran MUMPS 5.8.2 + SPRAL/SSIDS); Phase 2 has shipped scaling, ordering, pivoting, and amalgamation work, and is ongoing. The corpus-wide sidecar migration to MUMPS+SSIDS consensus inertia (consensus_mumps_ssids_feralsparse_2026-04-25, see dev/decisions.md) replaced the prior rmumps-only ground truth and is the basis for current validation.

The sparse multifrontal path runs end-to-end on the full ~183k-matrix KKT corpus with no n-size filter (matrices range up to n ≈ 5×10⁵, with ~14k above n = 500 and ~10k above n = 1000; residuals are at machine precision on the well-conditioned majority). Major Phase 2 capabilities now in main:

  • MC64 symmetric scaling (ScalingStrategy::Mc64Symmetric) with an Auto strategy that picks MC64 only when its predicates fire.
  • LDLᵀ-aware ordering (Duff–Pralet symmetric matching + quotient- graph compression, port of MUMPS ICNTL(12) = 2) with an Auto default that resolves to LdltCompress only on arrow-KKT-shaped inputs.
  • SSIDS-style delayed pivoting in the sparse path (may_delay = true on non-root supernodes; rejected pivots are carried forward to the parent front).
  • Rook-rescue fallback for pivots rejected by the column-relative threshold test, splicing into try_reject_1x1_frontal before delaying.
  • SSIDS-style column renumbering (AmalgamationStrategy::Renumber, default-on) — cuts factor time 30–67% on IPM-KKT tail matrices (ACOPR30 / CRESC100 / LAKES / NELSON / SWOPF) at ~10% cost on the small-CUTEst-Hessian median.

Reference-solver positioning (per dev/research/reference-solver-comparison.md, which supersedes the earlier "10× vs MUMPS" framing): on the archetype tail slice, FERAL matches or beats SPRAL/SSIDS on every matrix where both ran (BATCH 0.14×, HAHN1 0.13×, HAIFAM_0082 0.47×, ACOPR30_0067 1.11×, CRESC100 1.22×, VESUVIO 1.41×) — and SSIDS links vendor BLAS while FERAL does not. Versus canonical MUMPS, FERAL matches on most matrices (BATCH 0.84×, HAIFAM_0000 1.33×) and trails by 5–8× on a tiny-IPM-KKT class where SSIDS itself trails MUMPS by 4–8×; that gap is acknowledged and deferred, with the proper investigation queued in the reference-solver note.

The full test suite is 534 tests passing, 0 failed (lib + integration, 19 ignored); CI runs the same pre-commit hook set used locally so local and CI cannot drift.

The Phase 2 plan lives in dev/plans/phase-2-planning.md; phase-by-phase decisions are in dev/decisions.md; the Phase 1 story is in the Phase 1 retrospective.

What's in the box

  • Dense Bunch-Kaufman kernel (src/dense/factor.rs, src/dense/solve.rs): scalar unblocked LDL^T with the classical (1+√17)/8 pivot threshold, Knight-Ruiz infinity-norm equilibration, and iterative refinement via a best-iterate strategy.
  • Sparse multifrontal solver (src/symbolic/, src/numeric/): CHOLMOD-style analysis pipeline (AMD → postorder → column counts → supernode amalgamation with SSIDS nemin merge rule) feeding a postorder multifrontal factorization that wraps the dense BK kernel.
  • External benchmark oracles (external_benchmarks/): native Fortran MUMPS 5.8.2 and SPRAL/SSIDS drivers that run on the same KKT corpus and produce per-matrix sidecar JSONs. The consensus framework (external_benchmarks/consensus/compute_consensus.py) votes across feral + MUMPS + SSIDS to classify each matrix as Definitive, Borderline, Numerically Intractable, or Excluded.

None of the external Fortran oracles are linked into the Rust crate. cargo build works on a machine with nothing but a Rust toolchain.

Architecture constraints

These are hard rules, recorded in dev/decisions.md:

  1. Pure Rust on the stable toolchain.
  2. Zero non-Rust dependencies in the core solver. No BLAS, no LAPACK, no Fortran. The Fortran MUMPS and SSIDS trees in external_benchmarks/ are test infrastructure only, built manually and never linked from cargo.
  3. MIT license.
  4. Primary implementation from published papers and BSD-licensed references. Canonical references are cited in references.bib.
  5. Inertia must be exactly correct — no tolerance on inertia counts for Definitive matrices in the consensus framework.
  6. Correctness before performance, always.

Building

cargo build --release
cargo test
cargo clippy -- -D warnings

Pre-commit hooks for cargo fmt and cargo clippy are wired up via .pre-commit-config.yaml. Install once per clone:

pre-commit install

CI runs the exact same hooks via pre-commit/action@v3.0.1 so local and CI cannot drift.

Using the solver

use feral::{factor, solve_refined, BunchKaufmanParams, SymmetricMatrix};

// Dense path
let mat = SymmetricMatrix::zeros(3);
// ... populate lower triangle with mat.set(i, j, v) ...
let (factors, inertia) = factor(&mat, &BunchKaufmanParams::default())?;
let x = solve_refined(&mat, &factors, &rhs)?;

// Sparse path
use feral::{CscMatrix, NumericParams};
use feral::symbolic::{symbolic_factorize, SupernodeParams};
use feral::numeric::{factorize::factorize_multifrontal, solve::solve_sparse_refined};

let csc = CscMatrix::from_triplets(n, &rows, &cols, &vals)?;
let sym = symbolic_factorize(&csc, &SupernodeParams::default())?;
let num_params = NumericParams::default();
let (sp_factors, sp_inertia) = factorize_multifrontal(&csc, &sym, &num_params)?;
let sp_x = solve_sparse_refined(&csc, &sp_factors, &rhs)?;

Both refined solvers use a best-iterate iterative refinement strategy: on rank-deficient matrices where ZeroPivotAction::ForceAccept produced a wrong A⁻¹, the refinement guarantees the returned x is no worse than the unrefined solve, even when individual refinement steps would have amplified the error.

Python bindings

The feral-solver package on PyPI provides Python bindings built with maturin + pyo3. Wheels are published for CPython 3.10+ on Linux x86_64/aarch64, macOS universal2, and Windows x86_64 — no Rust toolchain required for users.

pip install feral-solver           # plain
pip install 'feral-solver[scipy]'  # scipy.sparse adapters
pip install 'feral-solver[jax]'    # JAX interop

Quickstart:

import numpy as np
import feral

A = feral.CscMatrix.from_dense(np.array([
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]))

solver = feral.Solver()
status, inertia = solver.factor(A)
assert status == feral.FactorStatus.SUCCESS
x = solver.solve(np.array([1.0, 2.0, 3.0]))

For interior-point KKT solves, feral.ipm.KktSolver wraps feral.Solver with the Wächter–Biegler 2006 §3.1 perturbation- escalation loop; symbolic analysis is cached across an entire Newton run. feral.from_scipy(...) / feral.to_scipy(...) round-trip with scipy.sparse matrices. See python/README.md for the full API and IPM usage, and python/examples/ for an end-to- end Newton step.

Build the bindings from source:

cd python
pip install maturin
maturin develop --release
pytest tests/

Running the KKT benchmark

# Core bench (dense + sparse, against rmumps sidecars)
cargo run --release --bin bench

# Emit per-matrix .feral.json sidecars for the consensus framework
FERAL_EMIT_SIDECARS=1 cargo run --release --bin bench

The bench reads data/matrices/kkt/<problem>/<id>.mtx + <id>.json and reports inertia agreement and residual pass counts along with a family-grouped failure analysis and dense ∩ sparse cross-comparison. The KKT matrices are not committed — generate them via ripopt's collect_kkt tool.

Using FERAL inside Ipopt

FERAL ships with everything needed to build Ipopt 3.14 with linear_solver=feral as a selectable option. The vendored Ipopt source lives in ref/Ipopt/; the integration glue (a LinearSolverInterface shim against FERAL's C ABI, plus the autotools patch) lives in feral-ipopt-shim/.

Quickstart

# One-shot: builds libferal.a, patches Ipopt, configures, builds,
# and links Ipopt against the static FERAL archive.
make ipopt

# Smoke test: run the bundled hs071 sample NLP with linear_solver=feral.
make hs071

Under the hood make ipopt delegates to feral-ipopt-shim/Makefile, which:

  1. cargo build --release to produce target/release/libferal.a
  2. Copies IpFeralSolverInterface.{hpp,cpp} and feral_capi.h into ref/Ipopt/src/Algorithm/LinearSolvers/ and applies patches/ipopt-feral.patch
  3. Configures Ipopt with --disable-shared --enable-static --without-hsl --without-spral --without-pardiso --without-asl
  4. make -j in ref/Ipopt/build-feral/

To rebuild after editing FERAL source: cargo build --release && make -C ref/Ipopt/build-feral (the relink picks up the fresh .a).

Runtime env knobs

FERAL exposes its tuning options through environment variables that the C ABI reads on feral_new():

variable default effect
FERAL_CASCADE_BREAK off on arms the static-pivot cascade-break perturbation unconditionally
FERAL_AUTO_CB_BETA 0.05 warm cascade-break auto-arm threshold (fraction of n); 0 disables
FERAL_SCALING auto auto
FERAL_PIVTOL 1e-8 Bunch-Kaufman partial-pivot threshold
FERAL_PARALLEL off on enables the rayon-based parallel multifrontal driver
FERAL_FACTOR_TRACE off on streams per-factor wall + delayed-pivot counts to stderr
FERAL_MC64_TRACE off on streams per-call MC64 wall to stderr

The defaults are the ones validated in the v0.4.0 Mittelmann sweep (see CHANGELOG.md).

Mittelmann NLP benchmark

external_benchmarks/mittelmann_ipopt/ runs Ipopt with both MA57 and FERAL on the 47-problem Mittelmann NLP panel. The harness, the aggregator, and the per-problem rescue dictionary are committed; the .nl problem files are not (~1.5 GiB total, single file up to ~290 MiB).

To reproduce the benchmark:

  1. Fetch the AMPL .nl files from Mittelmann's public archive (https://plato.asu.edu/ftp/ampl-nlp.html — the problem list is in external_benchmarks/mittelmann_ipopt/run.py::PROBLEMS).
  2. Edit NL_DIR and PROBLEMS at the top of run.py to point at your local checkout.
  3. Build an Ipopt binary that has both MA57 (HSL) and FERAL linked in. The shim Makefile above produces a FERAL-only Ipopt; for the dual-solver comparison binary you also need a licensed HSL source tree and an Ipopt configure step that links libcoinhsl.
  4. python run.py --solvers feral,ma57 --timeout 600 && python aggregate.py produces REPORT.md (gitignored, regenerates from results/{ma57,feral}.jsonl).

See external_benchmarks/mittelmann_ipopt/README.md for the per- problem rescue table and finer-grained invocation modes.

Running the multi-oracle consensus

Requires gfortran, OpenBLAS, METIS, and the ref/mumps and ref/spral source trees.

# Build the Fortran oracles (one-time)
make -C external_benchmarks/mumps_oracle all
make -C external_benchmarks/ssids_oracle all

# Run them over the corpus (writes .mumps.json and .ssids.json sidecars)
python3 external_benchmarks/mumps_oracle/run_mumps.py data/matrices/kkt --skip-existing
python3 external_benchmarks/ssids_oracle/run_ssids.py data/matrices/kkt --skip-existing

# Emit feral sidecars
FERAL_EMIT_SIDECARS=1 cargo run --release --bin bench

# Compute verdicts
python3 external_benchmarks/consensus/compute_consensus.py data/matrices/kkt

See dev/plans/phase-1b-consensus-exit.md for the architecture of the consensus framework and dev/phase1-retrospective.org for the story of how and why it was built.

Known limitations

  • Three-way oracle disagreement (ACOPP30_0005). Feral, MUMPS, and SSIDS each report a different inertia on this borderline KKT matrix; it is #[ignore]'d in tests/parity.rs and excluded by the consensus framework. The other historically-failing rank- deficient panel matrices (FBRAIN3LS_0839, ACOPP14×2, ACOPP30×2, CERI651CLS) all now pass the oracle-consensus gate per the CLAUDE.md correctness contract. FBRAIN3LS_0839 closed 2026-05-17 via the F-01 sign-fallback fix (#39; see dev/research/f01-rankdef-underreporting.md 2026-05-17 addendum).
  • Tiny residual gap on a few panel matrices. CERI651CLS_0487 and three SSI matrices (SSI_1685, SSI_2412, SSI_2597) produce feral residuals 1.6×–1600× larger than MUMPS — all still tiny in absolute terms (~1e-8 to ~1e-13) but outside the K=10 residual gate. Inertia is correct in every case. These are #[ignore]'d in tests/parity.rs.
  • Tiny-IPM-KKT factor-time gap vs MUMPS. On a class of small KKT matrices (BATCH, HAHN1, ACOPR30 at n ≈ 100–600), canonical MUMPS is 5–8× faster than FERAL — a gap SPRAL/SSIDS also pays. The proper investigation is queued (measure MUMPS amalgamation / front-size distribution; bucket FERAL factor_us by front size; compare front counts) but not currently scheduled. See dev/research/reference-solver-comparison.md.
  • Dense kernel has no delayed pivoting. The sparse multifrontal path implements SSIDS-style delayed pivoting on non-root supernodes (may_delay = true, n_delayed_in/out plumbed through factorize_multifrontal). The standalone dense factor entry point still falls back to ZeroPivotAction::ForceAccept on rank-deficient blocks; the sparse-path root supernode runs under may_delay = false for the same reason.

References

The bibliography in references.bib is cited throughout the retrospective and code. Canonical references:

  • Bunch & Kaufman 1977 (BK pivoting)
  • Duff & Reid 1983 (multifrontal method)
  • Amestoy, Davis & Duff 1996 (approximate minimum degree)
  • George & Liu 1981 (elimination trees)
  • Davis 2006 (CHOLMOD, direct methods textbook)
  • Hogg, Reid & Scott 2010 (SSIDS)
  • Wächter & Biegler 2006 (IPOPT interior-point method)

License

FERAL is MIT-licensed. See LICENSE.

Third-party attributions for code incorporated from other open-source projects (notably the SPRAL Hungarian/MC64 scaling routines used in src/scaling/) are recorded in NOTICE, with the full upstream license texts in LICENSE-THIRD-PARTY.