heuropt

A practical Rust toolkit for heuristic optimization. Single-objective. Multi-objective. Many-objective. 35 algorithms. One small set of traits. Bit-identical seeded determinism. No trait objects, no GATs, no generic-RNG plumbing in the public API.

If you can write a Problem impl and read RandomSearch, you can write your own optimizer. That's the whole pitch.

  • 📖 Read the user guide for tutorials, cookbook recipes, comparisons with pymoo / hyperopt / MOEA Framework, and the stability policy.
  • 🔧 The API reference on docs.rs has runnable Rust examples for every algorithm.
  • 🧪 Tested with 316+ unit / integration / property tests plus 8 cargo-fuzz targets running on every PR.
  • ⚡ Hot paths heavily optimized — the comparison harness is 3.27× faster as of v0.4.0, all bit-identical to the reference output.

Installation

[dependencies]
heuropt = "0.5"

# Optional features:
# - "serde":     derive Serialize/Deserialize on the core data types.
# - "parallel":  evaluate populations across rayon's thread pool.
#                Seeded runs stay bit-identical to serial mode.
# heuropt = { version = "0.5", features = ["serde", "parallel"] }

Define a problem

use heuropt::prelude::*;

struct SchafferN1;

impl Problem for SchafferN1 {
    type Decision = Vec<f64>;

    fn objectives(&self) -> ObjectiveSpace {
        ObjectiveSpace::new(vec![
            Objective::minimize("f1"),
            Objective::minimize("f2"),
        ])
    }

    fn evaluate(&self, x: &Vec<f64>) -> Evaluation {
        let v = x[0];
        Evaluation::new(vec![v * v, (v - 2.0).powi(2)])
    }
}

Run NSGA-II

use heuropt::prelude::*;

# struct SchafferN1;
# impl Problem for SchafferN1 {
#     type Decision = Vec<f64>;
#     fn objectives(&self) -> ObjectiveSpace {
#         ObjectiveSpace::new(vec![Objective::minimize("f1"), Objective::minimize("f2")])
#     }
#     fn evaluate(&self, x: &Vec<f64>) -> Evaluation {
#         Evaluation::new(vec![x[0] * x[0], (x[0] - 2.0).powi(2)])
#     }
# }
let initializer = RealBounds::new(vec![(-5.0, 5.0)]);
let variation = GaussianMutation { sigma: 0.2 };
let config = Nsga2Config { population_size: 60, generations: 80, seed: 42 };
let mut optimizer = Nsga2::new(config, initializer, variation);
let result = optimizer.run(&SchafferN1);

println!("Pareto front size: {}", result.pareto_front.len());

See examples/toy_nsga2.rs for the full version.

Implement a custom optimizer

A new optimizer is just an implementation of Optimizer<P>:

use heuropt::prelude::*;

struct MyOptimizer { /* state */ }

impl<P> Optimizer<P> for MyOptimizer
where
    P: Problem<Decision = Vec<f64>>,
{
    fn run(&mut self, problem: &P) -> OptimizationResult<P::Decision> {
        // Generate candidates.
        // Evaluate them with `problem.evaluate(...)`.
        // Keep the best, or maintain a Pareto archive.
        // Return an OptimizationResult.
        # OptimizationResult::new(
        #     Population::new(Vec::new()),
        #     Vec::new(),
        #     None,
        #     0,
        #     0,
        # )
    }
}

A complete worked example is in examples/custom_optimizer.rs.

Choosing an algorithm

Optimization is a noisy field with a lot of jargon. This section walks you through picking a starting algorithm for a real problem, defining the terms as they come up. If you already know the vocabulary, jump to the quick-reference table at the bottom.

Step 1: What is your problem?

Three ingredients describe any optimization problem:

  • A decision — the thing the algorithm is allowed to change. Examples: five real numbers (Vec<f64>), a yes/no flag for each of 100 features (Vec<bool>), or an ordering of cities to visit (Vec<usize>).
  • One or more objectives — numbers you want to make small (or large). Examples: a model's prediction error, a tour's total length, a circuit's power draw.
  • An optional set of constraints — conditions a decision must satisfy to be valid. Examples: "the budget cannot exceed $1M," or "every city must be visited exactly once."

Your job is to express the problem; heuropt's job is to search for decisions that score well on the objectives without violating the constraints.

Step 2: How many objectives?

The biggest fork in the road. Algorithms specialize sharply by objective count:

  • Single-objective (1) — one number to optimize. There's a clear "best" answer. Examples: minimize loss, maximize throughput.
  • Multi-objective (2 or 3) — several conflicting goals. There is no single best; instead there is a Pareto front: the set of decisions where you cannot improve any objective without sacrificing another. Each point on the front is a different tradeoff.
  • Many-objective (4+) — same idea, but classical multi-objective algorithms break down because almost every pair of points is non-dominated (neither one is strictly better) once you have lots of objectives.

Dominance: Decision A dominates decision B if A is at least as good as B on every objective and strictly better on at least one. The Pareto front is what you get after deleting every dominated decision.
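
In code, the dominance rule for minimization looks like this (a textbook sketch, not the crate's pareto_compare):

```rust
/// True if `a` dominates `b`: no worse on every objective and strictly
/// better on at least one. Both slices are minimization objectives of
/// equal length. (Illustrative sketch, not heuropt's implementation.)
fn dominates(a: &[f64], b: &[f64]) -> bool {
    let mut strictly_better = false;
    for (&ai, &bi) in a.iter().zip(b.iter()) {
        if ai > bi {
            return false; // worse on some objective: cannot dominate
        }
        if ai < bi {
            strictly_better = true;
        }
    }
    strictly_better
}
```

Note that two equal points do not dominate each other: the "strictly better on at least one" clause is what keeps dominance antisymmetric.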

If you find yourself staring at a single composite score that's a weighted sum of conflicting goals, you probably have a multi-objective problem in disguise.
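
For illustration (the objective names and weights below are invented):

```rust
// A single composite score with hand-picked weights...
fn composite(error: f64, latency_ms: f64) -> f64 {
    0.7 * error + 0.3 * latency_ms
}
// ...bakes one tradeoff into the problem up front. Modeling error and
// latency as two objectives instead lets a multi-objective algorithm
// return the whole tradeoff front, and the choice of weights can be
// made afterwards, with the front in hand.
```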

Step 3: What does the search space look like?

A few questions about the geometry of your problem:

  • Is the decision continuous (real numbers), discrete (integers, bits), or a permutation (an ordering)?
  • Is the landscape unimodal (one hill, easy to climb) or multimodal (lots of local optima that aren't the global one)? Rastrigin and Ackley are classic multimodal traps.
  • How smooth is it? Smooth landscapes (e.g., a quadratic bowl) reward gradient-like methods (CMA-ES); jagged or noisy ones reward population-based methods (DE, GA).

If you don't know, treat it as multimodal — it's the cautious default.
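
Rastrigin, for instance, is only a few lines (the standard formulation, written here as a standalone function rather than code from the crate):

```rust
use std::f64::consts::PI;

/// Rastrigin function: global minimum of 0.0 at the origin, surrounded
/// by a regular grid of local optima that trap greedy local search.
fn rastrigin(x: &[f64]) -> f64 {
    10.0 * x.len() as f64
        + x.iter()
            .map(|&xi| xi * xi - 10.0 * (2.0 * PI * xi).cos())
            .sum::<f64>()
}
```

The cosine term is what makes it hard: the quadratic bowl is easy, but every integer lattice point adds a local dip.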

Step 4: How expensive is each evaluation?

Cheap evaluations (a few microseconds — pure math, simple simulation) let you afford 100k+ evaluations per run. Expensive evaluations (a training run, a CFD simulation, a real-world measurement that costs money) force you to be sample-efficient: 50–500 evaluations total.

This decides whether you can afford a population-based algorithm that throws hundreds of evaluations at each generation, or whether you need a sample-efficient or multi-fidelity approach:

  • Cheap (1k+ evals affordable): any of the population-based algorithms — DE, GA, CMA-ES, NSGA-II, etc.
  • Expensive (50–500 evals): BayesianOpt (Gaussian-process surrogate + Expected Improvement) or Tpe (Parzen-density surrogate, cheaper per step, more robust without hyperparameter tuning).
  • Multi-fidelity (each eval has a tunable budget — epochs, sim steps, MC samples): Hyperband. Implement the PartialProblem trait on your problem and Hyperband allocates compute aggressively across promising configs.

The parallel feature flag also matters here — if your evaluate function takes more than ~50 µs, enabling rayon-backed parallel population evaluation will speed runs up significantly.

Step 5: Are there hard constraints?

heuropt models constraints as a single scalar constraint violation on each Evaluation. The convention: 0.0 (or negative) means feasible; positive means infeasible, and bigger numbers are worse violations. Every Pareto-comparison and tournament-selection helper in the crate prefers feasible candidates and breaks ties on violation magnitude, so the rule "feasibility comes first" is enforced automatically.
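
That rule amounts to a comparison like the following standalone sketch (illustrative only; it is not the crate's tournament helper):

```rust
use std::cmp::Ordering;

/// Feasibility-first ordering for single-objective minimization.
/// `Ordering::Less` means the first candidate is preferred.
/// Violations follow the convention above: <= 0.0 means feasible.
fn feasibility_first(fit_a: f64, viol_a: f64, fit_b: f64, viol_b: f64) -> Ordering {
    match (viol_a <= 0.0, viol_b <= 0.0) {
        (true, false) => Ordering::Less,    // feasible beats infeasible
        (false, true) => Ordering::Greater,
        // both infeasible: smaller violation wins
        (false, false) => viol_a.partial_cmp(&viol_b).unwrap(),
        // both feasible: fall back to fitness
        (true, true) => fit_a.partial_cmp(&fit_b).unwrap(),
    }
}
```

A feasible candidate wins regardless of fitness; fitness only breaks ties among the feasible.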

If your constraints are very tight and the search keeps hitting them, you have three options:

  • Repair: implement the Repair<D> trait (or use the provided ClampToBounds / ProjectToSimplex impls) to project infeasible decisions back into the feasible region in place. Pair one with a Variation operator to get bounds-aware variants without writing a custom Variation impl.
  • Stochastic ranking: use stochastic_ranking_select instead of tournament_select_single_objective. It probabilistically explores near-feasibility instead of enforcing strict feasibility-first ordering, which helps when feasible regions are narrow.
  • Penalty-only: stick with constraint_violation — the simplest option; it works well when the feasible region is large and convex.
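
For intuition, a clamp-style repair is just a coordinate-wise projection (a sketch of the idea; the crate's actual Repair<D> API may differ):

```rust
/// Project each coordinate of `x` back into its (lo, hi) bounds,
/// in place. This is the essence of a ClampToBounds-style repair.
fn clamp_to_bounds(x: &mut [f64], bounds: &[(f64, f64)]) {
    for (xi, &(lo, hi)) in x.iter_mut().zip(bounds.iter()) {
        *xi = xi.clamp(lo, hi);
    }
}
```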

The decision tree

A flow you can run mentally:

START
 │
 ├─ Is each evaluation EXPENSIVE (>1 sec) or BUDGETED (50–500 total)?
 │   │
 │   ├─ Yes → sample-efficient regime
 │   │    ├─ Standard expensive black-box, single-objective
 │   │    │     → BayesianOpt   (GP + Expected Improvement; gold
 │   │    │                      standard *with* per-problem kernel
 │   │    │                      tuning. The default RBF kernel at
 │   │    │                      60 evals is honestly bad — give it
 │   │    │                      more evals or tune the kernel.)
 │   │    │     → Tpe           (KDE-based; cheaper per-step,
 │   │    │                      more robust without tuning)
 │   │    │
 │   │    └─ Each eval has a tunable fidelity (epochs, sim steps, …)
 │   │          → Hyperband     (implement PartialProblem; allocates
 │   │                           compute across configs adaptively)
 │   │
 │   └─ No → continue to the population-based branches below
 │
 └─ How many objectives?
     │
     ├─ 1 (single-objective)
     │    │
     │    ├─ Decision is Vec<f64> (continuous)
     │    │   ├─ Smooth landscape (well-conditioned)
     │    │   │     → CmaEs        (full-cov adaptive Gaussian)
     │    │   │     → SeparableNes (cheaper diag-cov; high-dim)
     │    │   │     → NelderMead   (low-dim, deterministic, simple)
     │    │   ├─ Multimodal landscape
     │    │   │     → IpopCmaEs            (CMA-ES with restart;
     │    │   │                             fixes vanilla CMA-ES's
     │    │   │                             multimodal failure)
     │    │   │     → DifferentialEvolution (rarely beaten on cheap
     │    │   │                             multimodal continuous)
     │    │   │     → SimulatedAnnealing   (cheap & generic)
     │    │   ├─ Want parameter-free (no F, CR, w, σ to tune)
     │    │   │     → Tlbo
     │    │   ├─ Want minimum self-adapting baseline
     │    │   │     → OnePlusOneEs          (one-fifth rule,
     │    │   │                              smallest possible ES)
     │    │   ├─ Just want a strong default for cheap continuous
     │    │   │     → DifferentialEvolution
     │    │   └─ Just want a baseline
     │    │         → RandomSearch
     │    │
     │    ├─ Decision is Vec<bool> (binary)
     │    │   ├─ Independent bits, smooth fitness
     │    │   │     → Umda            (per-bit marginal EDA)
     │    │   └─ Bit interactions matter
     │    │         → GeneticAlgorithm with BitFlipMutation +
     │    │           a bit-string crossover
     │    │
     │    ├─ Decision is Vec<usize> (permutation, e.g., TSP)
     │    │     → AntColonyTsp (with a distance matrix)
     │    │     → TabuSearch  (with your own neighbor function)
     │    │     → SimulatedAnnealing with SwapMutation
     │    │
     │    └─ Custom decision type (a struct, a tree, …)
     │          → SimulatedAnnealing or HillClimber
     │            with your own Variation impl
     │
     ├─ 2 or 3 (multi-objective)
     │    │
     │    ├─ Strong default, fast, well-understood
     │    │     → Nsga2
     │    │
     │    ├─ Real-valued, smooth front, want best convergence
     │    │     → Mopso    (multi-objective PSO; on the benches
     │    │                 here it wins ZDT1 on both HV and
     │    │                 convergence by 100× over the
     │    │                 dominance-based methods)
     │    │
     │    ├─ Want better front quality than NSGA-II
     │    │     → Ibea     (indicator-based; consistently the best
     │    │                 of the dominance-based methods on these
     │    │                 benches — wins ZDT3 HV and DTLZ2 mean
     │    │                 dist by 24×)
     │    │     → Spea2    (strength + density)
     │    │     → SmsEmoa  (hypervolume-contribution selection;
     │    │                 elegant in theory but underperforms
     │    │                 NSGA-II on these benches at our budgets —
     │    │                 only worth its higher per-step cost on
     │    │                 fronts where exact HV-contribution is
     │    │                 the right discriminator)
     │    │
     │    ├─ Want decomposition / weight-vector style
     │    │     → Moead    (very fast per generation, scales well)
     │    │
     │    ├─ Disconnected or non-convex front
     │    │     → AgeMoea  (estimates front geometry adaptively)
     │    │     → Knea     (favors knee points)
     │    │     → Ibea
     │    │
     │    ├─ Want region-based diversity
     │    │     → PesaII   (grid hyperboxes drive selection)
     │    │     → EpsilonMoea (ε-grid archive,
     │    │                    archive size auto-limits)
     │    │
     │    └─ Just one starting decision (no population budget)
     │          → Paes  (1+1 ES with a Pareto archive)
     │
     └─ 4+ (many-objective)
          │
          ├─ Linear / simplex-shaped front (e.g., DTLZ1)
          │     → Grea       (grid coords drive ranking; on DTLZ1
          │                   here it beats NSGA-III by 3× and
          │                   AGE-MOEA by 2.5×)
          │     → Moead      (decomposition shines on linear fronts;
          │                   second on DTLZ1, also among the
          │                   fastest per generation)
          │
          ├─ Curved / unknown front geometry
          │     → Nsga3      (reference-point niching, canonical;
          │                   a strong default when the front
          │                   isn't simplex-shaped)
          │     → AgeMoea    (estimates L_p geometry per generation)
          │     → Rvea       (reference vectors with adaptive penalty)
          │
          ├─ Want indicator-based selection
          │     → Ibea       (additive ε-indicator; doesn't degrade
          │                   at high obj count)
          │     → Hype       (Monte Carlo HV estimation; scales
          │                   to arbitrary M)
Quick reference

Sample-efficient / expensive evaluation (50–500 evals):

Algorithm    Objectives  Decision  Strengths
BayesianOpt  1           Vec<f64>  GP surrogate + EI; gold standard with per-problem kernel tuning (default RBF at 60 evals is honestly bad)
Tpe          1           Vec<f64>  KDE surrogate; robust without hyperparameter tuning
Hyperband    1           any       multi-fidelity; needs PartialProblem

Single-objective continuous (Vec<f64>):

Algorithm              Strengths
RandomSearch           sanity baseline
HillClimber            simplest greedy local search
OnePlusOneEs           one-fifth-rule self-adapting baseline
SimulatedAnnealing     escapes local optima
GeneticAlgorithm       classic SO GA with elitism
ParticleSwarm          simple swarm baseline
DifferentialEvolution  strong default for cheap continuous
Tlbo                   parameter-free (no F, CR, w, σ)
CmaEs                  smooth landscapes; full covariance
IpopCmaEs              CMA-ES + restart for multimodal
SeparableNes           diagonal-cov NES; cheap per-step
NelderMead             classical simplex; deterministic

Single-objective other decision types:

Algorithm     Decision      Strengths
Umda          Vec<bool>     independent-bit EDA
TabuSearch    any discrete  you supply neighbors
AntColonyTsp  Vec<usize>    TSP / permutation

Multi-objective (2–3) and many-objective (4+):

Algorithm    Objectives  Strengths
Paes         2–3         1+1 ES with Pareto archive
Nsga2        2–3         canonical Pareto-based EA
Spea2        2–3         strength + density
Mopso        2–3         multi-objective PSO; best convergence on smooth real-valued 2-obj fronts
Ibea         2+          indicator-based; consistently best of the dominance-based methods
SmsEmoa      2+          exact HV-contribution selection; high per-step cost, modest gain
Hype         2+          Monte Carlo HV estimation
EpsilonMoea  2+          ε-grid archive; auto-sized
PesaII       2+          grid-based region selection
AgeMoea      2+          adaptive front-geometry estimation
Knea         2+          knee-point favored survival
Moead        2+          decomposition; fast per-gen
Nsga3        4+          reference-point niching; strong on curved fronts
Rvea         4+          reference vectors with penalty
Grea         4+          grid coords drive selection; particularly strong on linear/simplex fronts

Current algorithms

The full list with one-line descriptions:

Sample-efficient / multi-fidelity:

  • BayesianOpt — Gaussian-process surrogate + Expected Improvement.
  • Tpe — Bergstra et al. 2011 Tree-structured Parzen Estimator.
  • Hyperband — Li et al. 2017 multi-fidelity (uses PartialProblem).

Single-objective:

  • RandomSearch — sample-evaluate-keep baseline.
  • HillClimber — greedy single-step local search.
  • OnePlusOneEs — Rechenberg 1973 (1+1)-ES with one-fifth rule.
  • SimulatedAnnealing — Kirkpatrick et al. 1983, generic over decision type.
  • TabuSearch — Glover 1986, with a user-supplied neighbor generator.
  • GeneticAlgorithm — generational GA with tournament selection + elitism.
  • ParticleSwarm — Eberhart & Kennedy 1995 PSO for Vec<f64>.
  • DifferentialEvolution — Storn & Price DE/rand/1/bin for Vec<f64>.
  • Tlbo — Rao 2011 Teaching-Learning-Based Optimization (parameter-free).
  • CmaEs — Hansen & Ostermeier 2001 covariance-matrix adaptation.
  • IpopCmaEs — Auger & Hansen 2005 CMA-ES with restart, for multimodal.
  • SeparableNes — Wierstra et al. 2008/2014 diagonal-cov NES.
  • NelderMead — Nelder & Mead 1965 simplex direct search.
  • Umda — Mühlenbein 1997 univariate marginal-distribution EDA for Vec<bool>.
  • AntColonyTsp — Dorigo Ant System for permutation problems.

Multi-objective:

  • Paes — Knowles & Corne 1999 Pareto Archived Evolution Strategy.
  • Nsga2 — Deb et al. 2002, the canonical Pareto-based EA.
  • Spea2 — Zitzler, Laumanns & Thiele 2001 strength-Pareto EA.
  • Moead — Zhang & Li 2007 decomposition-based MOEA with Tchebycheff scalarization.
  • Mopso — Coello, Pulido & Lechuga 2004 multi-objective PSO.
  • Ibea — Zitzler & Künzli 2004 indicator-based EA.
  • SmsEmoa — Beume, Naujoks & Emmerich 2007 hypervolume-selection EMOA.
  • Hype — Bader & Zitzler 2011 Hypervolume Estimation Algorithm.
  • EpsilonMoea — Deb, Mohan & Mishra 2003 ε-dominance MOEA.
  • PesaII — Corne et al. 2001 Pareto Envelope Selection II.
  • AgeMoea — Panichella 2019 Adaptive Geometry Estimation MOEA.
  • Knea — Zhang, Tian & Jin 2015 knee point-driven EA.

Many-objective (4+):

  • Nsga3 — Deb & Jain 2014 reference-point NSGA-III.
  • Rvea — Cheng et al. 2016 Reference Vector-guided EA.
  • Grea — Yang et al. 2013 Grid-based EA.

Reusable utilities: pareto_compare, pareto_front, best_candidate, non_dominated_sort, crowding_distance, ParetoArchive, das_dennis, and the metrics spacing and hypervolume_2d.
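
As a taste of what these utilities compute, the 2-D hypervolume of a non-dominated minimization front reduces to a single sweep (a textbook sketch; the crate's hypervolume_2d signature may differ):

```rust
/// Hypervolume of a non-dominated 2-D minimization front, measured
/// against a reference point that every front point must dominate.
/// Larger is better: more objective space is covered by the front.
fn hypervolume_2d(front: &[(f64, f64)], reference: (f64, f64)) -> f64 {
    let mut pts = front.to_vec();
    // Sweep from largest f1 to smallest; each point contributes one
    // rectangular slab between its f1 and the previous point's f1.
    pts.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    let mut hv = 0.0;
    let mut prev_f1 = reference.0;
    for (f1, f2) in pts {
        hv += (prev_f1 - f1) * (reference.1 - f2);
        prev_f1 = f1;
    }
    hv
}
```

The sweep works because, on a non-dominated 2-D front, sorting by one objective automatically orders the other in reverse.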

Design philosophy

  • Concrete data, small trait surface. Problem, Optimizer, Initializer, Variation are the only traits a user interacts with day-to-day. Everything else is plain structs.
  • No type hell. No trait objects in the core path, no GATs, no HRTBs in user-facing APIs, no generic-RNG plumbing — Rng is a single concrete type alias.
  • Readable algorithms. Built-ins are written for clarity, not maximum abstraction reuse. RandomSearch is the recommended file to read before writing your own optimizer.
  • One crate first. No premature splitting into -core/-algorithms/ -operators. Split later if the crate grows.
  • Panic on programmer error. Invalid configuration panics with a clear message in v1; the API may grow Result-returning variants later if the base API proves useful.

See docs/heuropt_tech_design_spec.md for the full design rationale.

Testing

heuropt is exhaustively tested across several layers:

  • Unit + integration tests (cargo test) — 313 tests covering every algorithm, operator, metric, Pareto utility, and edge case (empty/singleton/duplicate populations, flat fitness, zero-width bounds, infeasible-only populations).
  • Property-based tests (proptest) — bounds preservation, Pareto antisymmetry/reflexivity, partition correctness, determinism, and seed-stability checks for every algorithm.
  • Coverage-guided fuzzing (cargo +nightly fuzz run <target>) — eight targets at fuzz/fuzz_targets/, soaked for 60 s per target in CI on every PR.
  • Instruction-count benchmarks (cargo bench) — gungraun (callgrind) hot-path benchmarks for every algorithm and Pareto utility, machine-stable so PR-level regressions show up.
  • Mutation testing (cargo mutants) — advisory; config at .cargo/mutants.toml.
  • CI (.github/workflows/ci.yml) — fmt, clippy (-D warnings), test (4-feature matrix), doc, MSRV (1.85), fuzz.

Contributing

See CONTRIBUTING.md for the local-test checklist, conventional-commits requirement, and project-governance docs.

This project follows the Builder's Code of Conduct: stay professional, stay technical, focus on the work and its merit.

For security disclosures, see SECURITY.md.

License

MIT — see LICENSE.

Changelog

See CHANGELOG.md.