ganesh/lib.rs
//! `ganesh` (/ɡəˈneɪʃ/), named after the Hindu god of wisdom, provides several common minimization algorithms as well as a straightforward, trait-based interface for creating your own extensions. This crate is intended to be as simple as possible. For most minimization problems, users only need to implement the [`CostFunction`](crate::traits::cost_function::CostFunction) trait on some struct, which takes a vector of parameters and returns a single-valued `Result` ($`f(\mathbb{R}^n) \to \mathbb{R}`$). Some algorithms require a gradient, which can be implemented via the [`Gradient`](crate::traits::cost_function::Gradient) trait. While users may provide an analytic gradient function to speed up some algorithms, this trait comes with a default central finite-difference implementation, so all algorithms work out of the box as long as the cost function is well-defined.
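//!
//! For reference, a central finite difference approximates each partial derivative from two cost-function evaluations per parameter. The following is a minimal standalone sketch of the idea, not the crate's actual implementation (the step size here is just a common heuristic):
//!
//! ```rust
//! // Central finite-difference gradient of a function f: R^n -> R.
//! fn central_diff(f: impl Fn(&[f64]) -> f64, x: &[f64]) -> Vec<f64> {
//!     // Step size ~ cbrt(machine epsilon) balances truncation and rounding error.
//!     let h = f64::EPSILON.cbrt();
//!     (0..x.len())
//!         .map(|i| {
//!             let mut xp = x.to_vec();
//!             let mut xm = x.to_vec();
//!             xp[i] += h;
//!             xm[i] -= h;
//!             (f(&xp) - f(&xm)) / (2.0 * h)
//!         })
//!         .collect()
//! }
//! ```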
//!
//! # Table of Contents
//! - [Key Features](#key-features)
//! - [Quick Start](#quick-start)
//! - [Algorithms](#algorithms)
//! - [Examples](#examples)
//! - [Bounds](#bounds)
//! - [Future Plans](#future-plans)
//! - [Citations](#citations)
//!
//! ## Key Features
//! * Algorithms that are simple to use with sensible defaults.
//! * Traits which make developing future algorithms simple and consistent.
//! * A simple interface that lets new users get started quickly.
//! * The first (and possibly only) pure Rust implementation of the [`L-BFGS-B`](crate::algorithms::gradient::lbfgsb::LBFGSB) algorithm.
//!
//! ## Quick Start
//!
//! This crate provides some common test functions in the [`test_functions`](crate::test_functions) module. Consider the following implementation of the Rosenbrock function:
//!
//! ```rust
//! use ganesh::traits::*;
//! use ganesh::{Float, DVector};
//! use std::convert::Infallible;
//!
//! pub struct Rosenbrock {
//!     pub n: usize,
//! }
//! impl CostFunction for Rosenbrock {
//!     fn evaluate(&self, x: &DVector<Float>, _args: &()) -> Result<Float, Infallible> {
//!         Ok((0..(self.n - 1))
//!             .map(|i| 100.0 * (x[i + 1] - x[i].powi(2)).powi(2) + (1.0 - x[i]).powi(2))
//!             .sum())
//!     }
//! }
//! ```
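//!
//! As a quick sanity check, the Rosenbrock function has its global minimum at $`(1, 1)`$, where it evaluates to zero (the hidden lines repeat the definition above):
//!
//! ```rust
//! # use ganesh::traits::*;
//! # use ganesh::{Float, DVector};
//! # use std::convert::Infallible;
//! # pub struct Rosenbrock {
//! #     pub n: usize,
//! # }
//! # impl CostFunction for Rosenbrock {
//! #     fn evaluate(&self, x: &DVector<Float>, _args: &()) -> Result<Float, Infallible> {
//! #         Ok((0..(self.n - 1))
//! #             .map(|i| 100.0 * (x[i + 1] - x[i].powi(2)).powi(2) + (1.0 - x[i]).powi(2))
//! #             .sum())
//! #     }
//! # }
//! let problem = Rosenbrock { n: 2 };
//! // Evaluate at the known global minimum (1, 1):
//! let f = problem.evaluate(&DVector::from_vec(vec![1.0, 1.0]), &())?;
//! assert_eq!(f, 0.0);
//! # Ok::<(), Infallible>(())
//! ```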
//! To minimize this function, we could consider using the Nelder-Mead algorithm:
//! ```rust
//! use ganesh::algorithms::gradient_free::{NelderMead, NelderMeadConfig};
//! use ganesh::traits::*;
//! use ganesh::{Float, DVector};
//! use std::convert::Infallible;
//!
//! # pub struct Rosenbrock {
//! #     pub n: usize,
//! # }
//! # impl CostFunction for Rosenbrock {
//! #     fn evaluate(&self, x: &DVector<Float>, _args: &()) -> Result<Float, Infallible> {
//! #         Ok((0..(self.n - 1))
//! #             .map(|i| 100.0 * (x[i + 1] - x[i].powi(2)).powi(2) + (1.0 - x[i]).powi(2))
//! #             .sum())
//! #     }
//! # }
//! fn main() -> Result<(), Infallible> {
//!     let problem = Rosenbrock { n: 2 };
//!     let mut nm = NelderMead::default();
//!     let result = nm.process(
//!         &problem,
//!         &(),
//!         NelderMeadConfig::new([2.0, 2.0]),
//!         NelderMead::default_callbacks(),
//!     )?;
//!     println!("{}", result);
//!     Ok(())
//! }
//! ```
//!
//! This should output
//! ```shell
//! ╭──────────────────────────────────────────────────────────────────╮
//! │                                                                  │
//! │                           FIT RESULTS                            │
//! │                                                                  │
//! ├───────────┬───────────────────┬────────────────┬─────────────────┤
//! │  Status   │       f(x)        │     #f(x)      │     #∇f(x)      │
//! ├───────────┼───────────────────┼────────────────┼─────────────────┤
//! │ Converged │      0.00023      │       76       │        0        │
//! ├───────────┼───────────────────┴────────────────┴─────────────────┤
//! │           │                                                      │
//! │  Message  │                   term_f = STDDEV                    │
//! │           │                                                      │
//! ├───────────┴─────────────────────────────┬────────────┬───────────┤
//! │                Parameter                │   Bound    │ At Limit? │
//! ├───────────┬─────────┬─────────┬─────────┼──────┬─────┼───────────┤
//! │           │    =    │    σ    │    0    │  -   │  +  │           │
//! ├───────────┼─────────┼─────────┼─────────┼──────┼─────┼───────────┤
//! │    x_0    │ 1.00081 │ 0.84615 │ 2.00000 │ -inf │ inf │    No     │
//! │    x_1    │ 1.00313 │ 1.69515 │ 2.00000 │ -inf │ inf │    No     │
//! ╰───────────┴─────────┴─────────┴─────────┴──────┴─────┴───────────╯
//! ```
//!
//! ## Algorithms
//!
//! At the moment, `ganesh` contains the following [`Algorithm`](crate::traits::algorithm::Algorithm)s:
//! - Gradient descent/quasi-Newton:
//!   - [`L-BFGS-B`](crate::algorithms::gradient::lbfgsb::LBFGSB)
//!   - [`Adam`](crate::algorithms::gradient::adam::Adam) (for stochastic [`CostFunction`](crate::traits::cost_function::CostFunction)s)
//! - Gradient-free:
//!   - [`Nelder-Mead`](crate::algorithms::gradient_free::nelder_mead::NelderMead)
//!   - [`Simulated Annealing`](crate::algorithms::gradient_free::simulated_annealing::SimulatedAnnealing)
//! - Markov Chain Monte Carlo (MCMC):
//!   - [`AIES`](crate::algorithms::mcmc::aies::AIES)
//!   - [`ESS`](crate::algorithms::mcmc::ess::ESS)
//! - Swarms:
//!   - [`PSO`](crate::algorithms::particles::pso::PSO) (a basic form of particle swarm optimization)
//!
//! All algorithms are written in pure Rust, including [`L-BFGS-B`](crate::algorithms::gradient::lbfgsb::LBFGSB), which is typically a binding to
//! `FORTRAN` code in other crates.
//!
//! ## Examples
//!
//! More examples can be found in the `examples` directory of this project. Each contains a
//! `.justfile` which allows the whole example to be run with [`just`](https://github.com/casey/just).
//! To run only the Rust code and skip the Python visualization, any of the examples can be run with
//!
//! ```shell
//! cargo r -r --example <example_name>
//! ```
//!
//! ## Bounds
//! All [`Algorithm`](crate::traits::algorithm::Algorithm)s in `ganesh` can be constructed with bounds, allowing algorithms which usually operate in unbounded parameter spaces to return results constrained to a bounding box. This is done via a parameter transformation similar to the one used by [`LMFIT`](https://lmfit.github.io/lmfit-py/) and [`MINUIT`](https://root.cern.ch/doc/master/classTMinuit.html). The transform is not directly useful for algorithms which already have bounded implementations, like [`L-BFGS-B`](crate::algorithms::gradient::lbfgsb::LBFGSB), but it can be combined with other transformations which may be useful to such algorithms. While the user supplies parameters within the bounds, unbounded algorithms can (and in practice will) convert those values to a set of unbounded "internal" parameters. Whenever the cost function is called, these internal parameters are converted back into bounded "external" parameters via the following transformations:
//!
//! Upper and lower bounds:
//! ```math
//! x_\text{int} = \frac{u}{\sqrt{1 - u^2}}
//! ```
//! ```math
//! x_\text{ext} = c + w \frac{x_\text{int}}{\sqrt{x_\text{int}^2 + 1}}
//! ```
//! where
//! ```math
//! u = \frac{x_\text{ext} - c}{w},\ c = \frac{x_\text{min} + x_\text{max}}{2},\ w = \frac{x_\text{max} - x_\text{min}}{2}
//! ```
//! Upper bound only:
//! ```math
//! x_\text{int} = \frac{1}{2}\left(\frac{1}{(x_\text{max} - x_\text{ext})} - (x_\text{max} - x_\text{ext}) \right)
//! ```
//! ```math
//! x_\text{ext} = x_\text{max} - (\sqrt{x_\text{int}^2 + 1} - x_\text{int})
//! ```
//! Lower bound only:
//! ```math
//! x_\text{int} = \frac{1}{2}\left((x_\text{ext} - x_\text{min}) - \frac{1}{(x_\text{ext} - x_\text{min})} \right)
//! ```
//! ```math
//! x_\text{ext} = x_\text{min} + (\sqrt{x_\text{int}^2 + 1} + x_\text{int})
//! ```
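//!
//! As a quick numerical sanity check (standalone code, not the crate's internal API), each pair above is an exact inverse; for the lower-bound-only case:
//!
//! ```rust
//! let x_min = 1.5_f64;
//! // x_int = ((x_ext - x_min) - 1 / (x_ext - x_min)) / 2
//! let to_internal = |x_ext: f64| 0.5 * ((x_ext - x_min) - 1.0 / (x_ext - x_min));
//! // x_ext = x_min + (sqrt(x_int^2 + 1) + x_int)
//! let to_external = |x_int: f64| x_min + ((x_int * x_int + 1.0).sqrt() + x_int);
//! // The round trip recovers the external value...
//! assert!((to_external(to_internal(3.0)) - 3.0).abs() < 1e-12);
//! // ...and any internal value maps back above the lower bound.
//! assert!(to_external(-10.0) > x_min);
//! ```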
//! While `MINUIT` and `LMFIT` recommend caution when interpreting covariance matrices obtained from
//! fits with bounds transforms, `ganesh` does not, since it implements higher-order derivatives of
//! these bounds transforms where those libraries use linear approximations.
//!
//! ## Future Plans
//!
//! * Eventually, I would like to implement some more modern gradient-free optimization techniques.
//! * There are probably many optimizations and algorithm extensions that I'm missing right now.
//! * There should be more tests and documentation (as usual).
//!
//! ## Citations
//! While this project does not currently have an associated paper, most of the algorithms it implements do, and they should be cited appropriately. Citations are also generally available in the documentation.
//!
//! ### ESS MCMC Sampler
//! ```text
//! @article{karamanis2020ensemble,
//!     title = {Ensemble slice sampling: Parallel, black-box and gradient-free inference for correlated & multimodal distributions},
//!     author = {Karamanis, Minas and Beutler, Florian},
//!     journal = {arXiv preprint arXiv: 2002.06212},
//!     year = {2020}
//! }
//! ```
//!
//! ### scikit-learn (used in constructing a Bayesian Mixture Model in the Global ESS step)
//! ```text
//! @article{scikit-learn,
//!     title={Scikit-learn: Machine Learning in {P}ython},
//!     author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
//!         and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
//!         and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
//!         Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
//!     journal={Journal of Machine Learning Research},
//!     volume={12},
//!     pages={2825--2830},
//!     year={2011}
//! }
//! ```
//!
//! ### AIES MCMC Sampler
//! ```text
//! @article{Goodman2010,
//!     title = {Ensemble samplers with affine invariance},
//!     volume = {5},
//!     ISSN = {1559-3940},
//!     url = {http://dx.doi.org/10.2140/camcos.2010.5.65},
//!     DOI = {10.2140/camcos.2010.5.65},
//!     number = {1},
//!     journal = {Communications in Applied Mathematics and Computational Science},
//!     publisher = {Mathematical Sciences Publishers},
//!     author = {Goodman, Jonathan and Weare, Jonathan},
//!     year = {2010},
//!     month = jan,
//!     pages = {65--80}
//! }
//! ```
#![warn(
    clippy::nursery,
    clippy::unwrap_used,
    clippy::expect_used,
    clippy::doc_markdown,
    clippy::doc_link_with_quotes,
    clippy::missing_safety_doc,
    clippy::missing_panics_doc,
    clippy::missing_errors_doc,
    clippy::perf,
    clippy::style,
    missing_docs
)]

/// Module containing core functionality.
pub mod core;

/// Module containing all traits.
pub mod traits;

/// Module containing various minimization algorithms.
pub mod algorithms;

/// Module containing standard functions for testing algorithms.
pub mod test_functions;

/// A floating-point number type (defaults to [`f64`], see `f32` feature).
#[cfg(not(feature = "f32"))]
pub type Float = f64;

/// A floating-point number type (defaults to [`f64`], see `f32` feature).
#[cfg(feature = "f32")]
pub type Float = f32;

/// Re-export some useful `nalgebra` types for convenience.
pub use nalgebra;
pub use nalgebra::{DMatrix, DVector};

/// The mathematical constant $`\pi`$.
#[cfg(not(feature = "f32"))]
pub const PI: Float = std::f64::consts::PI;

/// The mathematical constant $`\pi`$.
#[cfg(feature = "f32")]
pub const PI: Float = std::f32::consts::PI;