Crate varpro
The varpro crate enables nonlinear least squares fitting for separable models using the Variable Projection (VarPro) algorithm.
Introduction
A large class of nonlinear models consists of a mixture of truly nonlinear as well as linear model parameters. These are the so-called separable models, which can be written as a linear combination of $N_{basis}$ nonlinear model basis functions. The purpose of this crate is to provide a simple interface to robust and fast routines for fitting separable models to data. Consider a data vector $\vec{y}=(y_1,\dots,y_{N_{data}})^T$, which is sampled at grid points $\vec{x}=(x_1,\dots,x_{N_{data}})^T$, both with $N_{data}$ elements. Our goal is to fit a nonlinear, separable model to the data. Because the model is separable, we can write it as

$$\vec{f}(\vec{x},\vec{\alpha},\vec{c}) = \sum_{j=1}^{N_{basis}} c_j \vec{f}_j(\vec{x},S_j(\vec{\alpha}))$$
Let's look at the components of this equation in more detail. The vector-valued function $\vec{f}(\vec{x},\vec{\alpha},\vec{c})$ is the actual model we want to fit. It depends on three arguments:

- $\vec{x}$ is the independent variable, which corresponds to the grid points. It can be a time, a location, or anything else.
- $\vec{\alpha}=(\alpha_1,\dots,\alpha_{N_{params}})^T$ is the vector of nonlinear model parameters. We will get back to these later.
- $\vec{c}=(c_1,\dots,c_{N_{basis}})^T$ is the vector of coefficients for the basis functions.
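As is standard in the VarPro literature (though not spelled out here in the crate docs), this separable structure can equivalently be written in matrix form by collecting the basis functions as columns of an $N_{data}\times N_{basis}$ matrix:

$$\vec{f}(\vec{x},\vec{\alpha},\vec{c}) = \mathbf{\Phi}(\vec{\alpha})\,\vec{c}, \quad \text{where } \mathbf{\Phi}(\vec{\alpha}) = \left(\vec{f}_1(\vec{x},S_1(\vec{\alpha})),\dots,\vec{f}_{N_{basis}}(\vec{x},S_{N_{basis}}(\vec{\alpha}))\right).$$

This form makes explicit that, for fixed $\vec{\alpha}$, the model is linear in $\vec{c}$.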
Model Parameters and Basis Functions
The model itself is given as a linear combination of nonlinear basis functions $\vec{f}_j$ with expansion coefficients $c_j$. The basis functions themselves depend only on the independent variable $\vec{x}$ and on a subset $S_j(\vec{\alpha})$ of the nonlinear model parameters $\vec{\alpha}$. Each basis function can depend on a different subset.
What VarPro Computes
This crate finds the parameters $\vec{\alpha}$ and $\vec{c}$ that fit the model $\vec{f}(\vec{x},\vec{\alpha},\vec{c})$ to the data $\vec{y}$ obtained at the grid points $\vec{x}$ using a least squares metric. Formally, varpro finds

$$\arg\min_{\vec{\alpha},\vec{c}} ||\mathbf{W}(\vec{y}-\vec{f}(\vec{x},\vec{\alpha},\vec{c}))||_2^2,$$

where $\mathbf{W}$ is a weight matrix that can be set to the identity matrix for unweighted least squares.
What Makes VarPro Special
The Variable Projection algorithm takes advantage of the fact that the model is a mix of linear and nonlinear parts. VarPro uses a clever algorithmic trick to cast the minimization into a problem that depends only on the nonlinear model parameters $\vec{\alpha}$ and lets a nonlinear optimization backend handle this reduced problem. This crate uses the levenberg_marquardt crate as its optimization backend. Other optimization backends are planned for future releases.
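A sketch of the trick, following the standard VarPro formulation (unweighted case for brevity): collect the basis functions as columns of a matrix $\mathbf{\Phi}(\vec{\alpha})$. For fixed $\vec{\alpha}$, the optimal coefficients solve a linear least squares problem, so they can be eliminated using the Moore-Penrose pseudoinverse $\mathbf{\Phi}^\dagger$:

$$\vec{c}(\vec{\alpha}) = \mathbf{\Phi}(\vec{\alpha})^\dagger \vec{y}, \qquad \arg\min_{\vec{\alpha}} ||(\mathbf{I}-\mathbf{\Phi}(\vec{\alpha})\mathbf{\Phi}(\vec{\alpha})^\dagger)\,\vec{y}||_2^2.$$

The nonlinear backend only has to minimize this reduced residual over $\vec{\alpha}$; the coefficients $\vec{c}$ then follow from the linear solve.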
The VarPro algorithm implemented here follows (O’Leary2013), but uses the Kaufman approximation to calculate the Jacobian.
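For reference (sketched from the VarPro literature, not from this crate's internals): writing the reduced residual as $\vec{r}(\vec{\alpha}) = \mathbf{P}^\perp\vec{y}$, where $\mathbf{P}^\perp = \mathbf{I}-\mathbf{\Phi}\mathbf{\Phi}^\dagger$ projects onto the complement of the column space of the basis function matrix $\mathbf{\Phi}(\vec{\alpha})$, the Kaufman approximation keeps only the first term of the full Golub-Pereyra Jacobian:

$$\frac{\partial \vec{r}}{\partial \alpha_k} = -\left(\mathbf{P}^\perp \frac{\partial\mathbf{\Phi}}{\partial\alpha_k}\mathbf{\Phi}^\dagger + \left(\mathbf{P}^\perp \frac{\partial\mathbf{\Phi}}{\partial\alpha_k}\mathbf{\Phi}^\dagger\right)^T\right)\vec{y} \approx -\mathbf{P}^\perp \frac{\partial\mathbf{\Phi}}{\partial\alpha_k}\mathbf{\Phi}^\dagger\,\vec{y},$$

which is cheaper to evaluate and typically costs little accuracy in practice.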
Usage and Workflow
The workflow for solving a least squares fitting problem with varpro consists of the following steps.
- Create a SeparableModel which describes the model function using the SeparableModelBuilder. This is done by adding individual basis functions as well as their partial derivatives.
- Choose a nonlinear minimizer backend. Right now the only implemented nonlinear minimizer is the Levenberg-Marquardt algorithm using the levenberg_marquardt crate. So just proceed to the next step.
- Cast the fitting problem into a LevMarProblem using the LevMarProblemBuilder.
- Solve the fitting problem using the LevMarSolver, which is an alias for LevenbergMarquardt and allows setting additional parameters of the algorithm before performing the minimization.
- Check the minimization report and, if successful, retrieve the nonlinear parameters $\vec{\alpha}$ using LevMarProblem::params and the linear coefficients $\vec{c}$ using LevMarProblem::linear_coefficients.
Example
Preliminaries
The following example demonstrates how to fit a double exponential decay model with constant offset to observations $\vec{y}$ obtained at grid points $\vec{x}$. The model itself is a vector-valued function with the same number of elements as $\vec{x}$ and $\vec{y}$. The component at index $k$ is given by

$$f_k(\vec{x},\vec{\alpha},\vec{c})= c_1 \exp\left(-x_k/\tau_1\right)+c_2 \exp\left(-x_k/\tau_2\right)+c_3,$$

which is a fancy way of saying that the exponential functions are applied element-wise to the vector $\vec{x}$.
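As a crate-independent sanity check, the model component above can be evaluated element-wise in plain Rust. The helper name `double_exp_model` and the use of `Vec<f64>` instead of nalgebra types are made up for this sketch; the crate itself builds the model from basis functions instead of a single closed-form function.

```rust
/// Evaluate f_k = c1*exp(-x_k/tau1) + c2*exp(-x_k/tau2) + c3 element-wise.
/// Hypothetical helper for illustration only.
fn double_exp_model(x: &[f64], tau1: f64, tau2: f64, c: [f64; 3]) -> Vec<f64> {
    x.iter()
        .map(|&xk| c[0] * (-xk / tau1).exp() + c[1] * (-xk / tau2).exp() + c[2])
        .collect()
}

fn main() {
    // at x = 0 both exponentials equal 1, so the model value is c1 + c2 + c3
    let y = double_exp_model(&[0.0, 1.0], 2.0, 4.0, [1.0, 2.0, 3.0]);
    assert!((y[0] - 6.0).abs() < 1e-12);
}
```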
We can see that the model depends on the nonlinear parameters $\vec{\alpha}=(\tau_1,\tau_2)^T$ and the linear coefficients $\vec{c}=(c_1,c_2,c_3)^T$. Both exponential functions can be modeled as an exponential decay with signature $\vec{f}_j(\vec{x},\tau)$. In varpro, the basis functions must take $\vec{x}$ by reference and a number of parameters as further scalar arguments. So the decay is implemented as:

```rust
use nalgebra::DVector;

fn exp_decay(x: &DVector<f64>, tau: f64) -> DVector<f64> {
    x.map(|x| (-x / tau).exp())
}
```
For our separable model we also need the partial derivatives of the basis functions with respect to all the parameters that each basis function depends on. Thus, in this case we have to provide $\partial/\partial\tau\,\vec{f}_j(\vec{x},\tau)$:

```rust
use nalgebra::DVector;

fn exp_decay_dtau(tvec: &DVector<f64>, tau: f64) -> DVector<f64> {
    tvec.map(|t| (-t / tau).exp() * t / tau.powi(2))
}
```
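The analytical derivative can be cross-checked against a central finite difference. This is a crate-independent sketch using scalar stand-ins (the `_scalar` names are made up for this check) rather than the nalgebra-based functions above:

```rust
fn exp_decay_scalar(t: f64, tau: f64) -> f64 {
    (-t / tau).exp()
}

// d/dtau exp(-t/tau) = exp(-t/tau) * t / tau^2
fn exp_decay_dtau_scalar(t: f64, tau: f64) -> f64 {
    (-t / tau).exp() * t / tau.powi(2)
}

fn main() {
    let (t, tau, h) = (1.5, 2.0, 1e-6);
    // central finite difference in tau
    let numeric = (exp_decay_scalar(t, tau + h) - exp_decay_scalar(t, tau - h)) / (2.0 * h);
    assert!((numeric - exp_decay_dtau_scalar(t, tau)).abs() < 1e-8);
}
```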
We’ll see in the example how the function method and the partial_deriv methods let us add the function and the derivative as base functions.
There is a second type of basis function, which corresponds to the coefficient $c_3$. This is a constant function returning a vector of ones. It does not depend on any parameters, which is why there is a separate way of adding these types of invariant functions to the model: use invariant_function of the SeparableModelBuilder.
Example Code
Using the functions above, our example code for fitting the double exponential model with constant offset to a vector of data `y` obtained at grid points `x` looks like this:

```rust
use nalgebra::DVector;
use varpro::prelude::*;
use varpro::solvers::levmar::{LevMarProblemBuilder, LevMarSolver};

// 1. create the model by giving only the nonlinear parameter names it depends on
let model = SeparableModelBuilder::<f64>::new(&["tau1", "tau2"])
    // add the first exponential decay and its partial derivative to the model
    // give all parameter names that the function depends on
    // and subsequently provide the partial derivative for each parameter
    .function(&["tau1"], exp_decay)
    .partial_deriv("tau1", exp_decay_dtau)
    // add the second exponential decay and its partial derivative to the model
    .function(&["tau2"], exp_decay)
    .partial_deriv("tau2", exp_decay_dtau)
    // add the constant as a vector of ones as an invariant function
    .invariant_function(|x| DVector::from_element(x.len(), 1.))
    // build the model
    .build()
    .unwrap();
// 2., 3.: cast the fitting problem as a nonlinear least squares minimization problem
let problem = LevMarProblemBuilder::new()
    .model(&model)
    .x(x)
    .y(y)
    .initial_guess(&[1., 2.])
    .build()
    .expect("Building valid problem should not panic");
// 4. solve the fitting problem
let (solved_problem, report) = LevMarSolver::new().minimize(problem);
assert!(report.termination.was_successful());
// the nonlinear parameters after fitting
// they are in the same order as the parameter names given to the model
let alpha = solved_problem.params();
// the linear coefficients after fitting
// they are in the same order as the basis functions that were added to the model
let c = solved_problem.linear_coefficients().unwrap();
```
References and Further Reading
(O’Leary2013) O’Leary, D.P., Rust, B.W.: Variable projection for nonlinear least squares problems. Comput Optim Appl 54, 579–593 (2013). DOI: 10.1007/s10589-012-9492-9

Attention: the O’Leary paper contains errors, which are fixed in a blog article by this crate's author.

(Golub2003) Golub, G., Pereyra, V.: Separable nonlinear least squares: the variable projection method and its applications. Inverse Problems 19, R1 (2003). https://iopscience.iop.org/article/10.1088/0266-5611/19/2/201
Modules
- basis_function
- model
- prelude
- solvers