Metrics collect maximums, averages, etc., for use as convergence criteria.
§Overview
Often with stepwise algorithms you will want to terminate early based on a convergence criterion, but with batch gradients on noisy data, or near stationary points, a single spot check of the gradient or of the step size is an unreliable signal for early termination.
Instead, use an exponential moving average of the l2 gradient norm, or a MovingSpread of cost-function values (similar to PyTorch’s patience).
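To make the idea concrete, here is a minimal, self-contained sketch of an exponential moving average. This is an illustration of the concept only, not the crate’s Ema implementation; EmaSketch and alpha are invented names.

```rust
// Sketch of an exponential moving average over a stream of samples
// (e.g. gradient norms). `alpha` is the smoothing factor in (0, 1].
struct EmaSketch {
    alpha: f64,
    value: Option<f64>,
}

impl EmaSketch {
    fn new(alpha: f64) -> Self {
        Self { alpha, value: None }
    }

    /// Fold a new sample into the running average and return it.
    fn observe(&mut self, sample: f64) -> f64 {
        let v = match self.value {
            None => sample, // seed with the first sample
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        };
        self.value = Some(v);
        v
    }
}

fn main() {
    let mut ema = EmaSketch::new(0.2);
    // A constant signal converges immediately to itself.
    for _ in 0..5 {
        assert!((ema.observe(3.0) - 3.0).abs() < 1e-12);
    }
    // An oscillating signal is smoothed toward its mean.
    let smoothed = [4.0, 2.0, 4.0, 2.0].iter().fold(0.0, |_, &s| ema.observe(s));
    assert!(smoothed > 2.0 && smoothed < 4.0);
}
```

Because the state lives in the struct, the same value-folding behaviour applies to the crate’s metrics: they smooth out the iteration-to-iteration noise that makes single spot checks unreliable.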
These metrics need to be declared mutably before the driver logic, as they hold values between iterations, but they will likely be used within Driver::converge_when.
All the metrics implement the Metric trait.
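The crate’s actual trait signature is not shown here, so the following is only a hypothetical sketch of what an observe-style metric trait might look like; MetricSketch and RunningMax are invented names.

```rust
// Hypothetical sketch of a metric trait: state is folded in via
// `observe`, which also reports the metric's current value.
trait MetricSketch {
    type Input;
    /// Fold one observation into the metric's state and return the
    /// current value (e.g. a running maximum or average).
    fn observe(&mut self, input: Self::Input) -> f64;
}

/// A trivial implementor: tracks the running maximum of f64 samples.
struct RunningMax(f64);

impl MetricSketch for RunningMax {
    type Input = f64;
    fn observe(&mut self, input: f64) -> f64 {
        self.0 = self.0.max(input);
        self.0
    }
}

fn main() {
    let mut m = RunningMax(f64::NEG_INFINITY);
    assert_eq!(m.observe(1.0), 1.0);
    assert_eq!(m.observe(5.0), 5.0);
    assert_eq!(m.observe(3.0), 5.0); // maximum so far is retained
}
```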
§Example 1:
This first example uses the Delta metric, which tracks changes between the current and prior iteration, to see when the l2-distance between successive solution values is within a tolerance.
Notice:
- The collected type needs to be an Owned type, hence algo.x().to_vec().
- The Delta metric takes a closure saying how to calculate the distance between the vectors at time t and time t-1.
- This closure’s ref-parameter types must match the observe call (Vec<f64> matches &Vec<f64>).
- On the first iteration, when there is no prior t-1 vector, observe returns f64::NAN, which always compares false.
let gd = algos::GradientDescent::new(0.01, vec![1.0, 2.0], problems::sphere_grad);
let dist_fn = |x_prev: &Vec<f64>, x: &Vec<f64>| x.sub_vec(x_prev).norm_l2();
let mut delta_x = metrics::Delta::new(dist_fn);
let (solved, step) = fixed_iters(gd, 10_000)
.converge_when(|algo, _step| delta_x.observe( algo.x().to_vec() ) < 1e-12)
.solve()
.unwrap();
assert_approx_eq!(solved.x(), &[0.0, 0.0]);
assert_eq!(step.iteration(), 1215);
§Example 2:
This example tracks the exponential moving average (EMA) of the gradient norm (with a window of 10), using the Ema metric.
For the first 9 iterations Ema::observe will return f64::NAN, as not enough samples will have been collected.
Note: f64::NAN < anything is always false, so the convergence test is not triggered on the first 9 iterations.
// terminate when exp moving average of algo gradient has l2-norm near zero
let gd = algos::GradientDescent::new(0.01, vec![1.0, 2.0], problems::sphere_grad);
let mut avg_of_norm = metrics::Ema::with_window(10);
let (solved, step) = fixed_iters(gd, 10_000)
.converge_when(|algo, _step| avg_of_norm.observe( algo.gradient.norm_l2() ) < 1e-12)
.solve()
.unwrap();
assert_approx_eq!(solved.x(), &[0.0, 0.0]);
assert_eq!(step.iteration(), 1448);
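The NAN guard described above relies on a standard IEEE 754 property that can be checked directly:

```rust
// NaN fails every ordering comparison (IEEE 754), so a metric that
// returns f64::NAN can never satisfy a `< tolerance` convergence test.
fn main() {
    let tolerance = 1e-12;
    assert!(!(f64::NAN < tolerance));
    assert!(!(f64::NAN > tolerance));
    assert!(f64::NAN != f64::NAN); // NaN is not even equal to itself
}
```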
§Counter-example
Don’t do this!
Constructing the metric afresh each iteration means it will only ever observe one item, and will always report NAN.
Metrics need to be declared up front, before any iteration.
let (_solved, step) = fixed_iters(algo, 10_000)
.converge_when(|algo, _step| metrics::DeltaFloat::new().observe( algo.x()[0] ) < 1e-8)
.solve()
.unwrap();
assert_eq!(step.iteration(), 10_000); // the convergence test never fires
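The failure mode can be demonstrated with a self-contained stand-in (DeltaSketch below is an invented illustration, not the crate’s DeltaFloat):

```rust
// A delta metric needs a prior value; a freshly-built one has none,
// so its first observation is always NaN.
struct DeltaSketch {
    prev: Option<f64>,
}

impl DeltaSketch {
    fn new() -> Self {
        Self { prev: None }
    }
    fn observe(&mut self, x: f64) -> f64 {
        let d = match self.prev {
            None => f64::NAN,
            Some(p) => (x - p).abs(),
        };
        self.prev = Some(x);
        d
    }
}

fn main() {
    // Wrong: a new metric every iteration only ever sees one value.
    for x in [1.0, 1.0, 1.0] {
        assert!(DeltaSketch::new().observe(x).is_nan());
    }
    // Right: one long-lived metric reports real deltas after the first call.
    let mut delta = DeltaSketch::new();
    assert!(delta.observe(1.0).is_nan());
    assert!((delta.observe(1.5) - 0.5).abs() < 1e-12);
}
```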
Structs§
- Delta - Calculates a delta difference between the current and the last value seen.
- DeltaFloat - Simple metric for the absolute difference of 1-dimensional f32 or f64’s.
- Ema - Exponential moving average.
- Emv - Exponential moving variance. An estimate of the (max-min) spread for the last N items. (Work in progress!)
- GradL2Norm
- InvocationCount - Counts the number of invocations of a function.
- MovingAvg - Records the average value in the last N values.
- MovingMax - Records the maximum value in the last N values.
- MovingSpread - Calculates the spread, the (max-min), in a moving window.
- MovingStats - Stats
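As a rough illustration of the windowed metrics above, here is a self-contained sketch of a moving (max - min) spread. SpreadSketch is an invented name; the crate’s MovingSpread may behave differently, e.g. in how it handles a partially filled window.

```rust
use std::collections::VecDeque;

// Conceptual sketch of a (max - min) spread over the last N values.
struct SpreadSketch {
    window: usize,
    buf: VecDeque<f64>,
}

impl SpreadSketch {
    fn with_window(window: usize) -> Self {
        Self { window, buf: VecDeque::new() }
    }

    /// Push a sample; once the window is full, return max - min, else NaN.
    fn observe(&mut self, x: f64) -> f64 {
        self.buf.push_back(x);
        if self.buf.len() > self.window {
            self.buf.pop_front();
        }
        if self.buf.len() < self.window {
            return f64::NAN; // not enough samples yet
        }
        let max = self.buf.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let min = self.buf.iter().cloned().fold(f64::INFINITY, f64::min);
        max - min
    }
}

fn main() {
    let mut spread = SpreadSketch::with_window(3);
    assert!(spread.observe(1.0).is_nan());
    assert!(spread.observe(4.0).is_nan());
    assert_eq!(spread.observe(2.0), 3.0); // window [1, 4, 2]: 4 - 1
    assert_eq!(spread.observe(5.0), 3.0); // window [4, 2, 5]: 5 - 2
}
```

A shrinking spread of cost-function values indicates the optimiser has stopped making progress, which is why this shape of metric suits convergence tests.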
Enums§
- Never - Functions taking Never as an argument can never be called directly.
Traits§
- Float
- Metric - A common interface across metrics, with an invaluable Metric::observe method.
- StatefulMetric