Module easy_ml::differentiation::usage

§Usage of Record and Trace

Both Trace and Record, for forward and reverse automatic differentiation respectively, implement Numeric and can generally be treated as normal numbers, just like f32 and f64.

Trace is literally implemented as a dual number, and is more or less a one-to-one substitution. Record requires dynamically building a computational graph of the values and dependencies of each operation performed on them. This means that performing operations on Records has side effects: each operation adds entries onto a WengertList. However, when using Record the side effects are abstracted away; just create a WengertList before you start creating Records.
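To make the dual number idea concrete, here is a minimal sketch of forward mode differentiation. The `Dual` type below is a hypothetical stand-in written for illustration; it is not easy_ml's Trace, it only mirrors the `number` and `derivative` field names used in the examples on this page, and it uses f64 for simplicity.

```rust
use std::ops::Mul;

// Hypothetical stand-in for illustration only; this is not easy_ml's Trace type.
// A dual number carries a value together with its derivative with respect to
// whichever single input was marked as the variable.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual {
    number: f64,
    derivative: f64,
}

impl Dual {
    // A constant does not change when the tracked input changes.
    fn constant(number: f64) -> Dual {
        Dual { number, derivative: 0.0 }
    }

    // The tracked input changes at a rate of one with respect to itself.
    fn variable(number: f64) -> Dual {
        Dual { number, derivative: 1.0 }
    }

    // Chain rule: d/dt cos(u) = -sin(u) * du/dt
    fn cos(self) -> Dual {
        Dual {
            number: self.number.cos(),
            derivative: -self.number.sin() * self.derivative,
        }
    }

    // Chain rule: d/dt sin(u) = cos(u) * du/dt
    fn sin(self) -> Dual {
        Dual {
            number: self.number.sin(),
            derivative: self.number.cos() * self.derivative,
        }
    }
}

impl Mul for Dual {
    type Output = Dual;

    // Product rule: d/dt (u * v) = du/dt * v + u * dv/dt
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            number: self.number * rhs.number,
            derivative: self.derivative * rhs.number + self.number * rhs.derivative,
        }
    }
}

// The same polar to cartesian conversion used throughout this page.
fn cartesian(r: Dual, angle: Dual) -> (Dual, Dual) {
    (r * angle.cos(), r * angle.sin())
}
```

Calling `cartesian(Dual::variable(1.0), Dual::constant(2.0))` carries the derivatives with respect to `r` through every operation, so no graph needs to be stored. The trade-off is that each input requires its own run of the function.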

Given some function from N inputs to M outputs, you can pass it Traces or Records and retrieve the first derivative from the outputs for all combinations of N and M. If N >> M then you should use Record, as reverse mode automatic differentiation is much cheaper. If N << M then you should use Trace, as it will be much cheaper. If you have large N and M, or small N and M, you might have to benchmark to find which method works best. However, most problems have N > M.

For this example we use a function which takes two inputs, r and a, and returns two outputs, x and y.
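For reference, the four derivatives of this function can also be written out by hand, which gives something to check the automatic results against. The sketch below assumes the function is x = r * cos(a) and y = r * sin(a), as in the examples on this page; `cartesian_jacobian` is a hypothetical helper name that is not part of easy_ml, and it uses f64 rather than f32.

```rust
// Analytic partial derivatives of x = r * cos(angle) and y = r * sin(angle).
// `cartesian_jacobian` is a hypothetical helper, not part of easy_ml.
// Rows are the outputs (x, y) and columns are the inputs (r, angle).
fn cartesian_jacobian(r: f64, angle: f64) -> [[f64; 2]; 2] {
    [
        [angle.cos(), -r * angle.sin()], // dx/dr, dx/da
        [angle.sin(), r * angle.cos()],  // dy/dr, dy/da
    ]
}
```

At r = 1 and a = 2 this gives dx/dr = cos(2) ≈ -0.416, dx/da = -sin(2) ≈ -0.909, dy/dr = sin(2) ≈ 0.909 and dy/da = cos(2) ≈ -0.416.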

§Using Trace

use easy_ml::differentiation::Trace;
use easy_ml::numeric::extra::Cos;
use easy_ml::numeric::extra::Sin;
fn cartesian(r: Trace<f32>, angle: Trace<f32>) -> (Trace<f32>, Trace<f32>) {
    let x = r * angle.cos();
    let y = r * angle.sin();
    (x, y)
}
// first find dx/dr and dy/dr
let (x, y) = cartesian(Trace::variable(1.0), Trace::constant(2.0));
let dx_dr = x.derivative;
let dy_dr = y.derivative;
// now find dx/da and dy/da
let (x, y) = cartesian(Trace::constant(1.0), Trace::variable(2.0));
let dx_da = x.derivative;
let dy_da = y.derivative;

§Using Record

use easy_ml::differentiation::{Record, WengertList};
use easy_ml::numeric::extra::{Cos, Sin};
// the lifetimes tell the rust compiler that our inputs and outputs
// can all live as long as the WengertList
fn cartesian<'a>(
    r: Record<'a, f32>,
    angle: Record<'a, f32>
) -> (Record<'a, f32>, Record<'a, f32>) {
    let x = r * angle.cos();
    let y = r * angle.sin();
    (x, y)
}
// first we must construct a WengertList to create records from
let list = WengertList::new();
let r = Record::variable(1.0, &list);
let a = Record::variable(2.0, &list);
let (x, y) = cartesian(r, a);
// first find dx/dr and dx/da
let x_derivatives = x.derivatives();
let dx_dr = x_derivatives[&r];
let dx_da = x_derivatives[&a];
// now find dy/dr and dy/da
let y_derivatives = y.derivatives();
let dy_dr = y_derivatives[&r];
let dy_da = y_derivatives[&a];

§Using Record container

use easy_ml::differentiation::{Record, RecordTensor, WengertList};
use easy_ml::numeric::extra::{Cos, Sin};
use easy_ml::tensors::Tensor;

// the lifetimes tell the rust compiler that our inputs and outputs
// can all live as long as the WengertList
fn cartesian<'a>(
    r: Record<'a, f32>,
    angle: Record<'a, f32>
) -> [Record<'a, f32>; 2] {
    let x = r * angle.cos();
    let y = r * angle.sin();
    [x, y]
}
// first we must construct a WengertList to create records from
let list = WengertList::new();
// for this example we also calculate derivatives for 1.5 and 2.5 since you wouldn't use
// RecordTensor if you only had a single variable input
let R = RecordTensor::variables(&list, Tensor::from([("radius", 2)], vec![ 1.0, 1.5 ]));
let A = RecordTensor::variables(&list, Tensor::from([("angle", 2)], vec![ 2.0, 2.5 ]));
let (X, Y) = {
    let [resultX, resultY] = RecordTensor::from_iters(
        [("z", 2)],
        R.iter_as_records()
            .zip(A.iter_as_records())
            // here we operate on each pair of Records as in the prior example, except
            // we have to convert to arrays to stream the collection back into RecordTensors
            .map(|(r, a)| cartesian(r, a))
    );
    // we know we can unwrap for this example because we know each iterator contained 2
    // elements which matches the shape we're converting back to
    (resultX.unwrap(), resultY.unwrap())
};
// first find dX/dR and dX/dA, we can unwrap because we know X and Y are variables rather than
// constants.
let X_derivatives = X.derivatives().unwrap();
let dX_dR = X_derivatives.map(|d| d.at_tensor(&R));
let dX_dA = X_derivatives.map(|d| d.at_tensor(&A));
// now find dY/dR and dY/dA
let Y_derivatives = Y.derivatives().unwrap();
let dY_dR = Y_derivatives.map(|d| d.at_tensor(&R));
let dY_dA = Y_derivatives.map(|d| d.at_tensor(&A));

§Differences

Notice how in the above examples all the same 4 derivatives are found, but in forward mode we rerun the function with a different input as the sole variable and the rest as constants, whereas in reverse mode we rerun the derivatives() function on a different output variable. With reverse mode we would only pass constants into the cartesian function if we didn’t want to get their derivatives (and wanted to avoid wasting memory on something we didn’t need).
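The reverse mode half of this difference can be sketched with a toy tape. Everything below is a simplified illustration under stated assumptions: `Tape` and `Var` are hypothetical names, the entry layout is invented for this sketch, and none of it reflects how easy_ml's WengertList or Record are actually implemented. It shows the function running once while operations are recorded, followed by one backward sweep per output.

```rust
use std::cell::RefCell;
use std::ops::Mul;

// Hypothetical, simplified tape for illustration; not easy_ml's WengertList.
// Each entry stores two (parent index, local partial derivative) pairs.
struct Tape {
    entries: RefCell<Vec<[(usize, f64); 2]>>,
}

// Hypothetical stand-in for a Record: a value plus its position on the tape.
#[derive(Clone, Copy)]
struct Var<'a> {
    tape: &'a Tape,
    index: usize,
    number: f64,
}

impl Tape {
    fn new() -> Tape {
        Tape { entries: RefCell::new(Vec::new()) }
    }

    // Recording an operation is the side effect hidden behind each Var method.
    fn push(&self, parents: [(usize, f64); 2]) -> usize {
        let mut entries = self.entries.borrow_mut();
        entries.push(parents);
        entries.len() - 1
    }

    fn variable(&self, number: f64) -> Var<'_> {
        // An input has no parents, so it points at itself with zero weights.
        let next = self.entries.borrow().len();
        let index = self.push([(next, 0.0), (next, 0.0)]);
        Var { tape: self, index, number }
    }
}

impl<'a> Var<'a> {
    fn unary(self, number: f64, partial: f64) -> Var<'a> {
        let index = self.tape.push([(self.index, partial), (self.index, 0.0)]);
        Var { tape: self.tape, index, number }
    }

    fn cos(self) -> Var<'a> {
        self.unary(self.number.cos(), -self.number.sin())
    }

    fn sin(self) -> Var<'a> {
        self.unary(self.number.sin(), self.number.cos())
    }

    // One backward sweep from this output yields its derivative with respect
    // to every entry on the tape, including all of the inputs at once.
    fn derivatives(self) -> Vec<f64> {
        let entries = self.tape.entries.borrow();
        let mut adjoints = vec![0.0; entries.len()];
        adjoints[self.index] = 1.0;
        for i in (0..=self.index).rev() {
            let adjoint = adjoints[i];
            for &(parent, partial) in entries[i].iter() {
                adjoints[parent] += adjoint * partial;
            }
        }
        adjoints
    }
}

impl<'a> Mul for Var<'a> {
    type Output = Var<'a>;

    fn mul(self, rhs: Var<'a>) -> Var<'a> {
        // d(u * v)/du = v and d(u * v)/dv = u
        let index = self.tape.push([(self.index, rhs.number), (rhs.index, self.number)]);
        Var { tape: self.tape, index, number: self.number * rhs.number }
    }
}

fn cartesian<'a>(r: Var<'a>, angle: Var<'a>) -> (Var<'a>, Var<'a>) {
    (r * angle.cos(), r * angle.sin())
}
```

With this sketch, `cartesian` is called once with two variables made by `tape.variable`, and then `x.derivatives()` and `y.derivatives()` are each indexed by `r.index` and `a.index`. The memory cost is the tape itself, which grows with every recorded operation.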

Storing matrices, tensors or vecs of Records can be inefficient as it stores the history for each record even when they are the same. Instead, a RecordTensor or RecordMatrix can be used, either directly with their elementwise APIs and trait implementations, or manipulated as an iterator of Records and then collected back into a RecordTensor or RecordMatrix with RecordContainer::from_iter and RecordContainer::from_iters.

§Substitution

There is no need to rewrite the input functions, as you can use the Numeric and Real traits to write a function that will take floating point numbers, Traces and Records.

use easy_ml::differentiation::{Trace, Record, WengertList};
use easy_ml::numeric::Numeric;
use easy_ml::numeric::extra::Real;
fn cartesian<T: Numeric + Real + Copy>(r: T, angle: T) -> (T, T) {
    let x = r * angle.cos();
    let y = r * angle.sin();
    (x, y)
}
let list = WengertList::new();
let r_record = Record::variable(1.0, &list);
let a_record = Record::variable(2.0, &list);
let (x_record, y_record) = cartesian(r_record, a_record);
// find dx/dr using reverse mode automatic differentiation
let x_derivatives = x_record.derivatives();
let dx_dr_reverse = x_derivatives[&r_record];
let (x_trace, y_trace) = cartesian(Trace::variable(1.0), Trace::constant(2.0));
// now find dx/dr with forward automatic differentiation
let dx_dr_forward = x_trace.derivative;
assert_eq!(dx_dr_reverse, dx_dr_forward);
let (x, y) = cartesian(1.0, 2.0);
assert_eq!(x, x_record.number); assert_eq!(x, x_trace.number);
assert_eq!(y, y_record.number); assert_eq!(y, y_trace.number);

§Equivalence

Although in this example the derivatives found are identical, in practice, because forward and reverse mode compute things differently and floating point numbers have limited precision, you should not expect the derivatives to be exactly equal.
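One common way to handle this is to compare derivatives within a tolerance instead of using assert_eq!. The sketch below is a generic illustration, not an easy_ml API: `approx_eq` and the two summing functions are hypothetical helpers, and suitable tolerance values are an assumption that depends on the magnitudes in your own problem. The summing functions show that merely regrouping the same additions can change the result.

```rust
// Hypothetical helper for illustration; not part of easy_ml.
// Treats two values as equal if they are within an absolute tolerance
// or within a tolerance relative to the larger magnitude.
fn approx_eq(a: f64, b: f64, relative_tolerance: f64, absolute_tolerance: f64) -> bool {
    let difference = (a - b).abs();
    let scale = a.abs().max(b.abs());
    difference <= absolute_tolerance || difference <= relative_tolerance * scale
}

// The same three numbers added with different grouping. Floating point
// addition is not associative, so these can round differently.
fn sum_grouped_left(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

fn sum_grouped_right(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

For example, `sum_grouped_left(0.1, 0.2, 0.3)` and `sum_grouped_right(0.1, 0.2, 0.3)` are not exactly equal as f64 values, yet `approx_eq` with tolerances of 1e-12 treats them as equal.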