/*!
 * # Usage of Record and Trace
 *
 * Both `Trace` and `Record`, for forward and reverse automatic differentiation respectively,
 * implement `Numeric` and can generally be treated as normal numbers just like `f32` and `f64`.
 *
 * `Trace` is literally implemented as a dual number, and is more or less a one to one
 * substitution. `Record` requires dynamically building a computational graph of the values
 * and dependencies of each operation performed on them. This means performing operations on
 * records has side effects: each operation adds entries onto a `WengertList`. However, when
 * using `Record` the side effects are abstracted away; just create a `WengertList` before you
 * start creating `Record`s.
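 *
 * To make the dual number idea concrete, here is a minimal hand-rolled sketch of forward
 * mode using only the standard library. The `Dual` type and the `x_and_dx_dr` helper are
 * hypothetical names for illustration; this is not Easy ML's `Trace` implementation and it
 * only covers the two operations needed here.
 *
 * ```rust
 * use std::ops::Mul;
 *
 * // A hypothetical dual number: a value paired with its derivative with respect
 * // to one chosen input.
 * #[derive(Clone, Copy, Debug, PartialEq)]
 * struct Dual {
 *     number: f64,
 *     derivative: f64,
 * }
 *
 * impl Dual {
 *     // A constant does not change with the input, so its derivative is 0.
 *     fn constant(number: f64) -> Dual {
 *         Dual { number, derivative: 0.0 }
 *     }
 *
 *     // The input we differentiate with respect to has a derivative of 1.
 *     fn variable(number: f64) -> Dual {
 *         Dual { number, derivative: 1.0 }
 *     }
 *
 *     // Chain rule: the derivative of cos(u) is -sin(u) * u'
 *     fn cos(self) -> Dual {
 *         Dual {
 *             number: self.number.cos(),
 *             derivative: -self.number.sin() * self.derivative,
 *         }
 *     }
 * }
 *
 * impl Mul for Dual {
 *     type Output = Dual;
 *
 *     // Product rule: the derivative of u * v is u' * v + u * v'
 *     fn mul(self, rhs: Dual) -> Dual {
 *         Dual {
 *             number: self.number * rhs.number,
 *             derivative: self.derivative * rhs.number + self.number * rhs.derivative,
 *         }
 *     }
 * }
 *
 * // Computes x = r * cos(angle), carrying dx/dr along in the derivative field.
 * fn x_and_dx_dr(r: f64, angle: f64) -> Dual {
 *     Dual::variable(r) * Dual::constant(angle).cos()
 * }
 * ```
 *
 * Every arithmetic operation updates the value and the derivative together, which is why
 * forward mode needs no shared state and no list to record onto.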
 *
 * Given some function from N inputs to M outputs you can pass it `Trace`s or `Record`s
 * and retrieve the first derivative from the outputs for all combinations of N and M.
 * If N >> M then you should use `Record` as reverse mode automatic differentiation is
 * much cheaper. If N << M then you should use `Trace` as it will be much cheaper. If
 * you have large N and M, or small N and M, you might have to benchmark to find which
 * method works best. However, most problems have N > M.
 *
 * For this example we use a function which takes two inputs, r and a, and returns two
 * outputs, x and y.
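 *
 * For reference, the four first derivatives of this function can be worked out by hand.
 * The sketch below uses plain `f32`s and no automatic differentiation at all;
 * `cartesian_jacobian` is a hypothetical helper for illustration and is not part of Easy ML.
 *
 * ```rust
 * // x = r * cos(a) and y = r * sin(a), so differentiating by hand gives
 * // dx/dr = cos(a), dx/da = -r * sin(a), dy/dr = sin(a), dy/da = r * cos(a)
 * fn cartesian_jacobian(r: f32, angle: f32) -> [[f32; 2]; 2] {
 *     [
 *         // [dx/dr, dx/da]
 *         [angle.cos(), -r * angle.sin()],
 *         // [dy/dr, dy/da]
 *         [angle.sin(), r * angle.cos()],
 *     ]
 * }
 * ```
 *
 * Each forward mode run with one input as the sole variable fills in one column of this
 * matrix, and each reverse mode `derivatives()` call on one output fills in one row.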
 *
 * ## Using Trace
 *
 * ```
 * use easy_ml::differentiation::Trace;
 * use easy_ml::numeric::extra::Cos;
 * use easy_ml::numeric::extra::Sin;
 * fn cartesian(r: Trace<f32>, angle: Trace<f32>) -> (Trace<f32>, Trace<f32>) {
 *     let x = r * angle.cos();
 *     let y = r * angle.sin();
 *     (x, y)
 * }
 * // first find dx/dr and dy/dr
 * let (x, y) = cartesian(Trace::variable(1.0), Trace::constant(2.0));
 * let dx_dr = x.derivative;
 * let dy_dr = y.derivative;
 * // now find dx/da and dy/da
 * let (x, y) = cartesian(Trace::constant(1.0), Trace::variable(2.0));
 * let dx_da = x.derivative;
 * let dy_da = y.derivative;
 * ```
 *
 * ## Using Record
 *
 * ```
 * use easy_ml::differentiation::{Record, WengertList};
 * use easy_ml::numeric::extra::{Cos, Sin};
 * // the lifetimes tell the rust compiler that our inputs and outputs
 * // can all live as long as the WengertList
 * fn cartesian<'a>(
 *     r: Record<'a, f32>,
 *     angle: Record<'a, f32>
 * ) -> (Record<'a, f32>, Record<'a, f32>) {
 *     let x = r * angle.cos();
 *     let y = r * angle.sin();
 *     (x, y)
 * }
 * // first we must construct a WengertList to create records from
 * let list = WengertList::new();
 * let r = Record::variable(1.0, &list);
 * let a = Record::variable(2.0, &list);
 * let (x, y) = cartesian(r, a);
 * // first find dx/dr and dx/da
 * let x_derivatives = x.derivatives();
 * let dx_dr = x_derivatives[&r];
 * let dx_da = x_derivatives[&a];
 * // now find dy/dr and dy/da
 * let y_derivatives = y.derivatives();
 * let dy_dr = y_derivatives[&r];
 * let dy_da = y_derivatives[&a];
 * ```
 *
 * ## Using Record container
 *
 * ```
 * use easy_ml::differentiation::{Record, RecordTensor, WengertList};
 * use easy_ml::numeric::extra::{Cos, Sin};
 * use easy_ml::tensors::Tensor;
 *
 * // the lifetimes tell the rust compiler that our inputs and outputs
 * // can all live as long as the WengertList
 * fn cartesian<'a>(
 *     r: Record<'a, f32>,
 *     angle: Record<'a, f32>
 * ) -> [Record<'a, f32>; 2] {
 *     let x = r * angle.cos();
 *     let y = r * angle.sin();
 *     [x, y]
 * }
 * // first we must construct a WengertList to create records from
 * let list = WengertList::new();
 * // for this example we also calculate derivatives for 1.5 and 2.5 since you wouldn't use
 * // RecordTensor if you only had a single variable input
 * let R = RecordTensor::variables(&list, Tensor::from([("radius", 2)], vec![ 1.0, 1.5 ]));
 * let A = RecordTensor::variables(&list, Tensor::from([("angle", 2)], vec![ 2.0, 2.5 ]));
 * let (X, Y) = {
 *     let [resultX, resultY] = RecordTensor::from_iters(
 *         [("z", 2)],
 *         R.iter_as_records()
 *             .zip(A.iter_as_records())
 *             // here we operate on each pair of Records as in the prior example, except
 *             // we have to convert to arrays to stream the collection back into RecordTensors
 *             .map(|(r, a)| cartesian(r, a))
 *     );
 *     // we know we can unwrap for this example because we know each iterator contained 2
 *     // elements which matches the shape we're converting back to
 *     (resultX.unwrap(), resultY.unwrap())
 * };
 * // first find dX/dR and dX/dA, we can unwrap because we know X and Y are variables rather than
 * // constants.
 * let X_derivatives = X.derivatives().unwrap();
 * let dX_dR = X_derivatives.map(|d| d.at_tensor(&R));
 * let dX_dA = X_derivatives.map(|d| d.at_tensor(&A));
 * // now find dY/dR and dY/dA
 * let Y_derivatives = Y.derivatives().unwrap();
 * let dY_dR = Y_derivatives.map(|d| d.at_tensor(&R));
 * let dY_dA = Y_derivatives.map(|d| d.at_tensor(&A));
 * ```
 *
 * ## Differences
 *
 * Notice how in the above examples all the same 4 derivatives are found, but in
 * forward mode we rerun the function with a different input as the sole variable and
 * the rest as constants, whereas in reverse mode we rerun the `derivatives()` function
 * on a different output variable. With reverse mode we would only pass constants into
 * the `cartesian` function if we didn't want to get their derivatives (and wanted to avoid
 * wasting memory on something we didn't need).
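 *
 * The reverse mode half of this can be sketched with a hand-rolled tape using only the
 * standard library. The `Tape` type, its methods and `cartesian_gradients` are hypothetical
 * names for illustration, and this is not how Easy ML's `WengertList` is implemented; it
 * only shows why the function runs once and each output then needs just one backward sweep.
 *
 * ```rust
 * // Each entry stores its value and up to two parents as (index, local partial derivative).
 * struct Tape {
 *     values: Vec<f64>,
 *     parents: Vec<[(usize, f64); 2]>,
 * }
 *
 * impl Tape {
 *     fn new() -> Tape {
 *         Tape { values: Vec::new(), parents: Vec::new() }
 *     }
 *
 *     // Appends an entry and returns its index on the tape.
 *     fn push(&mut self, value: f64, parents: [(usize, f64); 2]) -> usize {
 *         self.values.push(value);
 *         self.parents.push(parents);
 *         self.values.len() - 1
 *     }
 *
 *     // An input has no parents; a parent with a partial of 0.0 contributes nothing.
 *     fn variable(&mut self, value: f64) -> usize {
 *         self.push(value, [(0, 0.0), (0, 0.0)])
 *     }
 *
 *     fn mul(&mut self, a: usize, b: usize) -> usize {
 *         let (x, y) = (self.values[a], self.values[b]);
 *         // the partial of x * y is y with respect to x, and x with respect to y
 *         self.push(x * y, [(a, y), (b, x)])
 *     }
 *
 *     fn cos(&mut self, a: usize) -> usize {
 *         let x = self.values[a];
 *         self.push(x.cos(), [(a, -x.sin()), (0, 0.0)])
 *     }
 *
 *     fn sin(&mut self, a: usize) -> usize {
 *         let x = self.values[a];
 *         self.push(x.sin(), [(a, x.cos()), (0, 0.0)])
 *     }
 *
 *     // One backward sweep from `output` yields its derivative with respect to every entry.
 *     fn gradient(&self, output: usize) -> Vec<f64> {
 *         let mut adjoints = vec![0.0; self.values.len()];
 *         adjoints[output] = 1.0;
 *         // entries only depend on earlier entries, so walking backwards visits
 *         // every entry after everything that depends on it
 *         for i in (0..=output).rev() {
 *             for &(parent, partial) in &self.parents[i] {
 *                 adjoints[parent] += adjoints[i] * partial;
 *             }
 *         }
 *         adjoints
 *     }
 * }
 *
 * // Returns [[dx/dr, dx/da], [dy/dr, dy/da]] from one forward pass and two sweeps.
 * fn cartesian_gradients(r: f64, angle: f64) -> [[f64; 2]; 2] {
 *     let mut tape = Tape::new();
 *     let r = tape.variable(r);
 *     let a = tape.variable(angle);
 *     let cos_a = tape.cos(a);
 *     let sin_a = tape.sin(a);
 *     let x = tape.mul(r, cos_a);
 *     let y = tape.mul(r, sin_a);
 *     let dx = tape.gradient(x);
 *     let dy = tape.gradient(y);
 *     [[dx[r], dx[a]], [dy[r], dy[a]]]
 * }
 * ```
 *
 * The tape grows with every operation performed, which is the memory cost reverse mode
 * pays in exchange for getting the derivatives for all inputs from a single sweep.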
 *
 * Storing matrices, tensors or vecs of Records can be inefficient as it stores the history
 * for each record even when they are the same. Instead, a
 * [RecordTensor](crate::differentiation::RecordTensor) or
 * [RecordMatrix](crate::differentiation::RecordMatrix) can be used,
 * either directly with their elementwise APIs and trait implementations or manipulated as an
 * iterator of Records then collected back into a RecordTensor or RecordMatrix with
 * [RecordContainer::from_iter](crate::differentiation::RecordContainer::from_iter)
 * and [RecordContainer::from_iters](crate::differentiation::RecordContainer::from_iters).
 *
 * ## Substitution
 *
 * There is no need to rewrite the input functions, as you can use the `Numeric` and `Real`
 * traits to write a function that will take floating point numbers, `Trace`s and `Record`s.
 *
 * ```
 * use easy_ml::differentiation::{Trace, Record, WengertList};
 * use easy_ml::numeric::extra::Real;
 * fn cartesian<T: Real + Copy>(r: T, angle: T) -> (T, T) {
 *     let x = r * angle.cos();
 *     let y = r * angle.sin();
 *     (x, y)
 * }
 * let list = WengertList::new();
 * let r_record = Record::variable(1.0, &list);
 * let a_record = Record::variable(2.0, &list);
 * let (x_record, y_record) = cartesian(r_record, a_record);
 * // find dx/dr using reverse mode automatic differentiation
 * let x_derivatives = x_record.derivatives();
 * let dx_dr_reverse = x_derivatives[&r_record];
 * let (x_trace, y_trace) = cartesian(Trace::variable(1.0), Trace::constant(2.0));
 * // now find dx/dr with forward automatic differentiation
 * let dx_dr_forward = x_trace.derivative;
 * assert_eq!(dx_dr_reverse, dx_dr_forward);
 * // plain floating point numbers give the same outputs, just without derivatives
 * let (x, y) = cartesian(1.0, 2.0);
 * assert_eq!(x, x_record.number);
 * assert_eq!(x, x_trace.number);
 * assert_eq!(y, y_record.number);
 * assert_eq!(y, y_trace.number);
 * ```
 *
 * ## Equivalence
 *
 * Although in this example the derivatives found are identical, in practice, because
 * forward and reverse mode compute things differently and floating point numbers have
 * limited precision, you should not expect the derivatives to be exactly equal.
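 *
 * When comparing derivatives from the two modes it is therefore safer to check that they
 * agree within a tolerance than to use `assert_eq!`. A minimal sketch using only the
 * standard library, where `approx_eq` and the choice of tolerance are hypothetical and
 * should be tuned for your problem:
 *
 * ```rust
 * // True if `a` and `b` differ by no more than `tolerance`, scaled by the larger
 * // magnitude of the two (or by 1.0 when both are small). Always false for NaN.
 * fn approx_eq(a: f32, b: f32, tolerance: f32) -> bool {
 *     let scale = a.abs().max(b.abs()).max(1.0);
 *     (a - b).abs() <= tolerance * scale
 * }
 * ```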
 */