Struct dfdx::gradients::GradientTape

pub struct GradientTape { /* private fields */ }

Records gradient computations to execute later.

The only two things you can do with this are:

Adding an operation (an operation is a FnOnce that acts on &mut Gradients)
Executing all the operations to produce Gradients
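As a rough illustration of these two capabilities, here is a minimal, self-contained toy model. It is not dfdx's implementation: `ToyTape`, the `Gradients` stand-in keyed by integer ids, `entry`, `get`, and `record_and_execute` are all hypothetical names, and running the operations in reverse recording order is an assumption borrowed from how reverse-mode differentiation usually works.

```rust
use std::collections::HashMap;

/// Toy stand-in for `Gradients`: one gradient buffer per (hypothetical) tensor id.
#[derive(Debug, Default)]
pub struct Gradients {
    data: HashMap<usize, Vec<f32>>,
}

impl Gradients {
    /// Mutable gradient buffer for `id`, zero-filled to `len` on first access.
    pub fn entry(&mut self, id: usize, len: usize) -> &mut Vec<f32> {
        self.data.entry(id).or_insert_with(|| vec![0.0; len])
    }

    /// Read-only view of the gradient for `id`, if any operation touched it.
    pub fn get(&self, id: usize) -> Option<&[f32]> {
        self.data.get(&id).map(|v| v.as_slice())
    }
}

/// Toy tape: nothing more than a list of deferred closures.
#[derive(Default)]
pub struct ToyTape {
    operations: Vec<Box<dyn FnOnce(&mut Gradients)>>,
}

impl ToyTape {
    /// Capability 1: record an operation to run later.
    pub fn add_backward_op<F: FnOnce(&mut Gradients) + 'static>(&mut self, op: F) {
        self.operations.push(Box::new(op));
    }

    /// Capability 2: consume the tape and run every operation on fresh gradients.
    /// Assumption: operations run in reverse recording order.
    pub fn execute(self) -> Gradients {
        let mut grads = Gradients::default();
        for op in self.operations.into_iter().rev() {
            op(&mut grads);
        }
        grads
    }
}

/// Records y = 2 * x (x has id 0, y has id 1) followed by loss = sum(y),
/// then executes the tape.
pub fn record_and_execute() -> Gradients {
    let mut tape = ToyTape::default();
    // Backward op for y = 2 * x: x_grad += 2 * y_grad.
    tape.add_backward_op(|grads| {
        // The toy copies y_grad so it can borrow x_grad mutably afterwards.
        let y_grad = grads.entry(1, 3).to_vec();
        let x_grad = grads.entry(0, 3);
        for (xg, yg) in x_grad.iter_mut().zip(&y_grad) {
            *xg += 2.0 * yg;
        }
    });
    // Backward op for loss = sum(y): y_grad += 1.
    tape.add_backward_op(|grads| {
        for g in grads.entry(1, 3).iter_mut() {
            *g += 1.0;
        }
    });
    tape.execute()
}
```

In this toy, the tape knows nothing about tensor shapes or operation kinds; each closure carries whatever it needs, which is the property the design notes on this page rely on.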

This design forces users to specify gradient computations themselves, as opposed to offering a fixed set of kinds of computations, for the following reasons:

Different tensor sizes. The tensor's size information would have to be stored inside the operation somehow. Instead, the operations themselves must query with a sized tensor, so sizes are still known at compile time instead of dynamically.
Slightly different operations. A fixed set would have to support broadcasting operations and similar variants, which can get needlessly complex.
Optimizations are harder with a fixed set. When operations have control over everything, each one can be optimized by hand separately.

An example of how these two are used is the following, taken from the negate operation (i.e., multiplying all values by -1).

tape.add_backward_op(move |grads| {
    let (t_grad, result_grad) = grads.mut_and_ref(&t, &_result);
    // addmul is equivalent to: t_grad += t.data() * result_grad;
    T::Device::addmul(t_grad, t.data(), result_grad);
});

This implements the chain rule, which is normally defined as gradient(t) += deriv * gradient(result), with the following optimizations:

Instead of allocating new data for the derivative (which is just -1 everywhere), we can reuse the t tensor, since the negate function owns it.
We can combine computing the derivative and multiplying by gradient(result) by just setting t to -gradient(result).

This would not be possible if these chain rule operations were inside of GradientTape!
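To make the two optimizations concrete, the sketch below compares three ways of writing the backward step for negate on plain slices. These functions are illustrative only and are not part of dfdx; the names `negate_backward_naive`, `negate_backward_reuse`, and `negate_backward_fused` are hypothetical, and plain `f32` slices stand in for tensors and devices.

```rust
/// Naive chain rule: allocate the derivative (-1 everywhere), then accumulate
/// t_grad += deriv * result_grad.
pub fn negate_backward_naive(t_grad: &mut [f32], result_grad: &[f32]) {
    let deriv = vec![-1.0f32; result_grad.len()]; // extra allocation
    for ((tg, d), rg) in t_grad.iter_mut().zip(&deriv).zip(result_grad) {
        *tg += d * rg;
    }
}

/// First optimization: the operation owns the input's storage, so it can
/// overwrite that buffer with the derivative instead of allocating a new one.
pub fn negate_backward_reuse(t_grad: &mut [f32], mut t_data: Vec<f32>, result_grad: &[f32]) {
    t_data.iter_mut().for_each(|v| *v = -1.0);
    for ((tg, d), rg) in t_grad.iter_mut().zip(&t_data).zip(result_grad) {
        *tg += d * rg;
    }
}

/// Second optimization: fold the derivative into the update, so no derivative
/// buffer is needed at all: t_grad -= result_grad.
pub fn negate_backward_fused(t_grad: &mut [f32], result_grad: &[f32]) {
    for (tg, rg) in t_grad.iter_mut().zip(result_grad) {
        *tg -= rg;
    }
}
```

All three produce the same gradient for the same inputs; they differ only in how much temporary storage and arithmetic they need, which is the kind of per-operation choice a fixed set of tape operations could not easily express.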

Implementations

impl GradientTape

pub fn execute(self) -> Gradients

Compute the Gradients! This just runs all the operations on a new Gradients struct.

Note that this method takes ownership of self, so it can’t be called twice!

pub fn append(&mut self, other: &mut Self)

Moves all the operations from other into self. Leaves other empty.
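The described behavior matches that of `Vec::append` in the standard library. The sketch below shows the same semantics on a toy tape, assuming (hypothetically) that operations are stored in a `Vec`; `MiniTape`, `Op`, `push`, `len`, and `merge_demo` are made-up names for illustration and are not dfdx APIs.

```rust
/// Hypothetical operation type: a deferred closure acting on a log of numbers.
pub type Op = Box<dyn FnOnce(&mut Vec<u32>)>;

/// Toy tape that stores its operations in a `Vec`.
#[derive(Default)]
pub struct MiniTape {
    operations: Vec<Op>,
}

impl MiniTape {
    /// Record one deferred operation.
    pub fn push(&mut self, op: impl FnOnce(&mut Vec<u32>) + 'static) {
        self.operations.push(Box::new(op));
    }

    /// Move every operation from `other` to the end of `self`, leaving `other` empty.
    pub fn append(&mut self, other: &mut Self) {
        self.operations.append(&mut other.operations);
    }

    /// Number of recorded operations.
    pub fn len(&self) -> usize {
        self.operations.len()
    }
}

/// Merges a two-operation tape into a one-operation tape and returns
/// (operations in the merged tape, operations left in the other tape).
pub fn merge_demo() -> (usize, usize) {
    let mut main_tape = MiniTape::default();
    main_tape.push(|log| log.push(1));

    let mut branch = MiniTape::default();
    branch.push(|log| log.push(2));
    branch.push(|log| log.push(3));

    main_tape.append(&mut branch);
    (main_tape.len(), branch.len())
}
```

Because `append` takes `other` by mutable reference rather than by value, the emptied tape remains usable afterwards and can record new operations.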

Trait Implementations

impl Debug for GradientTape

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for GradientTape

fn default() -> GradientTape

Returns the “default value” for a type.

Auto Trait Implementations

impl !RefUnwindSafe for GradientTape

impl !Send for GradientTape

impl !Sync for GradientTape

impl Unpin for GradientTape

impl !UnwindSafe for GradientTape

Blanket Implementations

impl<T> Any for T where
T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where
T: ?Sized,

const: unstable

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where
T: ?Sized,

const: unstable

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

const: unstable

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where
U: From<T>,

const: unstable

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T where
U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

const: unstable

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where
U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

const: unstable

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T where
V: MultiLane<T>,

fn vzip(self) -> V