Mushin

Mushin is a pure Rust, no-unsafe library for computing gradients on dynamic computational graphs using reverse-mode automatic differentiation. In other words, Mushin is to Rust what PyTorch is to Python.

All tensor operations use the excellent ArrayFire library as a backend, which means Mushin can perform computations on any device (Nvidia CUDA GPUs, OpenCL, Intel MKL…). Plus, all operations are checked at compile time for mathematical correctness; for example, you won't be able to add two tensors of different shapes or dimensions. The shape of the resulting tensor of every operation is tracked through the computation graph, so in that regard we can offer a guarantee that TensorFlow or PyTorch can't: if it compiles, your computation graph is guaranteed to be correct.
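As a minimal sketch of that compile-time checking (using the constructors shown in the Usage section below), the following would be rejected by the compiler because the two tensors have different shapes:

use mushin as mu;
use mu::Tensor;

let a = mu::fill::<1, 1, 3, 3>(1.0);
let b = mu::fill::<1, 1, 2, 3>(1.0);
// The next line does not compile: a 3x3 tensor cannot be added to a 2x3 one
// let c = a.add(&b);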

Usage

use mushin as mu;
use mu::Tensor;

// Constant input (2x3), frozen so no gradient is tracked for it
let x = mu::eye::<1, 1, 2, 3>(3.0).freeze();
// Trainable parameters: weights (3x2) and bias (3x3)
let w = mu::randn::<1, 1, 3, 2>();
let b = mu::fill::<1, 1, 3, 3>(0.0);

// Forward pass, eagerly evaluated: z = Wx + b
let z = w.mm(&x).add(&b);
// Reverse pass: accumulate gradients for z's ancestor variables
z.backward();

let dz_dw = w.grad();
let dz_db = b.grad();

The code above is an example of a perceptron neural network layer, where we have an input (x) that we treat as a constant and a set of variable (trainable) parameters (w, b). We then compute the output (z) as Wx + b. All the operations are eagerly evaluated, so the resulting tensor values are available at any time. Compared to lazy evaluation, this has the benefit that the built computation graph is truly dynamic, i.e. your graph operations can depend on the results of previous operations.
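As a sketch of what that dynamism allows (the runtime-chosen depth here is an assumption for illustration), the number of nodes in the graph can depend on values only known while the program runs:

use mushin as mu;
use mu::Tensor;

let depth = 3; // imagine this value is only known at runtime
let mut z = mu::randn::<1, 1, 3, 3>();
for _ in 0..depth {
    // each iteration eagerly evaluates and appends a new node to the graph
    z = z.mm(&mu::randn::<1, 1, 3, 3>());
}
z.backward();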

Mushin automatically keeps track of all the operations performed up until any given variable, and calling backward() on one of them traverses the computation graph in reverse mode to accumulate the gradients of all of its ancestor variables. By using the grad() method on any of them, we can then retrieve their gradients as new Variable tensors, which in turn can be used to compute further gradients!
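A minimal sketch of that last point, assuming (as stated above) that grad() returns an ordinary Variable tensor that can join new computations:

use mushin as mu;
use mu::Tensor;

let w = mu::randn::<1, 1, 3, 3>();
let z = w.mm(&w);
z.backward();

// The gradient is itself a Variable tensor...
let dz_dw = w.grad();
// ...so it can take part in further differentiable computations
let y = dz_dw.mm(&w);
y.backward();
let dy_dw = w.grad();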

It is quite possible the reader is more interested in the Deep Learning utilities of this library than in the raw auto-grad foundations. By default, Mushin includes the nn module, which provides optimizers, activation functions, layers and losses ready to use to build neural network modules. Check out the module docs for instructions on how to use them.

Modules

nn: This module exposes tooling for Deep Learning built upon the Mushin auto-grad core

Traits

Tensor: Defines operations on tensors, either Constant or Variable

Functions

Creates a Variable tensor from the given array of values

eye: Creates a Variable tensor with the main diagonal filled with the given value, 0 everywhere else

fill: Creates a Variable tensor filled with the given value

randn: Creates a Variable tensor with random values taken from a normal distribution centered at 0

Creates a Variable tensor with random values taken from a uniform distribution over the interval [0, 1]