Mushin
Mushin is a pure Rust, no-unsafe library for computing gradients on dynamic computational graphs using reverse-mode automatic differentiation. In other words, what PyTorch is to Python, Mushin is to Rust.
All the operations on tensors use the excellent arrayfire library as a backend, which means Mushin can perform computations on any device (Nvidia CUDA GPUs, OpenCL, Intel MKL…). Plus, all operations are checked at compile time for mathematical correctness, i.e. you won't be able to add two tensors of different shapes/dimensions. The shape of the resulting tensor of every operation is tracked through the computation graph, so in that regard Mushin can offer a guarantee that TensorFlow or PyTorch can't: if it compiles, your computation graph is guaranteed to be correct.
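As a minimal sketch of what the compiler enforces, the snippet below reuses the fill constructor and add operation from the Usage example further down, assuming the same const-generic shape layout used there (the last two parameters being rows and columns):
use mushin as mu;
use mu::Tensor;

let a = mu::fill::<1, 1, 3, 2>(1.0);
let b = mu::fill::<1, 1, 3, 2>(2.0);
let c = a.add(&b); // Compiles: both operands are 3x2
// let d = mu::fill::<1, 1, 2, 3>(2.0);
// let e = a.add(&d); // Rejected at compile time: 3x2 + 2x3 shapes don't match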
Usage
use mushin as mu;
use mu::Tensor;
let x = mu::eye::<1, 1, 2, 3>(3.0).freeze();
let w = mu::randn::<1, 1, 3, 2>();
let b = mu::fill::<1, 1, 3, 3>(0.0);
let z = w.mm(&x).add(&b);
z.backward();
let dz_dw = w.grad();
let dz_db = b.grad();
The code above is an example of a perceptron neural network layer, where we have an input (x) that we treat as a constant, and a set of variable (trainable) parameters (w, b). We then compute the output (z) as WX + b. All the operations are eagerly evaluated, so the resulting tensor values are available at any time. Compared to lazy evaluation, this has the benefit that the built computation graph is truly dynamic, i.e. your graph operations can depend on the results of previous operations.
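To illustrate that last point, here is a minimal sketch continuing from the example above (imports as before). Because evaluation is eager, ordinary Rust control flow can decide the shape of the graph while it runs; for simplicity the sketch branches on a plain Rust bool rather than on a tensor value:
fn forward(deep: bool) {
    let x = mu::eye::<1, 1, 2, 3>(3.0).freeze();
    let w = mu::randn::<1, 1, 3, 2>();
    let b = mu::fill::<1, 1, 3, 3>(0.0);
    let z = w.mm(&x).add(&b);
    // The graph recorded up to this point differs depending on the branch taken:
    let out = if deep { z.add(&b) } else { z };
    out.backward();
}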
Mushin automatically keeps track of all the operations performed up until any given variable, and calling backward() on one of them traverses the computation graph in reverse mode to accumulate the gradients of all of its ancestor variables. By using the grad() method on any of them we can then retrieve their gradients as new Variable tensors, which in turn can be used to compute further gradients!
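As a hedged sketch of computing further gradients, continuing from the Usage example and assuming, as stated above, that grad() returns an ordinary Variable tensor:
let dz_db = b.grad();  // d(z)/d(b), itself a Variable tensor
let y = dz_db.add(&b); // gradients can take part in new operations...
y.backward();          // ...and be differentiated in turn
let dy_db = b.grad();  // gradient of b accumulated through the new graph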
It is quite possible the reader is more interested in the Deep Learning utilities of this library rather than the raw auto-grad foundations. By default, Mushin includes the nn module, which provides optimizers, activation functions, layers and losses ready to use to build neural network modules. Check out the module docs for instructions on how to use them.
Modules
nn: This module exposes tooling for Deep Learning built upon the Mushin auto-grad core.
Traits
Defines operations on tensors, either Constant or Variable
Functions
Creates a Variable tensor from the given array of values
Creates a Variable tensor with the main diagonal filled with the given value, 0 everywhere else
Creates a Variable tensor filled with the given value
Creates a Variable tensor with random values taken from a normal distribution centered at 0
Creates a Variable tensor with random values taken from a uniform distribution over [0, 1]
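As a quick illustration, here is a hedged sketch exercising the three constructors that also appear in the Usage example (eye, fill and randn), with the same const-generic shape layout assumed there:
use mushin as mu;

let i = mu::eye::<1, 1, 3, 3>(1.0);  // 1.0 on the main diagonal, 0 elsewhere
let f = mu::fill::<1, 1, 3, 3>(0.5); // every element set to 0.5
let r = mu::randn::<1, 1, 3, 3>();   // samples from a normal centered at 0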