Crate dfdx


Ergonomics & safety focused deep learning in Rust. Main features include:

  1. Tensor library with shapes up to 6d!
  2. Shapes with both compile and runtime sized dimensions. (e.g. Tensor<(usize, Const<10>)> and Tensor<Rank2<5, 10>>)
  3. A large library of tensor operations (including matmul, conv2d, and much more).
     a. All tensor operations shape and type checked at compile time!!
  4. Ergonomic neural network building blocks (like Linear, Conv2D, and Transformer).
  5. Standard deep learning optimizers such as Sgd, Adam, AdamW, RMSprop, and more.
  6. Reverse mode auto differentiation implementation.
  7. Serialization to/from .npy and .npz for transferring models to/from python.
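
To give a feel for how compile-time and runtime sized dimensions can share one type, here is a minimal std-only sketch. It is not dfdx's implementation: `Dim`, `Matrix`, `filled`, and `matmul` are hypothetical names chosen for illustration, and only `Const` mirrors a name used above.

```rust
// Illustrative sketch only; these are hypothetical types, not dfdx's.

/// A dimension whose size is part of the type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Const<const N: usize>;

/// Any dimension that can report its size, known statically or not.
pub trait Dim: Copy {
    fn size(&self) -> usize;
}

impl<const N: usize> Dim for Const<N> {
    fn size(&self) -> usize {
        N
    }
}

impl Dim for usize {
    fn size(&self) -> usize {
        *self
    }
}

/// A row-major 2d matrix whose shape is carried in its type parameters.
pub struct Matrix<R: Dim, C: Dim> {
    pub rows: R,
    pub cols: C,
    pub data: Vec<f32>,
}

impl<R: Dim, C: Dim> Matrix<R, C> {
    pub fn filled(rows: R, cols: C, value: f32) -> Self {
        let data = vec![value; rows.size() * cols.size()];
        Matrix { rows, cols, data }
    }

    pub fn shape(&self) -> (usize, usize) {
        (self.rows.size(), self.cols.size())
    }

    /// The inner dimension `C` must be the same type on both sides, so
    /// multiplying `Const<3>` columns by `Const<4>` rows does not compile.
    pub fn matmul<N: Dim>(&self, rhs: &Matrix<C, N>) -> Matrix<R, N> {
        let (m, k) = self.shape();
        let n = rhs.cols.size();
        // Two runtime (`usize`) dims share a type, so their values still
        // need a runtime check.
        assert_eq!(k, rhs.rows.size(), "inner dimensions must agree");
        let mut out = Matrix::filled(self.rows, rhs.cols, 0.0);
        for i in 0..m {
            for j in 0..n {
                for p in 0..k {
                    out.data[i * n + j] += self.data[i * k + p] * rhs.data[p * n + j];
                }
            }
        }
        out
    }
}
```

In this sketch, `Matrix::filled(Const::<2>, Const::<3>, 1.0)` can be multiplied by `Matrix::filled(Const::<3>, 4usize, 2.0)`, giving a matrix with two compile-time rows and four runtime columns.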

A quick tutorial

  1. Tensors (crate::tensor::Tensor) can be created from normal Rust arrays. See crate::tensor.
let dev: Cpu = Default::default();
let x = dev.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let y: Tensor<Rank2<2, 3>, f32, Cpu> = dev.ones();
// Runtime shape
let z: Tensor<(usize, Const<3>), f32, _> = dev.ones_like(&(10, Const));
  2. Neural networks are built with types. Tuples are sequential models. See crate::nn.
type Mlp = (
    Linear<5, 3>,
    Linear<3, 2>,
);
  3. Instantiate models with crate::nn::DeviceBuildExt
let dev: Cpu = Default::default();
type Model = (Linear<5, 2>, ReLU);
let mlp = dev.build_module::<Model, f32>();
  4. Pass data through networks with crate::nn::Module
let x: Tensor<Rank1<5>, f32, _> = dev.zeros();
let y = mlp.forward(x); // compiler infers that `y` must be `Tensor<Rank1<2>>`
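
The idea that a tuple of layers is itself a sequential model can be sketched with plain traits and fixed-size arrays. This is an assumption-laden illustration, not the crate's code: the `Module` trait below is a simplified stand-in, and this `Linear` and `ReLU` operate on `[f32; N]` arrays rather than tensors.

```rust
// Illustrative sketch only; simplified stand-ins for the real traits and layers.

/// Something that maps an input to an output.
pub trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// A dense layer from `I` inputs to `O` outputs.
pub struct Linear<const I: usize, const O: usize> {
    pub weight: [[f32; I]; O],
    pub bias: [f32; O],
}

impl<const I: usize, const O: usize> Module<[f32; I]> for Linear<I, O> {
    type Output = [f32; O];
    fn forward(&self, x: [f32; I]) -> [f32; O] {
        // Start from the bias, then accumulate weight * input per output row.
        let mut y = self.bias;
        for o in 0..O {
            for i in 0..I {
                y[o] += self.weight[o][i] * x[i];
            }
        }
        y
    }
}

/// Elementwise max(0, x).
pub struct ReLU;

impl<const N: usize> Module<[f32; N]> for ReLU {
    type Output = [f32; N];
    fn forward(&self, x: [f32; N]) -> [f32; N] {
        x.map(|v| v.max(0.0))
    }
}

/// A pair of modules runs the first, then feeds its output to the second.
/// The compiler checks that `A`'s output type is a valid input for `B`.
impl<In, A, B> Module<In> for (A, B)
where
    A: Module<In>,
    B: Module<A::Output>,
{
    type Output = B::Output;
    fn forward(&self, input: In) -> B::Output {
        self.1.forward(self.0.forward(input))
    }
}
```

With these definitions the output type of `(Linear<3, 2>, ReLU)` is inferred as `[f32; 2]`, and pairing layers whose sizes do not line up is rejected at compile time.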
  5. Trace gradients using crate::tensor::Trace::trace()
// allocate gradients [ZeroGrads::alloc_grads]
let grads = mlp.alloc_grads();

// tensors default to not having a tape
let x: Tensor<Rank1<10>, f32, Cpu, NoneTape> = dev.zeros();

// `.trace()` clones `x` and inserts a gradient tape.
let x_traced: Tensor<Rank1<10>, f32, Cpu, OwnedTape<f32, Cpu>> = x.trace(grads);

// The tape from the input is moved through the network during .forward().
let y: Tensor<Rank1<5>, f32, Cpu, NoneTape> = mlp.forward(x);
let y_traced: Tensor<Rank1<5>, f32, Cpu, OwnedTape<f32, Cpu>> = mlp.forward(x_traced);
  6. Compute gradients with crate::tensor_ops::Backward. See crate::tensor_ops.
// compute cross entropy loss
let loss = cross_entropy_with_logits_loss(y, y_true);

// call `backward()` to compute gradients. The tensor *must* have `OwnedTape`!
let mut gradients: Gradients<f32, Cpu> = loss.backward();
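
For readers new to reverse mode differentiation, the following scalar sketch shows the general technique of recording operations on a tape and replaying them backwards. It is a generic illustration, not a description of how this crate stores its tape: `Tape`, `Var`, `leaf`, `add`, and `mul` are hypothetical names.

```rust
// Illustrative sketch only; a scalar tape, unrelated to the crate's internals.

/// One recorded operation: its two parents and the local partial
/// derivative of the result with respect to each parent.
struct Node {
    parents: [usize; 2],
    partials: [f32; 2],
}

/// Handle to a value stored on the tape.
#[derive(Clone, Copy)]
pub struct Var(pub usize);

pub struct Tape {
    nodes: Vec<Node>,
    values: Vec<f32>,
}

impl Tape {
    pub fn new() -> Self {
        Tape { nodes: Vec::new(), values: Vec::new() }
    }

    fn push(&mut self, value: f32, parents: [usize; 2], partials: [f32; 2]) -> Var {
        self.nodes.push(Node { parents, partials });
        self.values.push(value);
        Var(self.nodes.len() - 1)
    }

    /// An input: it points at itself with zero partials.
    pub fn leaf(&mut self, value: f32) -> Var {
        let i = self.nodes.len();
        self.push(value, [i, i], [0.0, 0.0])
    }

    pub fn add(&mut self, a: Var, b: Var) -> Var {
        let (va, vb) = (self.values[a.0], self.values[b.0]);
        self.push(va + vb, [a.0, b.0], [1.0, 1.0])
    }

    pub fn mul(&mut self, a: Var, b: Var) -> Var {
        let (va, vb) = (self.values[a.0], self.values[b.0]);
        // d(a*b)/da = b and d(a*b)/db = a
        self.push(va * vb, [a.0, b.0], [vb, va])
    }

    pub fn value(&self, v: Var) -> f32 {
        self.values[v.0]
    }

    /// Walk the tape from `output` back to the inputs, accumulating the
    /// chain rule. Returns d(output)/d(node) for every node index.
    pub fn backward(&self, output: Var) -> Vec<f32> {
        let mut grads = vec![0.0; self.nodes.len()];
        grads[output.0] = 1.0;
        for i in (0..=output.0).rev() {
            let node = &self.nodes[i];
            let g = grads[i];
            for k in 0..2 {
                grads[node.parents[k]] += node.partials[k] * g;
            }
        }
        grads
    }
}
```

For `f = x * y + x`, the backward pass over this tape yields `y + 1` for `x` and `x` for `y`, each computed in a single reverse sweep.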
  7. Use an optimizer from crate::optim to optimize your network!
// Use stochastic gradient descent (Sgd), with a learning rate of 1e-2, and 0.9 momentum.
let mut opt = Sgd::new(&mlp, SgdConfig {
    lr: 1e-2,
    momentum: Some(Momentum::Classic(0.9)),
    weight_decay: None,
});

// pass the gradients & the mlp into the optimizer's update method
opt.update(&mut mlp, &gradients);
mlp.zero_grads(&mut gradients);
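
As a rough picture of what an update-then-zero cycle does to a flat parameter buffer, here is a std-only sketch of SGD with momentum. It assumes the common formulation `v = momentum * v + grad; param -= lr * v`, which may differ in detail from the crate's `Momentum::Classic`. `SgdMomentum` and the free function `zero_grads` are hypothetical names.

```rust
// Illustrative sketch only; assumes v = momentum * v + grad, p -= lr * v.

pub struct SgdMomentum {
    pub lr: f32,
    pub momentum: f32,
    /// One velocity entry per parameter, carried across updates.
    velocity: Vec<f32>,
}

impl SgdMomentum {
    pub fn new(lr: f32, momentum: f32, num_params: usize) -> Self {
        SgdMomentum { lr, momentum, velocity: vec![0.0; num_params] }
    }

    /// Apply one optimization step in place.
    pub fn update(&mut self, params: &mut [f32], grads: &[f32]) {
        assert_eq!(params.len(), grads.len());
        assert_eq!(params.len(), self.velocity.len());
        let (lr, momentum) = (self.lr, self.momentum);
        for ((p, g), v) in params.iter_mut().zip(grads).zip(self.velocity.iter_mut()) {
            *v = momentum * *v + *g;
            *p -= lr * *v;
        }
    }
}

/// Reset accumulated gradients so the next backward pass starts from zero.
pub fn zero_grads(grads: &mut [f32]) {
    grads.fill(0.0);
}
```

Calling `update` twice with the same gradient moves the parameter further on the second call, because the velocity carries over between steps.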