Ergonomics & safety focused deep learning in Rust. Main features include:

- Tensor library with shapes up to 6d!
- Shapes with both compile and runtime sized dimensions (e.g. `Tensor<(usize, Const<10>)>` and `Tensor<Rank2<5, 10>>`).
- A large library of tensor operations (including `matmul`, `conv2d`, and much more).
  - All tensor operations shape and type checked at compile time!
- Ergonomic neural network building blocks (like `Linear`, `Conv2D`, and `Transformer`).
- Standard deep learning optimizers such as `Sgd`, `Adam`, `AdamW`, `RMSprop`, and more.
- Reverse mode auto differentiation implementation.
- Serialization to/from `.npy` and `.npz` for transferring models to/from python (a save/load sketch follows this list).
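For example, transferring a model to or from python might look like the following. This is a minimal sketch, assuming the `numpy` feature is enabled and that the `.save()`/`.load()` methods mentioned in the `nn` module below are in scope via the prelude:

```rust
use dfdx::prelude::*;

let dev: Cpu = Default::default();
type Model = (Linear<5, 2>, ReLU);
let mut mlp = dev.build_module::<Model, f32>();

// write all parameters into a numpy-compatible `.npz` archive...
mlp.save("model.npz").expect("failed to save");
// ...and read them back, e.g. after editing the weights in python
mlp.load("model.npz").expect("failed to load");
```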
A quick tutorial
- `crate::tensor::Tensor`s can be created with normal Rust arrays. See `crate::tensor`.

```rust
let dev: Cpu = Default::default();
let x = dev.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let y: Tensor<Rank2<2, 3>, f32, Cpu> = dev.ones();
// Runtime shape
let z: Tensor<(usize, Const<3>), f32, _> = dev.ones_like(&(10, Const));
```
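The compile-time shape checking mentioned in the feature list applies to tensor operations as well. A small sketch using `matmul` (the inner dimensions must agree, and the compiler infers the output shape):

```rust
use dfdx::prelude::*;

let dev: Cpu = Default::default();
let a: Tensor<Rank2<2, 3>, f32, _> = dev.ones();
let b: Tensor<Rank2<3, 4>, f32, _> = dev.ones();
// (2, 3) x (3, 4) -> (2, 4); a mismatched inner dimension fails to compile
let c: Tensor<Rank2<2, 4>, f32, _> = a.matmul(b);
```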
- Neural networks are built with types. Tuples are sequential models. See `crate::nn`.

```rust
type Mlp = (
    Linear<5, 3>,
    ReLU,
    Linear<3, 2>,
);
```
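Since tuples of modules are themselves modules, they can be nested to group layers into reusable blocks. A small sketch (the `Block`/`DeepMlp` names are purely illustrative):

```rust
use dfdx::prelude::*;

// a reusable sub-network
type Block = (Linear<5, 5>, ReLU);
// nested tuples compose sequentially, just like flat ones
type DeepMlp = (Block, Block, Linear<5, 2>);
```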
- Instantiate models with `crate::nn::DeviceBuildExt`.

```rust
let dev: Cpu = Default::default();
type Model = (Linear<5, 2>, ReLU);
// `mut` so the optimizer can update the parameters later in this tutorial
let mut mlp = dev.build_module::<Model, f32>();
```
- Pass data through networks with `crate::nn::Module`.

```rust
let x: Tensor<Rank1<5>, f32, _> = dev.zeros();
let y = mlp.forward(x); // compiler infers that `y` must be `Tensor<Rank1<2>>`
```
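Modules also accept batched inputs, with the output batch dimension inferred alongside the feature dimension. A small sketch, assuming `Linear` accepts 2d inputs with the leading dimension as the batch:

```rust
use dfdx::prelude::*;

let dev: Cpu = Default::default();
type Model = (Linear<5, 2>, ReLU);
let mlp = dev.build_module::<Model, f32>();

// a batch of 8 inputs; the batch dimension passes through unchanged
let x: Tensor<Rank2<8, 5>, f32, _> = dev.zeros();
let y: Tensor<Rank2<8, 2>, f32, _> = mlp.forward(x);
```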
- Trace gradients using `crate::tensor::Trace::trace()`.

```rust
// allocate gradients with `ZeroGrads::alloc_grads`
let grads = mlp.alloc_grads();
// tensors default to not having a tape
let x: Tensor<Rank1<5>, f32, Cpu, NoneTape> = dev.zeros();
// `.trace()` clones `x` and inserts a gradient tape.
let x_traced: Tensor<Rank1<5>, f32, Cpu, OwnedTape<f32, Cpu>> = x.trace(grads);
// The tape from the input is moved through the network during `.forward()`.
let y: Tensor<Rank1<2>, f32, Cpu, NoneTape> = mlp.forward(x);
let y_traced: Tensor<Rank1<2>, f32, Cpu, OwnedTape<f32, Cpu>> = mlp.forward(x_traced);
```
- Compute gradients with `crate::tensor_ops::Backward`. See `crate::tensor_ops`.

```rust
// compute cross entropy loss against a target distribution
let y_true: Tensor<Rank1<2>, f32, _> = dev.tensor([1.0, 0.0]);
let loss = cross_entropy_with_logits_loss(y_traced, y_true);
// call `backward()` to compute gradients. The tensor *must* have `OwnedTape`!
let mut gradients: Gradients<f32, Cpu> = loss.backward();
```
- Use an optimizer from `crate::optim` to optimize your network!

```rust
// Use stochastic gradient descent (Sgd), with a learning rate of 1e-2, and 0.9 momentum.
let mut opt = Sgd::new(&mlp, SgdConfig {
    lr: 1e-2,
    momentum: Some(Momentum::Classic(0.9)),
    weight_decay: None,
});
// pass the gradients & the mlp into the optimizer's update method
opt.update(&mut mlp, &gradients);
mlp.zero_grads(&mut gradients);
```
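Putting the steps above together, one training iteration looks roughly like the following. This is a minimal sketch, assuming the same `Model` as above, a single hard-coded input/target pair standing in for real data, and that `update` returns a `Result`:

```rust
use dfdx::prelude::*;

let dev: Cpu = Default::default();
type Model = (Linear<5, 2>, ReLU);
let mut mlp = dev.build_module::<Model, f32>();
let mut grads = mlp.alloc_grads();
let mut opt = Sgd::new(&mlp, SgdConfig::default());

let x: Tensor<Rank1<5>, f32, _> = dev.sample_normal();
let y_true: Tensor<Rank1<2>, f32, _> = dev.tensor([1.0, 0.0]);

for _step in 0..5 {
    // forward with a gradient tape, compute the loss, then backprop
    let logits = mlp.forward(x.clone().trace(grads));
    let loss = cross_entropy_with_logits_loss(logits, y_true.clone());
    grads = loss.backward();
    // apply the gradients, then zero them for the next step
    opt.update(&mut mlp, &grads).unwrap();
    mlp.zero_grads(&mut grads);
}
```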
Modules
- `data`: A collection of useful data utilities such as `ExactSizeDataset`, `OneHotEncode`, `Arange`, and iterator extension traits!
- `feature_flags`: Information about the available feature flags.
- `losses`: Standard loss functions such as `mse_loss()`, `cross_entropy_with_logits_loss()`, and more.
- `nn`: High level neural network building blocks such as `modules::Linear`, activations, and tuples as `Module`s. Also includes `.save()` & `.load()` for all `Module`s.
- `prelude`: Contains a subset of all public exports.
Functions
- `flush_denormals_to_zero()`: Sets a CPU `sse` flag to flush denormal floating point numbers to zero. The opposite of this is `keep_denormals()`.
- `keep_denormals()`: Sets a CPU flag to keep denormal floating point numbers. The opposite of this is `flush_denormals_to_zero()`.
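For instance, to trade a little precision for faster CPU math during a run (a minimal sketch; both functions live at the crate root):

```rust
// flush denormal floats to zero for faster CPU math
dfdx::flush_denormals_to_zero();

// ... run training or inference ...

// restore the default handling of denormal floats
dfdx::keep_denormals();
```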