Crate dfdx


dfdx is a CUDA-accelerated tensor and neural network library, written entirely in Rust!

Additionally, it can track compile-time shapes across tensor operations, so shape mismatches in your neural networks are caught at compile time.

The following sections cover some high-level core concepts and examples; more detailed documentation lives in each of dfdx’s submodules.

See feature_flags for details on feature flags.

Shapes & Tensors

See dtypes, shapes, and tensor for more information.

At its core a tensor::Tensor is just an nd-array. Just like Rust arrays, it has two parts:

  1. Shape (shapes)
  2. Dtype (dtypes)

dfdx represents shapes as tuples of dimensions (shapes::Dim), where a dimension can either be known at:

  1. Compile time: shapes::Const<M>
  2. Run time: usize

You can freely mix and match these dimensions together. Here are some example shapes:

  • () - unit shape
  • (usize,) - 1d shape with a runtime known dimension
  • (usize, Const<5>) - 2d shape with both types of dimensions
  • (Const<3>, usize, Const<5>) - 3d shape!
  • Rank3<3, 5, 7> - Equivalent to (Const<3>, Const<5>, Const<7>)
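
To make the mixing of dimension kinds a bit more concrete, here is a minimal std-only sketch of how compile-time and run-time dimensions could share one interface. The Const, Dim and Shape items below are local stand-ins written for this illustration (and only cover ranks 0 to 3); dfdx's real definitions live in shapes and are more general.

```rust
// Toy model of mixed compile-time / run-time dimensions (not dfdx's real code).

/// A dimension whose size is carried in the type.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Const<const M: usize>;

/// Anything that can report the size of one dimension.
pub trait Dim: Copy {
    fn size(&self) -> usize;
}

impl<const M: usize> Dim for Const<M> {
    fn size(&self) -> usize {
        M // known at compile time; the value itself is zero-sized
    }
}

impl Dim for usize {
    fn size(&self) -> usize {
        *self // known only at run time
    }
}

/// A shape is a tuple of dimensions.
pub trait Shape {
    fn num_elements(&self) -> usize;
}

impl Shape for () {
    fn num_elements(&self) -> usize {
        1 // the unit shape holds a single scalar
    }
}

impl<D1: Dim> Shape for (D1,) {
    fn num_elements(&self) -> usize {
        self.0.size()
    }
}

impl<D1: Dim, D2: Dim> Shape for (D1, D2) {
    fn num_elements(&self) -> usize {
        self.0.size() * self.1.size()
    }
}

impl<D1: Dim, D2: Dim, D3: Dim> Shape for (D1, D2, D3) {
    fn num_elements(&self) -> usize {
        self.0.size() * self.1.size() * self.2.size()
    }
}
```

In this sketch a shape made only of Const dimensions occupies no memory at run time, while every usize dimension costs one machine word.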

Here are some comparisons between representing nd arrays in Rust vs dfdx:

Rust array     | dfdx Tensor
f32            | Tensor<(), f32, …>
[u32; 5]       | Tensor<Rank1<5>, u32, …>
[[u8; 3]; 2]   | Tensor<Rank2<2, 3>, u8, …>
Vec<[bool; 5]> | Tensor<(usize, Const<5>), bool, …>

The Rank1 and Rank2 shapes used above are type aliases for shapes whose dimensions are all known at compile time.
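
The alias idea can be sketched as follows. This is an illustration based on the equivalence stated earlier for Rank3, not a copy of dfdx's source: Const is redefined locally so the snippet compiles on its own, and takes_tuple and aliases_are_the_same_type are hypothetical helpers.

```rust
// Stand-in for dfdx's `Const`, defined locally so the sketch compiles alone.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Const<const M: usize>;

// Aliases in the spirit of `Rank1`, `Rank2` and `Rank3`.
pub type Rank1<const M: usize> = (Const<M>,);
pub type Rank2<const M: usize, const N: usize> = (Const<M>, Const<N>);
pub type Rank3<const M: usize, const N: usize, const O: usize> =
    (Const<M>, Const<N>, Const<O>);

/// Accepts the tuple spelled out explicitly.
pub fn takes_tuple(_shape: (Const<2>, Const<3>)) -> &'static str {
    "2x3"
}

/// Passes an alias-typed value to `takes_tuple`.
/// This compiles because an alias and its expansion are the same type.
pub fn aliases_are_the_same_type() -> &'static str {
    let shape: Rank2<2, 3> = Default::default();
    takes_tuple(shape)
}
```

Because these are plain type aliases, no conversion is involved: a Rank2<2, 3> value can be used anywhere a (Const<2>, Const<3>) is expected.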

Allocating tensors with Devices

See tensor for more information.

Devices are used to allocate tensors (and neural networks!). They are akin to std::alloc::GlobalAlloc in Rust - they allocate memory. They are also used to execute tensor ops, which we will get to later on.

There are two options for this currently, with more planned to be added in the future:

  1. tensor::Cpu - for tensors stored on the heap
  2. tensor::Cuda - for tensors stored in GPU memory

Both devices implement Default. You can also create them with a specific seed and ordinal.

Here’s how you might use a device:

let dev: Cpu = Default::default();
let t: Tensor<Rank2<2, 3>, f32, _> = dev.zeros();
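
To illustrate the "device as allocator" idea without depending on dfdx, here is a toy sketch. ToyCpu and ToyTensor are hypothetical names invented for this example; they only mimic the pattern above, where the device hands out storage and the requested type determines the shape.

```rust
// Toy device: its only job here is to allocate tensor storage.
// `ToyCpu` and `ToyTensor` are hypothetical and not part of dfdx.

/// Heap-backed storage plus a compile-time 2d shape.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyTensor<const M: usize, const N: usize> {
    data: Vec<f32>, // row-major, length M * N
}

impl<const M: usize, const N: usize> ToyTensor<M, N> {
    pub fn shape(&self) -> (usize, usize) {
        (M, N)
    }

    pub fn as_slice(&self) -> &[f32] {
        &self.data
    }
}

/// A CPU "device" that hands out heap allocations.
#[derive(Debug, Default, Clone)]
pub struct ToyCpu;

impl ToyCpu {
    /// Allocates a zero-filled tensor; the shape comes from the requested type.
    pub fn zeros<const M: usize, const N: usize>(&self) -> ToyTensor<M, N> {
        ToyTensor { data: vec![0.0; M * N] }
    }
}

/// Mirrors the two-line usage pattern shown above.
pub fn demo() -> ToyTensor<2, 3> {
    let dev: ToyCpu = Default::default();
    let t: ToyTensor<2, 3> = dev.zeros(); // M and N inferred from the annotation
    t
}
```

As in the dfdx snippet, the call site never passes a shape argument: the type annotation on the left-hand side is enough for the compiler to pick the dimensions.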

Tensor Operations (tip of the iceberg)

See tensor_ops for more information.

Once you’ve instantiated tensors with a device, you can start doing operations on them! There are many operations; here are a few core ones and how they relate to their numpy/pytorch counterparts:

Operation                 | dfdx                       | numpy    | pytorch
Unary Operations          | a.sqrt()                   | a.sqrt() | a.sqrt()
Binary Operations         | a + b                      | a + b    | a + b
gemm/gemv                 | tensor_ops::matmul         | a @ b    | a @ b
2d Convolution            | tensor_ops::TryConv2D      | -        | torch.conv2d
2d Transposed Convolution | tensor_ops::TryConvTrans2D | -        | torch.conv_transpose2d

and much much more!
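
To show what compile-time shape tracking buys you for an operation like matmul, here is a small sketch using plain Rust arrays and const generics. It is not how tensor_ops::matmul is implemented; the matmul and demo functions below are standalone illustrations of the idea that mismatched inner dimensions become a type error.

```rust
/// Multiplies an MxK matrix by a KxN matrix.
/// The shared dimension K appears in both argument types, so the compiler
/// rejects any call where the inner dimensions disagree.
pub fn matmul<const M: usize, const K: usize, const N: usize>(
    a: &[[f32; K]; M],
    b: &[[f32; N]; K],
) -> [[f32; N]; M] {
    let mut out = [[0.0f32; N]; M];
    for i in 0..M {
        for j in 0..N {
            for k in 0..K {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

pub fn demo() -> [[f32; 2]; 2] {
    let a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]; // 2x3
    let b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]; // 3x2
    // `matmul(&a, &a)` would not compile: `a` has 3 columns, so the second
    // argument needs 3 rows, but `a` only has 2.
    matmul(&a, &b)
}
```

The output shape [[f32; N]; M] is likewise derived by the compiler, so downstream code that expects a different shape fails to build rather than failing at run time.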

Neural networks

See nn for more information.

Neural networks are composed of building blocks that you can chain together. In dfdx, sequential neural networks are represented by tuples! For example, each dfdx network below is equivalent to the pytorch network next to it:

dfdx                                       | pytorch
(Linear<3, 5>, ReLU, Linear<5, 10>)        | nn.Sequential(nn.Linear(3, 5), nn.ReLU(), nn.Linear(5, 10))
((Conv2D<3, 2, 1>, Tanh), Conv2D<3, 2, 1>) | nn.Sequential(nn.Sequential(nn.Conv2d(3, 2, 1), nn.Tanh()), nn.Conv2d(3, 2, 1))

To build a neural network, you of course need a device:

let dev: Cpu = Default::default();
type Model = (Linear<3, 5>, ReLU, Linear<5, 10>);
let model = dev.build_module::<Model, f32>();

Note two things:

  1. We are using nn::DeviceBuildExt to instantiate the model
  2. We need to pass a dtype (in this case f32) to create the model.

You can then pass tensors into the model with nn::Module::forward():

// tensor with runtime batch dimension of 10
let x: Tensor<(usize, Const<3>), f32, _> = dev.sample_normal_like(&(10, Const));
let y = model.forward(x);
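
The tuples-as-sequential-networks idea can be sketched with a toy trait. ToyModule, ToyScale and ToyRelu are hypothetical names made up for this example (they are not dfdx types), and only 2-tuples are implemented; longer chains are built here by nesting.

```rust
/// Toy counterpart of a module: maps an input to an output.
pub trait ToyModule<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Multiplies every element by a fixed factor.
pub struct ToyScale(pub f32);

/// Clamps negative elements to zero.
pub struct ToyRelu;

impl<const N: usize> ToyModule<[f32; N]> for ToyScale {
    type Output = [f32; N];
    fn forward(&self, input: [f32; N]) -> [f32; N] {
        input.map(|v| v * self.0)
    }
}

impl<const N: usize> ToyModule<[f32; N]> for ToyRelu {
    type Output = [f32; N];
    fn forward(&self, input: [f32; N]) -> [f32; N] {
        input.map(|v| v.max(0.0))
    }
}

/// A 2-tuple runs its members in order. The bound on `B` forces the output
/// type of `A` to be a valid input type for `B`, checked at compile time.
impl<Input, A, B> ToyModule<Input> for (A, B)
where
    A: ToyModule<Input>,
    B: ToyModule<A::Output>,
{
    type Output = B::Output;
    fn forward(&self, input: Input) -> Self::Output {
        self.1.forward(self.0.forward(input))
    }
}

pub fn demo() -> [f32; 3] {
    // Nested tuples compose as well: ((scale, relu), scale).
    let model = ((ToyScale(2.0), ToyRelu), ToyScale(0.5));
    let x: [f32; 3] = [-1.0, 0.0, 3.0];
    model.forward(x)
}
```

With this design a network is an ordinary value whose type spells out its architecture, which is what lets the compiler check that adjacent layers agree on shapes.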

Optimizers and Gradients

See optim for more information.

dfdx supports a number of the standard optimizers:

Optimizer | dfdx                                             | pytorch
AdamW     | optim::Adam with optim::WeightDecay::Decoupled   | torch.optim.AdamW

You can use optimizers to optimize neural networks (or even tensors!). Here’s a simple example of how to do this with nn::ZeroGrads:

// a device, created as in the earlier examples
let dev: Cpu = Default::default();
type Model = (Linear<3, 5>, ReLU, Linear<5, 10>);
let mut model = dev.build_module::<Model, f32>();
// 1. allocate gradients for the model
let mut grads = model.alloc_grads();
// 2. create our optimizer
let mut opt = Sgd::new(&model, Default::default());
// 3. trace gradients through forward pass
let x: Tensor<Rank2<10, 3>, f32, _> = dev.sample_normal();
let y = model.forward_mut(x.traced(grads));
// 4. compute loss & run backpropagation
let loss = y.square().mean();
grads = loss.backward();
// 5. apply gradients
opt.update(&mut model, &grads);
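
To see what the steps above compute, here is a hand-written, std-only sketch of one SGD step on a one-parameter model. ToyModel, ToySgd, loss_and_grad and train_one_step are hypothetical names for this illustration, and the gradient is derived by hand instead of being traced automatically as dfdx does.

```rust
/// One-parameter model: y = w * x.
pub struct ToyModel {
    pub w: f32,
}

/// Plain SGD with a fixed learning rate (no momentum, no weight decay).
pub struct ToySgd {
    pub lr: f32,
}

impl ToySgd {
    /// Applies w <- w - lr * dL/dw.
    pub fn update(&self, model: &mut ToyModel, grad_w: f32) {
        model.w -= self.lr * grad_w;
    }
}

/// Forward pass, loss and hand-derived backward pass for
/// loss = mean((w * x)^2). Returns (loss, dL/dw).
pub fn loss_and_grad(model: &ToyModel, xs: &[f32]) -> (f32, f32) {
    let n = xs.len() as f32;
    let mut loss = 0.0;
    let mut grad_w = 0.0;
    for &x in xs {
        let y = model.w * x; // forward pass
        loss += y * y / n; // square, then mean
        grad_w += 2.0 * y * x / n; // chain rule: d(y^2)/dw = 2 * y * x
    }
    (loss, grad_w)
}

/// Runs one optimization step and returns the loss before and after it.
pub fn train_one_step() -> (f32, f32) {
    let mut model = ToyModel { w: 1.0 };
    let opt = ToySgd { lr: 0.1 };
    let xs = [1.0f32, 2.0];
    let (loss_before, grad_w) = loss_and_grad(&model, &xs);
    opt.update(&mut model, grad_w);
    let (loss_after, _) = loss_and_grad(&model, &xs);
    (loss_before, loss_after)
}
```

The structure matches the numbered comments in the dfdx example: compute the output, reduce it to a scalar loss, obtain gradients, then let the optimizer mutate the model. Here the loss should drop after the step because the learning rate is small relative to the gradient.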