Crate dfdx

dfdx is a cuda accelerated tensor and neural network library, written entirely in rust!

Additionally, it can track compile time shapes across tensor operations, ensuring that all your neural networks are checked at compile time.

The following sections provide some high level core concepts & examples, and there is more detailed documentation in each of dfdx’s submodules.

See feature_flags for details on feature flags.

Shapes & Tensors

See dtypes, shapes, and tensor for more information.

At its core a tensor::Tensor is just an nd-array. Just like rust arrays, tensors have two parts:

  1. Shape (shapes)
  2. Dtype (dtypes)

dfdx represents shapes as tuples of dimensions (shapes::Dim), where each dimension can be known either at:

  1. Compile time shapes::Const<M>
  2. Run time usize

You can freely mix and match these dimensions together. Here are some example shapes:

  • () - unit shape
  • (usize,) - 1d shape with a runtime known dimension
  • (usize, Const<5>) - 2d shape with both types of dimensions
  • (Const<3>, usize, Const<5>) - 3d shape!
  • Rank3<3, 5, 7> - Equivalent to (Const<3>, Const<5>, Const<7>)
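
Shapes are plain values as well as types, so you can construct one directly. Here is a small sketch of a mixed shape (the same (10, Const) style appears in the forward pass example later on):

let batch_size: usize = 10;
let shape: (usize, Const<5>) = (batch_size, Const);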

Here are some comparisons between representing nd arrays in rust vs dfdx:

| rust array | dfdx Tensor |
| --- | --- |
| f32 | Tensor<(), f32, …> |
| [u32; 5] | Tensor<Rank1<5>, u32, …> |
| [[u8; 3]; 2] | Tensor<Rank2<2, 3>, u8, …> |
| Vec<[bool; 5]> | Tensor<(usize, Const<5>), bool, …> |

The Rank1, Rank2 shapes used above are actually type aliases for shapes where all dimensions are known at compile time. Roughly, they are defined in shapes like this (a sketch; see shapes for the actual definitions):
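
// a sketch of the Rank aliases; see shapes for the real definitions
pub type Rank0 = ();
pub type Rank1<const M: usize> = (Const<M>,);
pub type Rank2<const M: usize, const N: usize> = (Const<M>, Const<N>);
pub type Rank3<const M: usize, const N: usize, const O: usize> = (Const<M>, Const<N>, Const<O>);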

Allocating tensors with Devices

See tensor for more information.

Devices are used to allocate tensors (and neural networks!). They are akin to std::alloc::GlobalAlloc in rust - they just allocate memory. They are also used to execute tensor ops, which we will get to later on.

There are two options for this currently, with more planned to be added in the future:

  1. tensor::Cpu - for tensors stored on the heap
  2. tensor::Cuda - for tensors stored in GPU memory

Both devices implement Default; you can also create them with a specific seed and device ordinal.

Here’s how you might use a device:

let dev: Cpu = Default::default();
let t: Tensor<Rank2<2, 3>, f32, _> = dev.zeros();
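
Devices also provide other ways to allocate tensors. A brief sketch of a few that appear throughout these docs (filling with ones, random sampling, and copying from a rust array):

// continuing from the dev created above
let ones: Tensor<Rank2<2, 3>, f32, _> = dev.ones();
let random: Tensor<Rank2<2, 3>, f32, _> = dev.sample_normal();
let from_array = dev.tensor([1.0f32, 2.0, 3.0]);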

Tensor Operations (tip of the iceberg)

See tensor_ops for more information.

Once you’ve instantiated tensors with a device, you can start doing operations on them! There are many, many operations; here are a few core ones and how they relate to numpy/pytorch:

| Operation | dfdx | numpy | pytorch |
| --- | --- | --- | --- |
| Unary Operations | a.sqrt() | a.sqrt() | a.sqrt() |
| Binary Operations | a + b | a + b | a + b |
| gemm/gemv | tensor_ops::matmul | a @ b | a @ b |
| 2d Convolution | tensor_ops::TryConv2D | - | torch.conv2d |
| 2d Transposed Convolution | tensor_ops::TryConvTrans2D | - | torch.conv_transpose2d |
| Slicing | tensor_ops::slice | a[...] | a[...] |
| Select | tensor_ops::SelectTo | a[...] | torch.select |
| Gather | tensor_ops::GatherTo | np.take | torch.gather |
| Broadcasting | tensor_ops::BroadcastTo | implicit/np.broadcast | implicit/torch.broadcast_to |
| Permute | tensor_ops::PermuteTo | np.transpose(...) | torch.permute |
| Where | tensor_ops::ChooseFrom | np.where | torch.where |
| Reshape | tensor_ops::ReshapeTo | np.reshape(shape) | a.reshape(shape) |
| View | tensor_ops::ReshapeTo | np.view(...) | a.view(...) |
| Roll | tensor_ops::Roll | np.rollaxis(...) | a.roll(...) |
| Stack | tensor_ops::TryStack | np.stack | torch.stack |
| Concat | tensor_ops::TryConcat | np.concatenate | torch.concat |

and much much more!
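
For example, here is a minimal sketch combining a few of the operations above (assuming the dfdx prelude is imported and using a Cpu device):

let dev: Cpu = Default::default();
let a: Tensor<Rank2<2, 3>, f32, _> = dev.sample_uniform();
let b: Tensor<Rank2<2, 3>, f32, _> = dev.sample_uniform();
// unary and binary ops consume their inputs, so clone to reuse a
let root = a.clone().sqrt();
let sum = a.clone() + b;
// gemm: (2, 3) x (3, 4) -> (2, 4), shape-checked at compile time
let w: Tensor<Rank2<3, 4>, f32, _> = dev.sample_uniform();
let prod = a.matmul(w);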

Neural networks

See nn for more information.

Neural networks are composed of building blocks that you can chain together. In dfdx, sequential neural networks are represented by tuples! For example, the following two networks are identical:

| dfdx | pytorch |
| --- | --- |
| (Linear<3, 5>, ReLU, Linear<5, 10>) | nn.Sequential(nn.Linear(3, 5), nn.ReLU(), nn.Linear(5, 10)) |
| ((Conv2D<3, 2, 1>, Tanh), Conv2D<3, 2, 1>) | nn.Sequential(nn.Sequential(nn.Conv2d(3, 2, 1), nn.Tanh()), nn.Conv2d(3, 2, 1)) |

To build a neural network, you of course need a device:

let dev: Cpu = Default::default();
type Model = (Linear<3, 5>, ReLU, Linear<5, 10>);
let model = dev.build_module::<Model, f32>();

Note two things:

  1. We are using nn::DeviceBuildExt to instantiate the model
  2. We need to pass a dtype (in this case f32) to create the model.

You can then pass tensors into the model with nn::Module::forward():

// tensor with runtime batch dimension of 10
let x: Tensor<(usize, Const<3>), f32, _> = dev.sample_normal_like(&(10, Const));
let y = model.forward(x);

Optimizers and Gradients

See optim for more information.

dfdx supports a number of the standard optimizers:

| Optimizer | dfdx | pytorch |
| --- | --- | --- |
| SGD | optim::Sgd | torch.optim.SGD |
| Adam | optim::Adam | torch.optim.Adam |
| AdamW | optim::Adam with optim::WeightDecay::Decoupled | torch.optim.AdamW |
| RMSprop | optim::RMSprop | torch.optim.RMSprop |

You can use optimizers to optimize neural networks (or even tensors!). Here’s a simple example of how to do this with nn::ZeroGrads:

let dev: Cpu = Default::default();
type Model = (Linear<3, 5>, ReLU, Linear<5, 10>);
let mut model = dev.build_module::<Model, f32>();
// 1. allocate gradients for the model
let mut grads = model.alloc_grads();
// 2. create our optimizer
let mut opt = Sgd::new(&model, Default::default());
// 3. trace gradients through forward pass
let x: Tensor<Rank2<10, 3>, f32, _> = dev.sample_normal();
let y = model.forward_mut(x.traced(grads));
// 4. compute loss & run backpropagation
let loss = y.square().mean();
grads = loss.backward();
// 5. apply gradients
opt.update(&mut model, &grads);
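
Since the gradient allocation is reused across training steps, you would typically reset it after each update; a brief sketch using the nn::ZeroGrads trait mentioned above:

// 6. reset gradients for the next iteration
model.zero_grads(&mut grads);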
