High level neural network building blocks such as Linear, activations, and tuples as Modules. Also includes .save() & .load() for all Modules.
Mutable vs Immutable forwards
This is provided as two separate traits:

- ModuleMut::forward_mut(), which receives &mut self.
- Module::forward(), which receives &self.
This has nothing to do with whether gradients are being tracked; it only controls whether the module itself can be modified. Tensors with either OwnedTape or NoneTape can be passed to both methods, and all modules should conform to this expected behavior.
In general, ModuleMut::forward_mut() should be used during training, and Module::forward() during evaluation/testing/inference/validation.
Some existing modules behave differently in these two functions. For example, the dropout layers (Dropout and DropoutOneIn) apply dropout in ModuleMut::forward_mut(), but do nothing in Module::forward(). A sketch of both call styles follows.
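Here is a minimal sketch of the two call styles, assuming dfdx's prelude and the rand crate (the model shape is illustrative, and tensor-constructor names vary slightly between versions):

use dfdx::prelude::*;

let mut rng = rand::thread_rng();

// DropoutOneIn<2> is a no-op in forward(), but drops elements in forward_mut().
let mut model: (Linear<4, 2>, DropoutOneIn<2>) = Default::default();
model.reset_params(&mut rng);

let x: Tensor1D<4> = TensorCreator::randn(&mut rng);

// Training: forward_mut() receives &mut self, so dropout can mutate its rng state.
let train_y: Tensor1D<2> = model.forward_mut(x.clone());

// Evaluation: forward() receives &self; the dropout layer does nothing here.
let eval_y: Tensor1D<2> = model.forward(x);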
Initializing
All modules implement Default, and this initializes all parameters to 0.0. The intention is then to call ResetParams::reset_params(), which randomizes the parameters:
use rand::{rngs::StdRng, SeedableRng};
let mut rng = StdRng::seed_from_u64(0);
let mut model: Linear<5, 2> = Default::default(); // set all params to 0.0
model.reset_params(&mut rng); // randomize weights
Sequential models
Tuples implement Module, so you can string multiple modules together.
Here's an MLP with one hidden layer:
type Mlp = (Linear<5, 3>, ReLU, Linear<3, 2>);
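A tuple applies its elements left to right. As a hedged sketch (the function name is illustrative):

fn eval(model: &Mlp, x: Tensor1D<5>) -> Tensor1D<2> {
    // equivalent to model.2.forward(model.1.forward(model.0.forward(x)))
    model.forward(x)
}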
Here’s a more complex feedforward network that takes vectors of 5 elements and maps them to 2 elements.
type ComplexNetwork = (
    DropoutOneIn<2>, // 1. dropout 50% of input
    Linear<5, 3>,    // 2. pass into a linear layer
    LayerNorm1D<3>,  // 3. normalize elements
    ReLU,            // 4. activate with relu
    Residual<(       // 5. residual connection that adds input to the result of its sub-layers
        Linear<3, 3>, // 5.a. apply linear layer
        ReLU,         // 5.b. apply relu
    )>,              // 5.c. the input to the residual is added back in after the sub-layers
    Linear<3, 2>,    // 6. apply another linear layer
);
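A hedged sketch of running this network in training mode (assuming dfdx's prelude; trace() attaches an OwnedTape for gradient tracking, and forward_mut() is what lets DropoutOneIn actually drop elements):

let mut rng = rand::thread_rng();
let mut model: ComplexNetwork = Default::default();
model.reset_params(&mut rng);

let x: Tensor1D<5> = TensorCreator::randn(&mut rng);
let y: Tensor1D<2, OwnedTape> = model.forward_mut(x.trace());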
Saving and Loading
Use the SaveToNpz::save() and LoadFromNpz::load() traits. All modules provided here implement them, including tuples. These save to/from .npz files, which are basically zip archives containing multiple .npy files.
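For example, a hedged sketch of a save/load round trip, reusing the Mlp type from above (the file name is illustrative):

let model: Mlp = Default::default();
model.save("dfdx-model.npz").expect("failed to save");

let mut loaded: Mlp = Default::default();
loaded.load("dfdx-model.npz").expect("failed to load");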
This is implemented to be fairly portable. For example, you can load a simple MLP into PyTorch like so:

import torch
import numpy as np

# `mlp` must already be defined with parameter names matching the saved .npz keys
state_dict = {k: torch.from_numpy(v) for k, v in np.load("dfdx-model.npz").items()}
mlp.load_state_dict(state_dict)
Structs
GeneralizedResidual: A residual connection R around F: F(x) + R(x), as introduced in Deep Residual Learning for Image Recognition.
Linear: A linear transformation of the form weight * x + bias, where weight is a matrix, x is a vector or matrix, and bias is a vector.
Repeated: Repeats T N times. This requires that T's input is the same as its output.
Residual: A residual connection around F: F(x) + x, as introduced in Deep Residual Learning for Image Recognition.
SplitInto: Splits input into multiple heads. T should be a tuple, where every element of the tuple accepts the same input type.
Enums
NpzError: An error that can happen while loading data from a .npz zip archive.
Traits
LoadFromNpz: Something that can be loaded from a .npz file (which is a zip file).
SaveToNpz: Something that can be saved to a .npz file (which is a .zip).
Functions
npz_fread: Reads data from a file already in a zip archive named filename.
npz_fwrite: Writes data to a new file in a zip archive named filename.
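To make the Residual entry above concrete, a hedged sketch (Residual requires F's output shape to match its input shape):

let mut rng = rand::thread_rng();
let mut model: Residual<Linear<3, 3>> = Default::default();
model.reset_params(&mut rng);

let x: Tensor1D<3> = TensorCreator::randn(&mut rng);
let y: Tensor1D<3> = model.forward(x); // computes linear(x) + x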