High level neural network building blocks such as Linear, activations, and tuples as Modules.
Also includes .save() & .load() for all Modules.
Initializing
All modules implement Default, which initializes all parameters to 0.0. The intention is to then
call ResetParams::reset_params(), which randomizes the parameters:
let mut model: Linear<5, 2> = Default::default(); // set all params to 0
model.reset_params(&mut rng); // randomize weights
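A minimal runnable sketch of this pattern, assuming the dfdx prelude and the rand crate's StdRng (any rand::Rng should work):

use dfdx::prelude::*;
use rand::{rngs::StdRng, SeedableRng};

fn main() {
    let mut rng = StdRng::seed_from_u64(0);
    let mut model: Linear<5, 2> = Default::default(); // all params start at 0.0
    model.reset_params(&mut rng); // randomize weights
}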
Sequential models
Tuples implement Module, so you can string multiple modules together.
Here’s a single-layer MLP:
type Mlp = (Linear<5, 3>, ReLU, Linear<3, 2>);
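As a hedged sketch, a forward pass through this type might look like the following (assuming the prelude exposes Tensor1D and a zeros() constructor):

use dfdx::prelude::*;
use rand::{rngs::StdRng, SeedableRng};

type Mlp = (Linear<5, 3>, ReLU, Linear<3, 2>);

fn main() {
    let mut rng = StdRng::seed_from_u64(0);
    let mut model: Mlp = Default::default();
    model.reset_params(&mut rng);

    let x: Tensor1D<5> = Tensor1D::zeros(); // 5-element input vector
    let _y: Tensor1D<2> = model.forward(x); // 2-element output
}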
Here’s a more complex feedforward network that takes vectors of 5 elements and maps them to vectors of 2 elements.
type ComplexNetwork = (
    DropoutOneIn<2>,  // 1. dropout 50% of input
    Linear<5, 3>,     // 2. pass into a linear layer
    LayerNorm1D<3>,   // 3. normalize elements
    ReLU,             // 4. activate with relu
    Residual<(        // 5. residual connection that adds input to the result of its sub-layers
        Linear<3, 3>, // 5.a. apply a linear layer
        ReLU,         // 5.b. apply relu
    )>,               // 5.c. the input to the residual is added back in after the sub-layers
    Linear<3, 2>,     // 6. apply another linear layer
);
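A hedged usage sketch, assuming a randn() constructor and that trace() attaches an OwnedTape (which is what makes the dropout layer active; see the note on dropout below):

use dfdx::prelude::*;
use rand::{rngs::StdRng, SeedableRng};

fn main() {
    let mut rng = StdRng::seed_from_u64(0);
    let mut model: ComplexNetwork = Default::default();
    model.reset_params(&mut rng);

    let x: Tensor1D<5> = Tensor1D::randn(&mut rng);
    let _y = model.forward(x.trace()); // Tensor1D<2, OwnedTape>, ready for backprop
}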
Saving and Loading
Use the SaveToNpz::save() and LoadFromNpz::load() traits. All modules provided here implement them,
including tuples. These all save to/from .npz files, which are basically zip files with multiple .npy files.
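For instance, a hedged sketch of a save/load round trip, assuming both traits are in scope via the prelude and return Results:

use dfdx::prelude::*;

fn main() {
    let model: (Linear<5, 3>, ReLU, Linear<3, 2>) = Default::default();
    model.save("dfdx-model.npz").expect("failed to save");

    let mut restored: (Linear<5, 3>, ReLU, Linear<3, 2>) = Default::default();
    restored.load("dfdx-model.npz").expect("failed to load");
}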
This is implemented to be fairly portable. For example, you can load a simple MLP into PyTorch like so:
import torch
import numpy as np

# `mlp` is a torch.nn.Module whose parameter names match the saved keys
state_dict = {k: torch.from_numpy(v) for k, v in np.load("dfdx-model.npz").items()}
mlp.load_state_dict(state_dict)
Structs
A Module that calls dropout() during its forward pass with probability self.p.
A Module that calls dropout() during its forward pass with probability 1.0 / N.
Note that dropout() does not do anything for tensors with NoneTape; see the sketch after this list.
Implements layer normalization as described in Layer Normalization.
A linear transformation of the form weight * x + bias, where weight is a matrix, x is a vector or matrix, and bias is a vector.
Repeats T N times. This requires that T’s input type is the same as its output type.
A residual connection around F: F(x) + x, as introduced in Deep Residual Learning for Image Recognition.
Splits input into multiple heads. T should be a tuple, where every element of the tuple accepts the same input type.
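As referenced in the dropout entries above, here is a hedged sketch of the tape-dependent behavior, assuming a ones() constructor and that trace() attaches an OwnedTape:

use dfdx::prelude::*;

fn main() {
    let dropout: DropoutOneIn<2> = Default::default();
    let x_train: Tensor1D<5> = Tensor1D::ones();
    let x_eval: Tensor1D<5> = Tensor1D::ones();

    // OwnedTape: dropout is active; roughly 1 in 2 elements are zeroed
    // and the survivors are scaled by 2 to preserve the expected value.
    let _train_out = dropout.forward(x_train.trace());

    // NoneTape: dropout is a no-op and values pass through unchanged.
    let _eval_out = dropout.forward(x_eval);
}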
Enums
Error that can happen while loading data from a .npz zip archive.
Traits
Something that can be loaded from a .npz file (which is a zip file).
A unit of a neural network. Acts on the generic Input and produces Module::Output.
Something that can reset its parameters.
Something that can be saved to a .npz file (which is a .zip file).
Functions
Reads data from a file already in a zip archive named filename.
Writes data to a new file in a zip archive named filename.