Module dfdx::nn

High-level neural network building blocks such as Linear, activations, and tuples as Modules. Also includes .save() & .load() for all Modules.

Initializing

All modules implement Default, which initializes all parameters to 0.0. The intent is that you then call ResetParams::reset_params(), which randomizes the parameters:

let mut rng = StdRng::seed_from_u64(0); // StdRng & SeedableRng come from the `rand` crate
let mut model: Linear<5, 2> = Default::default(); // set all params to 0.0
model.reset_params(&mut rng); // randomize weights

Sequential models

Tuples implement Module, so you can string multiple modules together.

Here’s an MLP with a single hidden layer:

type Mlp = (Linear<5, 3>, ReLU, Linear<3, 2>);
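
A tuple network is used like any other Module. Here is a minimal sketch of a forward pass; Tensor1D::zeros() and StdRng::seed_from_u64() are assumptions based on dfdx’s prelude and the rand crate, and exact constructors may differ across versions:

let mut rng = StdRng::seed_from_u64(0);
let mut mlp: Mlp = Default::default();
mlp.reset_params(&mut rng); // randomize all weights & biases

let x: Tensor1D<5> = Tensor1D::zeros(); // a 5-element input vector
let y: Tensor1D<2> = mlp.forward(x);    // runs Linear -> ReLU -> Linear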

Here’s a more complex feedforward network that takes vectors of 5 elements and maps them to 2 elements.

type ComplexNetwork = (
    DropoutOneIn<2>, // 1. dropout 50% of the input
    Linear<5, 3>,    // 2. pass into a linear layer
    LayerNorm1D<3>,  // 3. normalize elements
    ReLU,            // 4. activate with ReLU
    Residual<(       // 5. residual connection that adds the input to the output of its sublayers
        Linear<3, 3>,// 5.a. apply a linear layer
        ReLU,        // 5.b. apply ReLU
    )>,              // 5.c. the input to the residual is added back in after the sublayers
    Linear<3, 2>,    // 6. apply another linear layer
);
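
Note that dropout is only applied when the input carries a gradient tape (see DropoutOneIn below), so during training you would trace the input first. A hedged sketch, where Tensor1D::randn() and trace() are assumptions based on dfdx’s tensor API:

let mut net: ComplexNetwork = Default::default();
net.reset_params(&mut rng);

let x: Tensor1D<5> = Tensor1D::randn(&mut rng);
let y = net.forward(x.trace()); // traced input carries an OwnedTape, so dropout is active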

Saving and Loading

Use the SaveToNpz and LoadFromNpz traits, which provide save() and load(). All modules provided here implement them, including tuples. These save to / load from .npz files, which are basically zip archives containing multiple .npy files.
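
For instance, a minimal sketch of a save/load round trip for the Mlp defined above (both methods return Results, handled here with expect()):

let mlp: Mlp = Default::default();
mlp.save("dfdx-model.npz").expect("failed to save");

let mut restored: Mlp = Default::default();
restored.load("dfdx-model.npz").expect("failed to load");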

This is implemented to be fairly portable. For example, you can load a simple MLP into PyTorch like so:

import numpy as np
import torch

# `mlp` must be a torch.nn.Module whose parameter names match the
# keys stored in the .npz archive
state_dict = {k: torch.from_numpy(v) for k, v in np.load("dfdx-model.npz").items()}
mlp.load_state_dict(state_dict)

Structs

Abs: Unit struct that impls Module as calling abs() on input.

Cos: Unit struct that impls Module as calling cos() on input.

Dropout: A Module that calls dropout() in Module::forward() with probability self.p.

DropoutOneIn: A Module that calls dropout() in Module::forward() with probability 1.0 / N. Note that dropout() does not do anything for tensors with NoneTape.

Exp: Unit struct that impls Module as calling exp() on input.

LayerNorm1D: Implements layer normalization as described in Layer Normalization.

Linear: A linear transformation of the form weight * x + bias, where weight is a matrix, x is a vector or matrix, and bias is a vector.

Ln: Unit struct that impls Module as calling ln() on input.

ReLU: Unit struct that impls Module as calling relu() on input.

Repeated: Repeats T N times. This requires that T’s input is the same as its output.

Residual: A residual connection around F: F(x) + x, as introduced in Deep Residual Learning for Image Recognition.

Sigmoid: Unit struct that impls Module as calling sigmoid() on input.

Sin: Unit struct that impls Module as calling sin() on input.

SplitInto: Splits input into multiple heads. T should be a tuple where every element of the tuple accepts the same input type.

Sqrt: Unit struct that impls Module as calling sqrt() on input.

Square: Unit struct that impls Module as calling square() on input.

Tanh: Unit struct that impls Module as calling tanh() on input.

Enums

NpzError: Error that can happen while loading data from a .npz zip archive.

Traits

LoadFromNpz: Something that can be loaded from a .npz file (which is a zip file).

Module: A unit of a neural network. Acts on the generic Input and produces Module::Output.

ResetParams: Something that can reset its parameters.

SaveToNpz: Something that can be saved to a .npz file (which is a zip file).

Functions

npz_fread: Reads data from the file named filename inside an existing zip archive.

npz_fwrite: Writes data to a new file named filename in a zip archive.