
Module neuronika::nn

Basic building blocks for neural networks.

Neuronika provides some pre-assembled components; you can either use them individually or combine them into a bigger architecture. Take a look at the complete list below to learn more.

You can also customize the initialization of the parameters of such components, and that of any other differentiable variable, by picking the function that best fits your needs from the nn::init module.

Refer to the nn::loss module for loss functions.
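As a quick, hedged illustration of how a loss from that module might slot into a workflow, the sketch below assumes nn::loss exposes an MSE loss along the lines of mse_loss(prediction, target, reduction) together with a Reduction::Mean variant; consult the nn::loss documentation for the exact items and signatures.

use neuronika::nn;

// Hypothetical regression setup: one linear layer and random targets.
let lin = nn::Linear::new(10, 1);
let input = neuronika::rand((8, 10));
let target = neuronika::rand((8, 1));

let prediction = lin.forward(input);

// Assumed API: an MSE loss taking a reduction strategy.
let loss = nn::loss::mse_loss(prediction, target, nn::loss::Reduction::Mean);
loss.forward();     // Compute the loss value.
loss.backward(1.0); // Seed the backward pass with a unit gradient.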

Assembling a neural network

The suggested way of building a model using neuronika’s building blocks is to define a struct encapsulating its components.

The behavior of the model should be defined by including an appropriate method in its struct implementation. Such a method must specify how the components interact.

Consider, for the sake of simplicity, a classic multilayer perceptron with three dense layers for a multivariate regression task. Let’s see what it would look like in neuronika.

We begin by defining its struct using the provided components.

use neuronika::nn;

// Network definition.
struct NeuralNetwork {
    lin1: nn::Linear,
    lin2: nn::Linear,
    lin3: nn::Linear,     
}

We’ll also include a very simple constructor.

impl NeuralNetwork {
    // Basic constructor.
    fn new() -> Self {
        Self {
            lin1: nn::Linear::new(25, 30),
            lin2: nn::Linear::new(30, 35),
            lin3: nn::Linear::new(35, 5),
        }
    }
}

As the last step, we specify how the multilayer perceptron behaves; then we’re done.

use ndarray::Ix2;
use neuronika::{Backward, Data, Forward, Gradient, MatMatMulT, Overwrite, VarDiff};
use neuronika::nn::Learnable;

impl NeuralNetwork {
    // NeuralNetwork behavior. Notice the presence of the ReLU non-linearity.
    fn forward<I, T, U>(
        &self,
        input: I,
    ) -> VarDiff<
            impl Data<Dim = Ix2> + Forward,
            impl Gradient<Dim = Ix2> + Overwrite + Backward
        >
    where
        I: MatMatMulT<Learnable<Ix2>>,
        I::Output: Into<VarDiff<T, U>>,
        T: Data<Dim = Ix2> + Forward,
        U: Gradient<Dim = Ix2> + Backward + Overwrite,
    {
        let out1 = self.lin1.forward(input).relu();
        let out2 = self.lin2.forward(out1).relu();
        let out3 = self.lin3.forward(out2);
        out3
    }
}

Here’s a fictitious example of the newly created multilayer perceptron in use.

let model = NeuralNetwork::new();

// Random data to be given in input to the model.
let fictitious_data = neuronika::rand((200, 25));

let out = model.forward(fictitious_data);
out.forward(); // Always remember to call forward() !
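Since the input batch has shape (200, 25) and the last layer maps to 5 features, the output is a (200, 5) variable. Assuming the usual data() accessor for reading a variable’s value, this can be checked directly:

// The network maps a (200, 25) batch to a (200, 5) output.
assert_eq!(out.data().shape(), &[200, 5]);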

Tracking parameters with ModelStatus

In some circumstances you may find it useful to group the parameters of a model. Consider, for instance, the following scenario.

let model = NeuralNetwork::new();

let some_other_variable = neuronika::rand((1, 25)).requires_grad();

// Random perturbed data.
let fictitious_data = neuronika::rand((200, 25)) + some_other_variable;

let out = model.forward(fictitious_data);
assert_eq!(out.parameters().len(), 7); // 7 leaf ancestors !

You may notice that if we feed our neural network the result of an addition in which one of the operands is a differentiable variable, and then request the network output’s differentiable ancestors, we are given a vector containing 7 Param.

Doing some quick math, 7 = 2 * 3 + 1: each of the three linear layers the multilayer perceptron is made of has one learnable weight matrix and one learnable bias vector, so the seventh ancestor must be due to the addition between fictitious_data and some_other_variable.
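The two-parameters-per-layer count can be verified in isolation; here is a minimal sketch using a single Linear layer fed with a non-differentiable input:

use neuronika::nn;

let lin = nn::Linear::new(25, 30);

// A non-differentiable input contributes no leaves of its own.
let x = neuronika::rand((1, 25));

let y = lin.forward(x);
assert_eq!(y.parameters().len(), 2); // weight matrix + bias vector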

In fact, neuronika automatically tracks all the differentiable leaves that are involved in the computation of the output variable when assembling the computational graph corresponding to the issued operations.
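This tracking is not specific to models: any differentiable leaf entering the graph ends up among the output’s parameters. A minimal sketch:

let a = neuronika::rand((5, 5)).requires_grad();
let b = neuronika::rand((5, 5)).requires_grad();

// Both differentiable leaves are tracked by the resulting variable.
let sum = a + b;
assert_eq!(sum.parameters().len(), 2);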

If you need to distinguish the parameters of a model from other differentiable variables, or to tell apart the parameters of several different models, you can use ModelStatus.

With ModelStatus you can build the exact same neural network while varying the implementation only slightly.

 use neuronika::Param;
 use neuronika::nn::{ModelStatus, Linear};

 struct NeuralNetwork {
    lin1: Linear,
    lin2: Linear,
    lin3: Linear,
    status: ModelStatus,
 }

 impl NeuralNetwork {
     fn new() -> Self {
         // Initialize an empty model status.
         let mut status = ModelStatus::default();
          
         // We register each component while building the network.
         Self {
             lin1: status.register(Linear::new(25, 30)),
             lin2: status.register(Linear::new(30, 35)),
             lin3: status.register(Linear::new(35, 5)),
             status,
         }
     }
      
     /// Returns the model's parameters.
     fn parameters(&self) -> Vec<Param> {
         // We are now able to access the parameters of the neural network.
         self.status.parameters()
     }
 }

Finally, we verify that the number of registered parameters for the new version of our neural network is indeed 6.

let model = NeuralNetwork::new();
assert_eq!(model.parameters().len(), 6);

Also note that, despite the introduction of ModelStatus, the implementation of the .forward() method has not changed at all.
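Assuming the forward method from the previous section is carried over to this version unchanged, the two notions of parameters can now be told apart: the output still tracks every differentiable leaf it depends on, while the model reports only its own registered ones.

let model = NeuralNetwork::new();
let some_other_variable = neuronika::rand((1, 25)).requires_grad();

let fictitious_data = neuronika::rand((200, 25)) + some_other_variable;
let out = model.forward(fictitious_data);

// The graph tracks 7 differentiable leaves...
assert_eq!(out.parameters().len(), 7);
// ...while the model itself owns only 6 of them.
assert_eq!(model.parameters().len(), 6);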

Train and Eval

The status of a model determines the behavior of its components. Certain building blocks, such as Dropout, are turned on and off depending on whether the model is running in training mode or in inference mode.

You can set a network in training mode or in inference mode either by calling .train() and .eval() directly on its output or by using ModelStatus.

The former approach is preferable: when multiple models are pipelined, calling .train() and .eval() directly on the final output switches the statuses of all the models at once. Note, however, that switching the status through ModelStatus is the only way to selectively train and evaluate multiple models.

Let’s picture it with a simple example.

 use neuronika::Param;
 use neuronika::nn::{ModelStatus, Linear, Dropout};

 struct NeuralNetwork {
    lin1: Linear,
    drop: Dropout,
    lin2: Linear,
    status: ModelStatus,     
 }

 impl NeuralNetwork {
     fn new() -> Self {
         let mut status = ModelStatus::default();
          
         // As before, we register the components with the network's status.
         // The dropout layer, and every other switchable component, can now
         // be controlled directly through the model itself, since its status
         // is kept in sync with ModelStatus.
         Self {
             lin1: status.register(Linear::new(25, 35)),
             drop: status.register(Dropout::new(0.5)),
             lin2: status.register(Linear::new(35, 5)),
             status,
         }
     }

     fn parameters(&self) -> Vec<Param> {
         self.status.parameters()
     }
      
     /// Switches the network to training mode.
     fn train(&self) {
         self.status.train()
     }
      
     /// Switches the network to inference mode.
     fn eval(&self) {
         self.status.eval()
     }
 }
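Here’s how the two switches can be used; only the behavior of the registered Dropout changes, not the network’s structure.

let model = NeuralNetwork::new();

// Training mode: dropout randomly zeroes part of its input.
model.train();

// Inference mode: dropout is turned off.
model.eval();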

Layers

All of neuronika’s building blocks are listed here.

Linear Layers

  • nn::Linear - Applies a linear transformation to the incoming data.

Recurrent Layers

  • nn::GRUCell - A gated recurrent unit (GRU) cell.

  • nn::LSTMCell - A long short-term memory (LSTM) cell.

Convolution Layers

  • nn::Conv1d - Applies a temporal convolution over an input signal composed of several input planes.

  • nn::GroupedConv1d - Applies a grouped temporal convolution over an input signal composed of several input planes.

  • nn::Conv2d - Applies a spatial convolution over an input signal composed of several input planes.

  • nn::GroupedConv2d - Applies a grouped spatial convolution over an input signal composed of several input planes.

  • nn::Conv3d - Applies a volumetric convolution over an input signal composed of several input planes.

  • nn::GroupedConv3d - Applies a grouped volumetric convolution over an input signal composed of several input planes.

Dropout Layers

  • nn::Dropout - During training, randomly zeroes some of the elements of the input variable with probability p using samples from a Bernoulli distribution.

Modules

  • init - Layers’ parameters initialization functions.

  • loss - Loss functions.

Structs

  • Constant - Constant padding.

  • Conv1d - Applies a temporal convolution over an input signal composed of several input planes.

  • Conv2d - Applies a spatial convolution over an input signal composed of several input planes.

  • Conv3d - Applies a volumetric convolution over an input signal composed of several input planes.

  • Dropout - During training, randomly zeroes some of the elements of self with probability p using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.

  • GRUCell - A gated recurrent unit (GRU) cell.

  • GroupedConv1d - Applies a grouped temporal convolution over an input signal composed of several input planes.

  • GroupedConv2d - Applies a grouped spatial convolution over an input signal composed of several input planes.

  • GroupedConv3d - Applies a grouped volumetric convolution over an input signal composed of several input planes.

  • LSTMCell - A long short-term memory (LSTM) cell.

  • Linear - Applies a linear transformation to the incoming data.

  • ModelStatus - A model’s components status.

  • Reflective - Reflective padding.

  • Replicative - Replicative padding.

  • Zero - Zero padding.

Traits

  • Dropout input.

  • Padding modes logic.

  • Registration for neuronika’s components.

Type Definitions

  • Learnable - A generic parameter of a neural component.