MiniNN

A minimalist deep learnig crate for rust.

[!NOTE] This crate is still in development and is not ready for production use.

🔧 Setup

You can add the crate with cargo

cargo add mininn

Alternatively, you can manually add it to your project's Cargo.toml like this:

[dependencies]
mininn = "*" # Change the `*` to the current version

✏️ Quick Start: Solving XOR

For this example we will resolve the classic XOR problem

use ndarray::{array, Array1};

use mininn::prelude::*;

fn main() -> NNResult<()> {
    let train_data = array![
        [0.0, 0.0],
        [0.0, 1.0],
        [1.0, 0.0],
        [1.0, 1.0],
    ];

    let labels = array![[0.0], [1.0], [1.0], [0.0],];

    // Create the neural network
    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(Act::Tanh))?
        .add(Dense::new(3, 1).apply(Act::Tanh))?;

    // Set the training configuration
    let train_config = TrainConfig::new()
        .with_epochs(200)
        .with_cost(Cost::MSE)
        .with_learning_rate(0.1)
        .with_batch_size(2)
        .with_verbose(true);

    // Train the neural network
    let loss = nn.train(train_data.view(), labels.view(), train_config)?;

    println!("Predictions:\n");

    let predictions: Array1<f32> = train_data
        .rows()
        .into_iter()
        .map(|input| {
            let pred = nn.predict(input.view()).unwrap();
            let out = if pred[0] >= 0.9 { 1.0 } else { 0.0 };
            println!("{} --> {}", input, out);
            out
        })
        .collect();

    // Calc metrics using MetricsCalculator
    let metrics = MetricsCalculator::new(labels.view(), predictions.view());

    println!("\nConfusion matrix:\n{}\n", metrics.confusion_matrix());

    println!(
        "Accuracy: {}\nRecall: {}\nPrecision: {}\nF1: {}\nLoss: {}",
        metrics.accuracy(),
        metrics.recall(),
        metrics.precision(),
        metrics.f1_score(),
        loss
    );

    // Save the model into a HDF5 file
    match nn.save("model.h5") {
        Ok(_) => println!("Model saved successfully!"),
        Err(e) => println!("Error saving model: {}", e),
    }

    Ok(())
}

Output

Epoch 1/200 - Loss: 0.2636616, Time: 0.000482592 sec
Epoch 2/200 - Loss: 0.265602, Time: 0.000444258 sec
Epoch 3/200 - Loss: 0.26768285, Time: 0.000398091 sec
...
Epoch 198/200 - Loss: 0.0010192227, Time: 0.000600476 sec
Epoch 199/200 - Loss: 0.0009878413, Time: 0.000510074 sec
Epoch 200/200 - Loss: 0.0009578406, Time: 0.000512518 sec

Training Completed!
Total Training Time: 0.11 sec
Predictions:

[0, 0] --> 0
[0, 1] --> 1
[1, 0] --> 1
[1, 1] --> 0

Confusion matrix:
[[2, 0],
 [0, 2]]

Accuracy: 1
Recall: 1
Precision: 1
F1: 1
Loss: 0.0009578406
Model saved successfully!

📊 Train and evaluation

Train the model

In order to train the model, you need to provide the training data, the labels and the training configuration. The training configuration is a struct that contains all the parameters that are used during the training process, such as the number of epochs, the cost function, the learning rate, the batch size, the optimizer, and whether to print the training process or not.

let train_data = array![[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]];
let labels = array![[0.0], [1.0], [1.0], [0.0]];

let loss = nn.train(train_data.view(), labels.view(), TrainConfig::default())?;

Predict the model

Once the model is trained, you can use it to make predictions on new data. To do this, you need to provide the input data to the predict method.

let input = array![1.0, 2.0];
let output = nn.predict(input.view())?;

Metrics

You can also calculate metrics for your models using MetricsCalculator:

let metrics = MetricsCalculator::new(labels.view(), predictions.view());

println!("\nConfusion matrix:\n{}\n", metrics.confusion_matrix());

println!(
    "Accuracy: {}\nRecall: {}\nPrecision: {}\nF1: {}\n",
    metrics.accuracy(),
    metrics.recall(), 
    metrics.precision(),
    metrics.f1_score()
);

This is the output of the iris example

Confusion matrix:
[[26, 0, 0],
 [0, 28, 1],
 [0, 2, 18]]

Accuracy: 0.96
Recall: 0.9551724137931035
Precision: 0.960233918128655
F1: 0.9574098218166016

Save and load models

When you already have a trained model you can save it into a HDF5 file:

nn.save("model.h5").unwrap();
let nn = NN::load("model.h5").unwrap();

🧰 Built-in Components

The crate defines some default layers, activations and costs that can be used in your model:

Default Layers

For now, the crate only offers these types of layers:

Layer	Description
`Dense`	Fully connected layer where each neuron connects to every neuron in the previous layer. It computes the weighted sum of inputs, adds a bias term, and applies an optional activation function (e.g., ReLU, Sigmoid). This layer is fundamental for transforming input data in deep learning models.
`Activation`	Applies a non-linear transformation (activation function) to its inputs. Common activation functions include ReLU, Sigmoid, Tanh, and Softmax. These functions introduce non-linearity to the model, allowing it to learn complex patterns.
`Flatten`	Flattens the input into a 1D array. This layer is useful when the input is a 2D array, but you want to treat it as a 1D array.
`Dropout`	Applies dropout, a regularization technique where randomly selected neurons are ignored during training. This helps prevent overfitting by reducing reliance on specific neurons and forces the network to learn more robust features. Dropout is typically used in the training phase and is deactivated during inference.

[!NOTE] More layers in the future.

Cost functions

The crate also provides a set of cost functions that can be used in the training process, these are represented by the Cost enum::

Cost::MSE: Mean Squared Error. This cost function measures the average squared difference between the predicted and actual values.
```
\text{MSE}(y_p, y) = \frac{1}{n} \sum_{i=1}^{n} (y_p - y)^2
```
Cost::MAE: maps the input to a value between 0 and 1, which is the probability of the input being 1.
```
\text{MAE}(y_p, y) = \frac{1}{n} \sum_{i=1}^{n} |y_p - y|
```
Cost::BCE: maps the input to 0 if it is negative, and the input itself if it is positive.
```
\text{BCE}(y_p, y) = -\frac{1}{n} \sum_{i=1}^{n} y_p \log(y) + (1 - y_p) \log(1 - y)
```
Cost::CCE: maps the input to a value between -1 and 1, which is the ratio of the input to the hyperbolic tangent of the input.
```
\text{CCE}(y_p, y) = -\frac{1}{n} \sum_{i=1}^{n} y_p \log(y)
```

Activation functions

The crate provides a set of activation functions that can be used in the Activation layer, these are represented by the Act enum:

Act::Step: maps the input to 0 if it is negative, and 1 if it is positive.

\text{step}(x) = 
\begin{cases} 
0 & \text{if } x < 0 \\ 
1 & \text{if } x \geq 0 
\end{cases}

Act::Sigmoid: maps the input to a value between 0 and 1, which is the probability of the input being 1.
```
\text{sigmoid}(x) = \frac{1}{1 + e^{-x}}
```
Act::ReLU: maps the input to 0 if it is negative, and the input itself if it is positive.
```
\text{ReLU}(x) = \max(0, x)
```
Act::Tanh: maps the input to a value between -1 and 1, which is the ratio of the input to the hyperbolic tangent of the input.
```
\text{tanh}(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
```
Act::Softmax: maps the input to a probability distribution over the possible values of the input.
```
\text{softmax}(x) = \frac{e^x}{e^x + \sum_{i=1}^{n} e^{x_i}}
```

🛠️ Customization

One of the main goals of the mininn crate is to provide a flexible and customizable framework for building and training neural networks. This section will cover how to create your own layers, activations and costs and how to register them with the framework.

Custom layers

All layers in the network are required to implement the Layer trait. This ensures that users can define their own custom layers while maintaining compatibility with the framework.

To fulfill this requirement, every layer must also implement the following traits in addition to Layer:

Debug: For inspecting print layers information.
Clone: To enable copying of layer instances.
Serialize and Deserialize: For seamless serialization and deserialization, typically using serde.

Here is a little example about how to create custom layers:

use mininn::prelude::*;
use serde::{Deserialize, Serialize};
use ndarray::ArrayViewD;

// The implementation of the custom layer
#[derive(Layer, Debug, Clone, Serialize, Deserialize)]
struct CustomLayer;

impl TrainLayer for CustomLayer {
    fn forward(&mut self, input: ArrayViewD<f32>, _mode: &NNMode) -> NNResult<ArrayD<f32>> {
        todo!()
    }

    fn backward(
        &mut self,
        output_gradient: ArrayViewD<f32>,
        _learning_rate: f32,
        _optimizer: &Optimizer,
        _mode: &NNMode,
    ) -> NNResult<ArrayD<f32>> {
        todo!()
    }
}

fn main() {
    let nn = NN::new()
        .add(CustomLayer::new()).unwrap();
    nn.save("custom_layer.h5").unwrap();
}

Custom Activation Functions

You can also create your own activation functions by implementing the ActivationFunction and Debug traits.

use mininn::prelude::*;
use ndarray::{array, ArrayViewD};

#[derive(ActivationFunction, Debug, Clone)]
struct CustomActivation;

impl ActCore for CustomActivation {
    fn function(&self, z: &ArrayViewD<f32>) -> ArrayD<f32> {
        z.mapv(|x| x.powi(2))
    }

    fn derivate(&self, z: &ArrayViewD<f32>) -> ArrayD<f32> {
        z.mapv(|x| 2. * x)
    }
}


fn main() {
    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(CustomActivation))?
        .add(Dense::new(3, 1).apply(CustomActivation))?;
    let dense_layers = nn.extract_layers::<Dense>().unwrap();
    assert_eq!(dense_layers.len(), 2);
    assert_eq!(dense_layers[0].activation().unwrap(), "CustomActivation");
    assert_eq!(dense_layers[1].activation().unwrap(), "CustomActivation");
}

Custom Cost Functions

You can also create your own cost functions by implementing the CostFunction and Debug traits.

use mininn::prelude::*;
use ndarray::{array, ArrayViewD};

#[derive(CostFunction, Debug, Clone)]
struct CustomCost;

impl CostCore for CustomCost {
    fn function(&self, y_p: &ArrayViewD<f32>, y: &ArrayViewD<f32>) -> f32 {
        (y - y_p).abs().mean().unwrap_or(0.)
    }

    fn derivate(&self, y_p: &ArrayViewD<f32>, y: &ArrayViewD<f32>) -> ArrayD<f32> {
        (y_p - y).signum() / y.len() as f32
    }
}

fn main() {
    let mut nn = NN::new()
        .add(Dense::new(2, 3).apply(Act::Tanh))
        .unwrap()
        .add(Dense::new(3, 1).apply(Act::Tanh))
        .unwrap();

    let train_data = array![
        [0.0, 0.0],
        [0.0, 1.0],
        [1.0, 0.0],
        [1.0, 1.0]
    ];
    
    let labels = array![[0.0], [1.0], [1.0], [0.0]];

    let train_config = TrainConfig::new()
        .epochs(1)
        .learning_rate(0.1)
        .cost(CustomCost);

    let train_result = nn.train(train_data.view(), labels.view(), train_config);

    assert!(train_result.is_ok());
}

Register layers, activations and costs

For use your custom layers, activation functions, or cost functions in the load method, you need to register them first:

fn main() {
    // You can use the register builder to register your own layers, activations and costs
    Register::new()
        .with_layer::<CustomLayer>()
        .with_layer::<CustomLayer1>()
        .with_activation::<CustomActivation>()
        .with_cost::<CustomCost>()
        .register();

    // Or you can use the register! macro to register your own layers, activations and costs
    register!(
        layers: [CustomLayer, CustomLayer1],
        acts: [CustomActivation],
        costs: [CustomCost]
    );

    let nn = NN::load("custom_layer.h5").unwrap();
    for layer in nn.extract_layers::<CustomLayer>().unwrap() {
        println!("{}", layer.layer_type())
    }
    println!("{}", nn.train_config().cost.cost_name());
}

The register! macro can be used to register your layers, activations, costs or all of them at once.

register!(
    layers: [CustomLayer, CustomLayer1],
    acts: [CustomActivation],
    costs: [CustomCost]
);

register!(
    layers: [CustomLayer],
    acts: [CustomActivation]
);

register!(layers: [CustomLayer]);

register!(acts: [CustomActivation]);

register!(costs: [CustomCost]);

📋 Examples

There is a multitude of examples resolving classics ML problems, if you want to see the results just run these commands.

cargo run --example iris
cargo run --example xor [optional_path_to_model]     # If no path is provided, the model won't be saved
cargo run --example mnist [optional_path_to_model]   # If no path is provided, the model won't be saved
cargo run --example xor_load_nn <path_to_model>
cargo run --example mnist_load_nn <path_to_model>

📑 Libraries used

ndarray - For manage N-Dimensional Arrays.
ndarray-rand - For manage Random N-Dimensional Arrays.
serde - For serialization.
rmp_serde - For MSGPack serialization.
hdf5 - For model storage.
dyn-clone - For cloning trait objects.

💻 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests, see CONTRIBUTING.md for more information.

🔑 License

MIT - Created by Paco Algar.

mininn 0.1.4