Module layers

Neural network layer implementations

This module provides implementations of common neural network layers, such as dense (fully connected), convolutional, pooling, attention, and recurrent layers. Layers are the fundamental building blocks of neural networks.

§Overview

Neural network layers transform input data through learned parameters (weights and biases). Each layer implements the Layer trait, which defines the interface for forward and backward propagation, parameter management, and training/evaluation modes.
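All of the examples below go through this shared interface; as a minimal sketch (assuming the Layer trait is object-safe, which its use inside Sequential containers suggests), a layer can be driven purely through the trait:

use scirs2_neural::layers::{Layer, Dense};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(0);
// Hold the layer behind the shared trait object
let layer: Box<dyn Layer<f32>> = Box::new(Dense::<f32>::new(4, 3, Some("relu"), &mut rng)?);

let input = Array::zeros((1, 4)).into_dyn();
let output = layer.forward(&input)?;
assert_eq!(output.shape(), &[1, 3]);
println!("Layer type: {}", layer.layer_type());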

§Available Layer Types

§Core Layers

  • Dense: Fully connected linear transformation
  • Conv2D: 2D convolutional layers for image processing
  • Embedding: Lookup tables for discrete inputs (words, tokens)

§Activation & Regularization

  • Dropout: Randomly sets inputs to zero during training
  • BatchNorm/LayerNorm: Normalization for stable training
  • ActivityRegularization: L1/L2 penalties on activations

§Pooling & Reshaping

  • MaxPool2D/AdaptiveMaxPool2D: Spatial downsampling
  • GlobalAvgPool2D: Global spatial average pooling

§Attention & Sequence

  • MultiHeadAttention: Transformer-style attention mechanism
  • LSTM/GRU: Recurrent layers for sequences
  • Bidirectional: Wrapper for bidirectional RNNs

§Embedding & Positional

  • PositionalEmbedding: Learned positional encodings
  • PatchEmbedding: Convert image patches to embeddings

§Examples

§Creating a Simple Dense Layer

use scirs2_neural::layers::{Layer, Dense};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(42);

// Create a dense layer: 784 inputs -> 128 outputs with ReLU activation
let dense = Dense::<f64>::new(784, 128, Some("relu"), &mut rng)?;

// Create input batch (batch_size=2, features=784)
let input = Array::zeros((2, 784)).into_dyn();

// Forward pass
let output = dense.forward(&input)?;
assert_eq!(output.shape(), &[2, 128]);

println!("Layer type: {}", dense.layer_type());
println!("Parameters: {}", dense.parameter_count());

§Building a Sequential Model

use scirs2_neural::layers::{Layer, Dense, Dropout};
use scirs2_neural::models::{Sequential, Model};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(42);
let mut model: Sequential<f32> = Sequential::new();

// Build a multi-layer network
model.add_layer(Dense::<f32>::new(784, 512, Some("relu"), &mut rng)?);
model.add_layer(Dropout::<f32>::new(0.2, &mut rng)?);
model.add_layer(Dense::<f32>::new(512, 256, Some("relu"), &mut rng)?);
model.add_layer(Dropout::<f32>::new(0.2, &mut rng)?);
model.add_layer(Dense::<f32>::new(256, 10, Some("softmax"), &mut rng)?);

// Input: batch of MNIST-like images (batch_size=32, flattened=784)
let input = Array::zeros((32, 784)).into_dyn();

// Forward pass through entire model
let output = model.forward(&input)?;
assert_eq!(output.shape(), &[32, 10]); // 10-class predictions

println!("Model has {} layers", model.num_layers());
let total_params: usize = model.layers().iter().map(|l| l.parameter_count()).sum();
println!("Total parameters: {}", total_params);

§Using Convolutional Layers

use scirs2_neural::layers::{Layer, Conv2D, MaxPool2D, PaddingMode};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(42);

// Create conv layer: 3 input channels -> 32 output channels, 3x3 kernel
let conv = Conv2D::<f64>::new(3, 32, (3, 3), (1, 1), PaddingMode::Same, &mut rng)?;
let pool = MaxPool2D::<f64>::new((2, 2), (2, 2), None)?; // 2x2 max pooling with stride 2

// Input: batch of RGB images (batch=4, channels=3, height=32, width=32)
let input = Array::zeros((4, 3, 32, 32)).into_dyn();

// Apply convolution then pooling
let conv_out = conv.forward(&input)?;
assert_eq!(conv_out.shape(), &[4, 32, 32, 32]); // Same padding preserves the spatial size

let pool_out = pool.forward(&conv_out)?;
assert_eq!(pool_out.shape(), &[4, 32, 16, 16]); // Pooling halves the spatial dimensions

§Training vs Evaluation Mode

use scirs2_neural::layers::{Layer, Dropout, BatchNorm};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(42);
let mut dropout = Dropout::<f64>::new(0.5, &mut rng)?;
let mut batchnorm = BatchNorm::<f64>::new(128, 0.9, 1e-5, &mut rng)?;

let input = Array::ones((10, 128)).into_dyn();

// Training mode (default)
assert!(dropout.is_training());
let train_output = dropout.forward(&input)?;
// Some outputs will be zero due to dropout

// Switch both layers to evaluation mode for inference
dropout.set_training(false);
batchnorm.set_training(false);

let eval_output = dropout.forward(&input)?;
// No dropout is applied in evaluation mode; every input passes through

§Custom Layer Implementation

use scirs2_neural::layers::Layer;
use scirs2_neural::error::Result;
use ndarray::{Array, ArrayD, ScalarOperand};
use num_traits::Float;
use std::fmt::Debug;

// Custom activation layer that squares the input
struct SquareLayer;

impl<F: Float + Debug + ScalarOperand> Layer<F> for SquareLayer {
    fn forward(&self, input: &ArrayD<F>) -> Result<ArrayD<F>> {
        Ok(input.mapv(|x| x * x))
    }

    fn backward(&self, input: &ArrayD<F>, grad_output: &ArrayD<F>) -> Result<ArrayD<F>> {
        // Derivative of x^2 is 2x
        Ok(grad_output * &input.mapv(|x| x + x))
    }

    fn update(&mut self, _learning_rate: F) -> Result<()> {
        Ok(()) // No parameters to update
    }

    fn as_any(&self) -> &dyn std::any::Any { self }
    fn as_any_mut(&mut self) -> &mut dyn std::any::Any { self }
    fn layer_type(&self) -> &str { "Square" }
}
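The custom layer can then be exercised like any built-in layer; this short check uses only the forward method defined above:

let square = SquareLayer;
let input = Array::from_vec(vec![1.0f64, -2.0, 3.0]).into_dyn();
let output = square.forward(&input)?;
assert_eq!(output.as_slice().unwrap(), &[1.0, 4.0, 9.0]); // each element squared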

§Layer Design Patterns

§Parameter Initialization

Most layers use a random number generator for weight initialization; the common schemes (sketched after this list) are:

  • Xavier/Glorot: Good for tanh/sigmoid activations
  • He/Kaiming: Better for ReLU activations
  • Random Normal: Simple baseline
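For reference, the corresponding scale factors are easy to compute by hand. The sketch below is a generic illustration using the rand and rand_distr crates, not the library's internal initializer, and the fan-in/fan-out values are arbitrary:

use rand::rngs::SmallRng;
use rand::{Rng, SeedableRng};
use rand_distr::Normal;

let (fan_in, fan_out) = (784usize, 128usize);

// Xavier/Glorot: std = sqrt(2 / (fan_in + fan_out)), suited to tanh/sigmoid
let xavier_std = (2.0 / (fan_in + fan_out) as f64).sqrt();
// He/Kaiming: std = sqrt(2 / fan_in), suited to ReLU
let he_std = (2.0 / fan_in as f64).sqrt();
println!("Xavier std = {:.4}, He std = {:.4}", xavier_std, he_std);

// Draw a weight matrix from the He distribution
let mut rng = SmallRng::seed_from_u64(42);
let dist = Normal::new(0.0, he_std).unwrap();
let weights: Vec<f64> = (0..fan_in * fan_out).map(|_| rng.sample(dist)).collect();
assert_eq!(weights.len(), fan_in * fan_out);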

§Memory Management

  • Use set_training(false) during inference to disable dropout and switch batch normalization to its running statistics
  • Sequential containers manage memory efficiently by reusing intermediate buffers
  • Large models benefit from gradient checkpointing (available in memory_efficient module)

§Gradient Flow

  • Always implement both forward and backward methods
  • The backward method should compute gradients w.r.t. inputs and update internal parameter gradients
  • Use the update method to apply the accumulated gradients with a learning rate (a single training step is sketched below)
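Putting these together, a single training step looks roughly like the sketch below. It is illustrative only: it assumes, per the contract above, that backward records parameter gradients inside the layer and that update then applies them with the learning rate; the mean-squared-error gradient is just an example:

use scirs2_neural::layers::{Layer, Dense};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;

let mut rng = SmallRng::seed_from_u64(42);
let mut dense = Dense::<f64>::new(4, 2, Some("relu"), &mut rng)?;

let input = Array::ones((8, 4)).into_dyn();
let target = Array::zeros((8, 2)).into_dyn();

// Forward pass
let output = dense.forward(&input)?;

// Gradient of a mean-squared-error loss w.r.t. the output (illustrative)
let grad_output = (&output - &target).mapv(|x| x * 2.0 / 8.0);

// Backward pass: returns the gradient w.r.t. the input and, per the contract
// above, records the parameter gradients inside the layer
let _grad_input = dense.backward(&input, &grad_output)?;

// Apply the recorded gradients with a learning rate
dense.update(0.01)?;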

Re-exports§

pub use dense::Dense;
pub use recurrent::Bidirectional;
pub use recurrent::GRUConfig;
pub use recurrent::LSTMConfig;
pub use recurrent::RNNConfig;
pub use recurrent::RecurrentActivation;
pub use recurrent::GRU;
pub use recurrent::LSTM;
pub use recurrent::RNN;

Modules§

dense
Dense (fully connected) layer implementation
recurrent
Recurrent neural network layer implementations

Structs§

ActivityRegularization
Activity regularization layer
AdaptiveAvgPool1D
1D Adaptive Average Pooling layer
AdaptiveAvgPool2D
Adaptive Average Pooling 2D layer
AdaptiveAvgPool3D
3D Adaptive Average Pooling layer
AdaptiveMaxPool1D
1D Adaptive Max Pooling layer
AdaptiveMaxPool2D
Adaptive Max Pooling 2D layer
AdaptiveMaxPool3D
3D Adaptive Max Pooling layer
AttentionConfig
Configuration for attention
BatchNorm
Batch Normalization layer
Conv2D
2D Convolutional layer for neural networks
Dropout
Dropout layer
Embedding
Embedding layer that stores embeddings for discrete inputs
EmbeddingConfig
Configuration for the Embedding layer
GlobalAvgPool2D
Global Average Pooling 2D layer
L1ActivityRegularization
L1 Activity Regularization layer
L2ActivityRegularization
L2 Activity Regularization layer
LayerNorm
Layer Normalization layer
LayerNorm2D
2D Layer Normalization for 2D convolutional networks
MaxPool2D
2D MaxPooling layer for neural networks
MultiHeadAttention
Multi-head attention layer as used in transformer architectures
PatchEmbedding
Patch Embedding layer for vision transformers
PositionalEmbedding
Positional Embedding layer for transformers and sequence models
SelfAttention
Self-attention layer that uses the same input for query, key, and value
Sequential
Sequential container for neural network layers
ThreadSafeBidirectional
Thread-safe version of Bidirectional RNN wrapper
ThreadSafeRNN
Thread-safe version of RNN for sequence processing

Enums§

AttentionMask
Different types of attention masks
LayerConfig
Configuration enum for different types of layers
PaddingMode
Padding mode for convolutional layers
ThreadSafeRecurrentActivation
Activation function types for recurrent layers

Traits§

Layer
Base trait for neural network layers
ParamLayer
Trait for layers with parameters (weights, biases)