Neural network layer implementations
This module provides implementations of various neural network layers such as dense (fully connected), attention, convolution, pooling, etc. Layers are the fundamental building blocks of neural networks.
§Overview
Neural network layers transform input data through learned parameters (weights and biases). Each layer implements the Layer trait, which defines the interface for forward and backward propagation, parameter management, and training/evaluation modes.
§Available Layer Types
§Core Layers
- Dense: Fully connected linear transformation
- Conv2D: 2D convolutional layers for image processing
- Embedding: Lookup tables for discrete inputs (words, tokens)
§Activation & Regularization
- Dropout: Randomly sets inputs to zero during training
- BatchNorm/LayerNorm: Normalization for stable training
- ActivityRegularization: L1/L2 penalties on activations
§Pooling & Reshaping
- MaxPool2D/AdaptiveMaxPool2D: Spatial downsampling
- GlobalAvgPool2D: Global spatial average pooling
§Attention & Sequence
- MultiHeadAttention: Transformer-style attention mechanism
- LSTM/GRU: Recurrent layers for sequences
- Bidirectional: Wrapper for bidirectional RNNs
§Embedding & Positional
- PositionalEmbedding: Learned positional encodings
- PatchEmbedding: Convert image patches to embeddings
§Examples
§Creating a Simple Dense Layer
use scirs2_neural::layers::{Layer, Dense};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
// Create a dense layer: 784 inputs -> 128 outputs with ReLU activation
let dense = Dense::<f64>::new(784, 128, Some("relu"), &mut rng)?;
// Create input batch (batch_size=2, features=784)
let input = Array::zeros((2, 784)).into_dyn();
// Forward pass
let output = dense.forward(&input)?;
assert_eq!(output.shape(), &[2, 128]);
println!("Layer type: {}", dense.layer_type());
println!("Parameters: {}", dense.parameter_count());
§Building a Sequential Model
use scirs2_neural::layers::{Layer, Dense, Dropout};
use scirs2_neural::models::{Sequential, Model};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
let mut model: Sequential<f32> = Sequential::new();
// Build a multi-layer network
model.add_layer(Dense::<f32>::new(784, 512, Some("relu"), &mut rng)?);
model.add_layer(Dropout::<f32>::new(0.2, &mut rng)?);
model.add_layer(Dense::<f32>::new(512, 256, Some("relu"), &mut rng)?);
model.add_layer(Dropout::<f32>::new(0.2, &mut rng)?);
model.add_layer(Dense::<f32>::new(256, 10, Some("softmax"), &mut rng)?);
// Input: batch of MNIST-like images (batch_size=32, flattened=784)
let input = Array::zeros((32, 784)).into_dyn();
// Forward pass through entire model
let output = model.forward(&input)?;
assert_eq!(output.shape(), &[32, 10]); // 10-class predictions
println!("Model has {} layers", model.num_layers());
let total_params: usize = model.layers().iter().map(|l| l.parameter_count()).sum();
println!("Total parameters: {}", total_params);
§Using Convolutional Layers
use scirs2_neural::layers::{Layer, Conv2D, MaxPool2D, PaddingMode};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
// Create conv layer: 3 input channels -> 32 output channels, 3x3 kernel
let conv = Conv2D::<f64>::new(3, 32, (3, 3), (1, 1), PaddingMode::Same, &mut rng)?;
let pool = MaxPool2D::<f64>::new((2, 2), (2, 2), None)?; // 2x2 max pooling
// Input: batch of RGB images (batch=4, channels=3, height=32, width=32)
let input = Array::zeros((4, 3, 32, 32)).into_dyn();
// Apply convolution then pooling
let conv_out = conv.forward(&input)?;
assert_eq!(conv_out.shape(), &[4, 32, 32, 32]); // Same padding preserves the spatial size
let pool_out = pool.forward(&conv_out)?;
assert_eq!(pool_out.shape(), &[4, 32, 16, 16]); // Pooling halved spatial dims
§Training vs Evaluation Mode
use scirs2_neural::layers::{Layer, Dropout, BatchNorm};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
let mut dropout = Dropout::<f64>::new(0.5, &mut rng)?;
let mut batchnorm = BatchNorm::<f64>::new(128, 0.9, 1e-5, &mut rng)?;
let input = Array::ones((10, 128)).into_dyn();
// Training mode (default)
assert!(dropout.is_training());
let train_output = dropout.forward(&input)?;
// Some outputs will be zero due to dropout
// Switch both layers to evaluation mode
dropout.set_training(false);
batchnorm.set_training(false);
let eval_output = dropout.forward(&input)?;
// No dropout applied, all outputs preserved but scaled
§Custom Layer Implementation
use scirs2_neural::layers::Layer;
use scirs2_neural::error::Result;
use ndarray::{Array, ArrayD, ScalarOperand};
use num_traits::Float;
use std::fmt::Debug;
// Custom activation layer that squares the input
struct SquareLayer;
impl<F: Float + Debug + ScalarOperand> Layer<F> for SquareLayer {
    fn forward(&self, input: &ArrayD<F>) -> Result<ArrayD<F>> {
        Ok(input.mapv(|x| x * x))
    }
    fn backward(&self, input: &ArrayD<F>, grad_output: &ArrayD<F>) -> Result<ArrayD<F>> {
        // Derivative of x^2 is 2x
        Ok(grad_output * &input.mapv(|x| x + x))
    }
    fn update(&mut self, _learning_rate: F) -> Result<()> {
        Ok(()) // No parameters to update
    }
    fn as_any(&self) -> &dyn std::any::Any { self }
    fn as_any_mut(&mut self) -> &mut dyn std::any::Any { self }
    fn layer_type(&self) -> &str { "Square" }
}
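Continuing the example, the custom layer can be exercised directly; this snippet assumes the SquareLayer and Layer trait definitions above are in scope and relies only on the trait methods implemented there.
use ndarray::arr1;
let layer = SquareLayer;
let input = arr1(&[1.0_f64, 2.0, 3.0]).into_dyn();
// Forward pass: element-wise square
let output = layer.forward(&input)?;
assert_eq!(output.as_slice().unwrap(), &[1.0, 4.0, 9.0]);
// Backward pass with an upstream gradient of ones returns the local derivative 2x
let grad_output = input.mapv(|_| 1.0);
let grad_input = layer.backward(&input, &grad_output)?;
assert_eq!(grad_input.as_slice().unwrap(), &[2.0, 4.0, 6.0]);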
§Layer Design Patterns
§Parameter Initialization
Most layers use random number generators for weight initialization; the common schemes are listed below, with a code sketch after the list:
- Xavier/Glorot: Good for tanh/sigmoid activations
- He/Kaiming: Better for ReLU activations
- Random Normal: Simple baseline
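These schemes are standard formulas rather than anything specific to this crate. The sketch below illustrates the math with ndarray plus the rand and rand_distr crates (the rand_distr dependency is an assumption of this sketch); it is not the initializer that Dense uses internally.
use ndarray::Array2;
use rand::rngs::SmallRng;
use rand::SeedableRng;
use rand_distr::{Distribution, Normal};
let (fan_in, fan_out) = (784usize, 128usize);
let mut rng = SmallRng::seed_from_u64(42);
// Xavier/Glorot: std = sqrt(2 / (fan_in + fan_out)), suits tanh/sigmoid
let xavier_std = (2.0 / (fan_in + fan_out) as f64).sqrt();
// He/Kaiming: std = sqrt(2 / fan_in), suits ReLU
let he_std = (2.0 / fan_in as f64).sqrt();
println!("xavier std = {:.4}, he std = {:.4}", xavier_std, he_std);
// Sample a (fan_out, fan_in) weight matrix from N(0, he_std^2)
let dist = Normal::new(0.0, he_std).unwrap();
let weights = Array2::<f64>::from_shape_fn((fan_out, fan_in), |_| dist.sample(&mut rng));
assert_eq!(weights.shape(), &[128, 784]);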
§Memory Management
- Use set_training(false) during inference to disable dropout and enable batch norm inference mode (a sketch for toggling a whole layer stack follows this list)
- Sequential containers manage memory efficiently by reusing intermediate buffers
- Large models benefit from gradient checkpointing (available in the memory_efficient module)
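One possible pattern for inference is to hold layers as trait objects and flip them all to evaluation mode before the forward pass. The sketch below assumes Layer<f64> can be boxed as a trait object and that set_training and forward behave as in the examples above; it is not a replacement for Sequential.
use scirs2_neural::layers::{Layer, Dense, Dropout};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
// Hold the stack as boxed trait objects so the layers can be toggled and run uniformly
let mut layers: Vec<Box<dyn Layer<f64>>> = vec![
    Box::new(Dense::<f64>::new(784, 128, Some("relu"), &mut rng)?),
    Box::new(Dropout::<f64>::new(0.2, &mut rng)?),
    Box::new(Dense::<f64>::new(128, 10, Some("softmax"), &mut rng)?),
];
// Put every layer into evaluation mode before inference
for layer in layers.iter_mut() {
    layer.set_training(false);
}
// Run a forward pass layer by layer
let mut activation = Array::zeros((1, 784)).into_dyn();
for layer in layers.iter() {
    activation = layer.forward(&activation)?;
}
assert_eq!(activation.shape(), &[1, 10]);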
§Gradient Flow
- Always implement both the forward and backward methods
- The backward method should compute gradients w.r.t. the inputs and update internal parameter gradients
- Use the update method to apply gradients with a learning rate (a minimal training-step sketch follows this list)
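As a calling-pattern sketch, a single manual training step looks like the following. The loss gradient here is a hand-written mean-squared-error gradient; how backward accumulates parameter gradients and how update applies them is up to the layer implementation.
use scirs2_neural::layers::{Layer, Dense};
use ndarray::Array;
use rand::rngs::SmallRng;
use rand::SeedableRng;
let mut rng = SmallRng::seed_from_u64(42);
let mut dense = Dense::<f64>::new(4, 2, None, &mut rng)?;
let input = Array::zeros((8, 4)).into_dyn();
let target = Array::ones((8, 2)).into_dyn();
// Forward pass
let output = dense.forward(&input)?;
// Gradient of mean squared error w.r.t. the output: 2 * (output - target) / n
let n = output.len() as f64;
let grad_output = (&output - &target).mapv(|g| 2.0 * g / n);
// Backward pass: returns gradients w.r.t. the input and records parameter gradients
let _grad_input = dense.backward(&input, &grad_output)?;
// Apply the recorded parameter gradients with a learning rate
dense.update(0.01)?;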
Re-exports§
pub use dense::Dense;
pub use recurrent::Bidirectional;
pub use recurrent::GRUConfig;
pub use recurrent::LSTMConfig;
pub use recurrent::RNNConfig;
pub use recurrent::RecurrentActivation;
pub use recurrent::GRU;
pub use recurrent::LSTM;
pub use recurrent::RNN;
Modules§
- dense - Dense (fully connected) layer implementation
- recurrent - Recurrent neural network layer implementations
Structs§
- ActivityRegularization - Activity regularization layer
- AdaptiveAvgPool1D - 1D Adaptive Average Pooling layer
- AdaptiveAvgPool2D - Adaptive Average Pooling 2D layer
- AdaptiveAvgPool3D - 3D Adaptive Average Pooling layer
- AdaptiveMaxPool1D - 1D Adaptive Max Pooling layer
- AdaptiveMaxPool2D - Adaptive Max Pooling 2D layer
- AdaptiveMaxPool3D - 3D Adaptive Max Pooling layer
- AttentionConfig - Configuration for attention
- BatchNorm - Batch Normalization layer
- Conv2D - 2D Convolutional layer for neural networks
- Dropout - Dropout layer
- Embedding - Embedding layer that stores embeddings for discrete inputs
- EmbeddingConfig - Configuration for the Embedding layer
- GlobalAvgPool2D - Global Average Pooling 2D layer
- L1ActivityRegularization - L1 Activity Regularization layer
- L2ActivityRegularization - L2 Activity Regularization layer
- LayerNorm - Layer Normalization layer
- LayerNorm2D - 2D Layer Normalization for 2D convolutional networks
- MaxPool2D - 2D MaxPooling layer for neural networks
- MultiHeadAttention - Multi-head attention layer as used in transformer architectures
- PatchEmbedding - Patch Embedding layer for vision transformers
- PositionalEmbedding - Positional Embedding layer for transformers and sequence models
- SelfAttention - Self-attention layer that uses the same input for query, key, and value
- Sequential - Sequential container for neural network layers
- ThreadSafeBidirectional - Thread-safe version of Bidirectional RNN wrapper
- ThreadSafeRNN - Thread-safe version of RNN for sequence processing
Enums§
- AttentionMask - Different types of attention masks
- LayerConfig - Configuration enum for different types of layers
- PaddingMode - Padding mode for convolutional layers
- ThreadSafeRecurrentActivation - Activation function types for recurrent layers
Traits§
- Layer - Base trait for neural network layers
- ParamLayer - Trait for layers with parameters (weights, biases)