Module nn

Neural network modules for deep learning.

This module provides PyTorch-compatible neural network building blocks following the API design described in Paszke et al. (2019).

§Architecture

The nn module is organized around the Module trait, which defines the interface for all neural network layers, as the example below shows.

§Example

use aprender::nn::{Module, Linear, ReLU, Sequential};
use aprender::autograd::Tensor;

// Build a simple MLP
let model = Sequential::new()
    .add(Linear::new(784, 256))
    .add(ReLU::new())
    .add(Linear::new(256, 10));

// Forward pass
let x = Tensor::randn(&[32, 784]);  // batch of 32
let output = model.forward(&x);     // [32, 10]

// Get all parameters for optimizer
let params = model.parameters();

§References

  • Paszke, A., et al. (2019). PyTorch: An imperative style, high-performance deep learning library. NeurIPS.
  • Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. AISTATS.
  • He, K., et al. (2015). Delving deep into rectifiers. ICCV.

Re-exports§

pub use functional as F;
pub use gnn::AdjacencyMatrix;
pub use gnn::GATConv;
pub use gnn::GCNConv;
pub use gnn::MessagePassing;
pub use gnn::SAGEAggregation;
pub use gnn::SAGEConv;
pub use loss::BCEWithLogitsLoss;
pub use loss::CrossEntropyLoss;
pub use loss::L1Loss;
pub use loss::MSELoss;
pub use loss::NLLLoss;
pub use loss::Reduction;
pub use loss::SmoothL1Loss;
pub use optim::Adam;
pub use optim::AdamW;
pub use optim::Optimizer;
pub use optim::RMSprop;
pub use optim::SGD;
pub use scheduler::CosineAnnealingLR;
pub use scheduler::ExponentialLR;
pub use scheduler::LRScheduler;
pub use scheduler::LinearWarmup;
pub use scheduler::PlateauMode;
pub use scheduler::ReduceLROnPlateau;
pub use scheduler::StepLR;
pub use scheduler::WarmupCosineScheduler;

Modules§

functional
Functional interface for neural network operations.
generation
Sequence generation and decoding algorithms.
gnn
Graph Neural Network layers for learning on graph-structured data.
loss
Differentiable loss functions for neural network training.
optim
Gradient-based optimizers for neural network training; a combined training-step sketch follows this list.
quantization
Quantization-Aware Training (QAT) module.
scheduler
Learning rate schedulers for training neural networks.
self_supervised
Self-supervised learning pretext tasks.
serialize
Neural network model serialization.
vae
Variational Autoencoder (VAE) module.
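
The loss, optim, and scheduler modules compose into a standard training step. The sketch below is illustrative, not verbatim crate API: only the Sequential, Linear, ReLU, Tensor::randn, forward, and parameters calls follow the documented example above, while the MSELoss, SGD, and StepLR constructors and the zero_grad/backward/step methods are assumptions modeled on the PyTorch design this crate follows.

use aprender::nn::{Linear, MSELoss, Module, ReLU, SGD, Sequential, StepLR};
use aprender::autograd::Tensor;

let model = Sequential::new()
    .add(Linear::new(784, 256))
    .add(ReLU::new())
    .add(Linear::new(256, 10));

let criterion = MSELoss::new();                          // assumed constructor
let mut optimizer = SGD::new(model.parameters(), 0.01);  // assumed (params, lr)
let mut scheduler = StepLR::new(10, 0.1);                // assumed (step_size, gamma)

for _epoch in 0..100 {
    let x = Tensor::randn(&[32, 784]);
    let target = Tensor::randn(&[32, 10]);

    let output = model.forward(&x);
    let loss = criterion.forward(&output, &target);  // assumed loss call

    optimizer.zero_grad();           // assumed: clear accumulated gradients
    loss.backward();                 // assumed: autograd backward pass
    optimizer.step();                // assumed: apply the parameter update
    scheduler.step(&mut optimizer);  // assumed: adjust the learning rate
}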

Structs§

ALiBi
ALiBi (Attention with Linear Biases) (Press et al., 2022).
AlphaDropout
Alpha Dropout for SELU activations.
AvgPool2d
Average Pooling 2D.
BatchNorm1d
Batch Normalization for 1D inputs (Ioffe & Szegedy, 2015).
Bidirectional
Bidirectional RNN wrapper.
Conv1d
1D Convolution layer.
Conv2d
2D Convolution layer.
ConvDimensionNumbers
Fully describes input, kernel, and output data format for a convolution.
DropBlock
DropBlock regularization (Ghiasi et al., 2018).
DropConnect
DropConnect regularization (Wan et al., 2013).
Dropout
Dropout regularization layer.
Dropout2d
2D Dropout (Spatial Dropout).
Flatten
Flatten layer.
GELU
Gaussian Error Linear Unit (GELU) activation.
GRU
Gated Recurrent Unit (GRU) layer.
GlobalAvgPool2d
Global Average Pooling 2D.
GroupNorm
Group Normalization (Wu & He, 2018).
GroupedQueryAttention
Grouped Query Attention (GQA).
InstanceNorm
Instance Normalization.
LSTM
Long Short-Term Memory (LSTM) layer.
LayerNorm
Layer Normalization (Ba et al., 2016).
LeakyReLU
Leaky ReLU activation: LeakyReLU(x) = max(negative_slope * x, x)
Linear
Fully connected layer: y = xW^T + b
LinearAttention
Linear Attention with kernel feature maps.
MaxPool1d
Max Pooling 1D.
MaxPool2d
Max Pooling 2D.
ModuleDict
Dictionary of named modules with string-based access.
ModuleList
List of modules with index-based access.
MultiHeadAttention
Multi-Head Attention (Vaswani et al., 2017).
PositionalEncoding
Sinusoidal Positional Encoding (Vaswani et al., 2017).
RMSNorm
Root Mean Square Layer Normalization (Zhang & Sennrich, 2019).
ReLU
Rectified Linear Unit activation: ReLU(x) = max(0, x)
RotaryPositionEmbedding
Rotary Position Embedding (RoPE) (Su et al., 2021).
Sequential
Sequential container for chaining modules; used in the sketch after this list.
Sigmoid
Sigmoid activation: σ(x) = 1 / (1 + exp(-x))
Softmax
Softmax activation: softmax(x)_i = exp(x_i) / Σ_j exp(x_j)
Tanh
Tanh activation: tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
TransformerDecoderLayer
Transformer Decoder Layer.
TransformerEncoderLayer
Transformer Encoder Layer.
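
Activation, normalization, and regularization layers chain through the same Sequential container shown above. In the minimal sketch below, only the Sequential/Linear/ReLU construction follows the documented example verbatim; the LayerNorm::new(dim) and Dropout::new(p) arguments are assumptions modeled on the PyTorch equivalents.

use aprender::nn::{Dropout, LayerNorm, Linear, Module, ReLU, Sequential};
use aprender::autograd::Tensor;

let model = Sequential::new()
    .add(Linear::new(784, 256))
    .add(LayerNorm::new(256))   // assumed arg: normalized feature dimension
    .add(ReLU::new())
    .add(Dropout::new(0.5))     // assumed arg: drop probability
    .add(Linear::new(256, 10));

let x = Tensor::randn(&[32, 784]);
let logits = model.forward(&x);  // [32, 10]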

Enums§

ConvLayout
Data layout for convolution inputs and outputs.
KernelLayout
Kernel (weight) layout for convolution filters.

Traits§

Module
Base trait for all neural network modules; a sketch of a custom implementation follows.
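
Custom layers plug into the containers above by implementing Module. The trait's definition is not reproduced on this page, so the sketch below assumes it requires the forward and parameters methods used in the example at the top; the exact signatures, the parameters return type, and the Tensor addition operator are all assumptions.

use aprender::nn::{Linear, Module};
use aprender::autograd::Tensor;

// A residual wrapper: forward(x) = inner(x) + x.
struct Residual {
    inner: Linear,
}

impl Module for Residual {
    fn forward(&self, x: &Tensor) -> Tensor {
        // Assumed: Tensor implements element-wise addition.
        self.inner.forward(x) + x
    }

    fn parameters(&self) -> Vec<Tensor> {
        // Assumed return type; delegate to the wrapped layer.
        self.inner.parameters()
    }
}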

Functions§

generate_causal_mask
Generate causal (triangular) attention mask.
kaiming_normal
Kaiming normal initialization (He et al., 2015); see the initializer sketch after this list.
kaiming_uniform
Kaiming uniform initialization (He et al., 2015).
xavier_normal
Xavier normal initialization (Glorot & Bengio, 2010).
xavier_uniform
Xavier uniform initialization (Glorot & Bengio, 2010).
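
The initializers support building weight tensors outside the built-in layers. A minimal sketch under assumed signatures: each function is taken here to accept a shape and return a freshly initialized Tensor, though the crate may instead initialize an existing tensor in place.

use aprender::nn::{kaiming_normal, xavier_uniform};

// Xavier/Glorot scales variance by fan_in + fan_out and suits
// tanh/sigmoid layers (Glorot & Bengio, 2010).
let w1 = xavier_uniform(&[256, 784]);  // assumed: shape -> Tensor

// Kaiming/He scales variance by fan_in and suits ReLU layers
// (He et al., 2015).
let w2 = kaiming_normal(&[10, 256]);   // assumed: shape -> Tensor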