Neural network modules for deep learning.
This module provides PyTorch-compatible neural network building blocks following the API design described in Paszke et al. (2019).
§Architecture
The nn module is organized around the Module trait, which defines the interface for all neural network layers:
- Layers: Linear, Conv1d, Conv2d, Flatten
- Pooling: MaxPool1d, MaxPool2d, AvgPool2d, GlobalAvgPool2d
- Activations: ReLU, Sigmoid, Tanh, GELU
- Normalization: BatchNorm1d, LayerNorm, GroupNorm, InstanceNorm, RMSNorm
- Regularization: Dropout, Dropout2d, AlphaDropout
- Containers: Sequential, ModuleList, ModuleDict
§Example
use aprender::nn::{Module, Linear, ReLU, Sequential};
use aprender::autograd::Tensor;
// Build a simple MLP
let model = Sequential::new()
.add(Linear::new(784, 256))
.add(ReLU::new())
.add(Linear::new(256, 10));
// Forward pass
let x = Tensor::randn(&[32, 784]); // batch of 32
let output = model.forward(&x); // [32, 10]
// Get all parameters for optimizer
let params = model.parameters();
§References
- Paszke, A., et al. (2019). PyTorch: An imperative style, high-performance deep learning library. NeurIPS.
- Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. AISTATS.
- He, K., et al. (2015). Delving deep into rectifiers. ICCV.
Re-exports§
pub use functional as F;
pub use gnn::AdjacencyMatrix;
pub use gnn::GATConv;
pub use gnn::GCNConv;
pub use gnn::MessagePassing;
pub use gnn::SAGEAggregation;
pub use gnn::SAGEConv;
pub use loss::BCEWithLogitsLoss;
pub use loss::CrossEntropyLoss;
pub use loss::L1Loss;
pub use loss::MSELoss;
pub use loss::NLLLoss;
pub use loss::Reduction;
pub use loss::SmoothL1Loss;
pub use optim::Adam;
pub use optim::AdamW;
pub use optim::Optimizer;
pub use optim::RMSprop;
pub use optim::SGD;
pub use scheduler::CosineAnnealingLR;
pub use scheduler::ExponentialLR;
pub use scheduler::LRScheduler;
pub use scheduler::LinearWarmup;
pub use scheduler::PlateauMode;
pub use scheduler::ReduceLROnPlateau;
pub use scheduler::StepLR;
pub use scheduler::WarmupCosineScheduler;
Modules§
- functional
- Functional interface for neural network operations.
- generation
- Sequence generation and decoding algorithms.
- gnn
- Graph Neural Network layers for learning on graph-structured data.
- loss
- Differentiable loss functions for neural network training.
- optim
- Gradient-based optimizers for neural network training.
- quantization
- Quantization-Aware Training (QAT) module.
- scheduler
- Learning rate schedulers for training neural networks.
- self_supervised
- Self-Supervised Learning Pretext Tasks.
- serialize
- Neural network model serialization.
- vae
- Variational Autoencoder (VAE) module.
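The loss and optim modules pair a differentiable loss with a parameter-update rule. Their interaction can be illustrated without any crate types; the following is a standalone sketch of one-weight gradient descent under MSE loss, not the crate's API:

```rust
// Standalone illustration of a loss + optimizer loop: repeated SGD steps
// on a single linear weight under MSE loss (plain Rust, no crate types).
fn main() {
    let (x, target) = (2.0f64, 8.0f64);
    let mut w = 1.0f64; // weight being trained
    let lr = 0.1; // learning rate, as an SGD optimizer would hold
    for _ in 0..100 {
        let pred = w * x;
        // MSE loss L = (pred - target)^2, so dL/dw = 2 * (pred - target) * x
        let grad = 2.0 * (pred - target) * x;
        w -= lr * grad; // the optimizer's update rule
    }
    assert!((w * x - target).abs() < 1e-6);
    println!("trained weight: {w}");
}
```

In the crate, the same loop would compute the loss via a module such as MSELoss and delegate the update to an Optimizer such as SGD or Adam.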
Structs§
- ALiBi
- ALiBi (Attention with Linear Biases) (Press et al., 2022).
- AlphaDropout
- Alpha Dropout for SELU activations.
- AvgPool2d
- Average Pooling 2D.
- BatchNorm1d
- Batch Normalization for 1D inputs (Ioffe & Szegedy, 2015).
- Bidirectional
- Bidirectional RNN wrapper.
- Conv1d
- 1D Convolution layer.
- Conv2d
- 2D Convolution layer.
- DropBlock
- DropBlock regularization (Ghiasi et al., 2018).
- DropConnect
- DropConnect regularization (Wan et al., 2013).
- Dropout
- Dropout regularization layer.
- Dropout2d
- 2D Dropout (Spatial Dropout).
- Flatten
- Flatten layer.
- GELU
- Gaussian Error Linear Unit (GELU) activation.
- GRU
- Gated Recurrent Unit (GRU) layer.
- GlobalAvgPool2d
- Global Average Pooling 2D.
- GroupNorm
- Group Normalization (Wu & He, 2018).
- GroupedQueryAttention
- Grouped Query Attention (GQA).
- InstanceNorm
- Instance Normalization.
- LSTM
- Long Short-Term Memory (LSTM) layer.
- LayerNorm
- Layer Normalization (Ba et al., 2016).
- LeakyReLU
- LeakyReLU activation: LeakyReLU(x) = max(negative_slope * x, x)
- Linear
- Fully connected layer: y = xW^T + b
- LinearAttention
- Linear Attention with kernel feature maps.
- MaxPool1d
- Max Pooling 1D.
- MaxPool2d
- Max Pooling 2D.
- ModuleDict
- Dictionary of named modules with string-based access.
- ModuleList
- List of modules with index-based access.
- MultiHeadAttention
- Multi-Head Attention (Vaswani et al., 2017).
- PositionalEncoding
- Sinusoidal Positional Encoding (Vaswani et al., 2017).
- RMSNorm
- Root Mean Square Layer Normalization (Zhang & Sennrich, 2019).
- ReLU
- Rectified Linear Unit activation: ReLU(x) = max(0, x)
- RotaryPositionEmbedding
- Rotary Position Embedding (RoPE) (Su et al., 2021).
- Sequential
- Sequential container for chaining modules.
- Sigmoid
- Sigmoid activation: σ(x) = 1 / (1 + exp(-x))
- Softmax
- Softmax activation: softmax(x)_i = exp(x_i) / Σ_j exp(x_j)
- Tanh
- Tanh activation: tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
- TransformerDecoderLayer
- Transformer Decoder Layer.
- TransformerEncoderLayer
- Transformer Encoder Layer.
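The activation formulas listed above (ReLU, Sigmoid, Softmax) can be checked in a few lines of plain Rust. This is a standalone numerical sketch, not the crate's Tensor-based implementation:

```rust
// Standalone check of the activation formulas listed above.
fn relu(x: f32) -> f32 {
    // ReLU(x) = max(0, x)
    x.max(0.0)
}

fn sigmoid(x: f32) -> f32 {
    // σ(x) = 1 / (1 + exp(-x))
    1.0 / (1.0 + (-x).exp())
}

fn softmax(xs: &[f32]) -> Vec<f32> {
    // softmax(x)_i = exp(x_i) / Σ_j exp(x_j), with max-subtraction
    // for numerical stability (does not change the result).
    let m = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|x| (x - m).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    assert_eq!(relu(-2.0), 0.0);
    assert_eq!(relu(3.0), 3.0);
    assert!((sigmoid(0.0) - 0.5).abs() < 1e-6);
    let s = softmax(&[1.0, 2.0, 3.0]);
    assert!((s.iter().sum::<f32>() - 1.0).abs() < 1e-6); // probabilities sum to 1
    assert!(s[2] > s[1] && s[1] > s[0]); // order is preserved
}
```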
Traits§
- Module
- Base trait for all neural network modules.
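As an illustration, a custom layer can implement a Module-like trait as sketched below. The trait shown here is a simplified stand-in: the crate's real Module trait operates on Tensor and also exposes parameters(), and the Scale layer is hypothetical.

```rust
// Simplified stand-in for the crate's Module trait, for illustration only.
trait Module {
    fn forward(&self, x: &[f32]) -> Vec<f32>;
}

// Hypothetical custom layer: scales every input by a learned factor.
struct Scale {
    factor: f32,
}

impl Module for Scale {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        x.iter().map(|v| v * self.factor).collect()
    }
}

fn main() {
    let layer = Scale { factor: 2.0 };
    assert_eq!(layer.forward(&[1.0, -3.0]), vec![2.0, -6.0]);
}
```

Because every layer implements the same trait, containers like Sequential can chain heterogeneous layers behind one forward interface.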
Functions§
- generate_causal_mask
- Generate causal (triangular) attention mask.
- kaiming_normal
- Kaiming normal initialization (He et al., 2015).
- kaiming_uniform
- Kaiming uniform initialization (He et al., 2015).
- xavier_normal
- Xavier normal initialization (Glorot & Bengio, 2010).
- xavier_uniform
- Xavier uniform initialization (Glorot & Bengio, 2010).
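The quantities behind these functions are easy to compute by hand. Xavier uniform draws from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)) (Glorot & Bengio, 2010), and Kaiming uniform for a ReLU nonlinearity uses a = sqrt(6 / fan_in) (He et al., 2015). The sketch below also builds a lower-triangular causal mask; the boolean convention (true = attendable) is an assumption for illustration, and the crate's generate_causal_mask may use a different representation:

```rust
// Standalone sketch of the initialization bounds and the causal-mask shape.
fn xavier_uniform_bound(fan_in: usize, fan_out: usize) -> f64 {
    // Glorot & Bengio (2010): a = sqrt(6 / (fan_in + fan_out))
    (6.0 / (fan_in + fan_out) as f64).sqrt()
}

fn kaiming_uniform_bound(fan_in: usize) -> f64 {
    // He et al. (2015), ReLU gain sqrt(2): a = sqrt(6 / fan_in)
    (6.0 / fan_in as f64).sqrt()
}

fn causal_mask(n: usize) -> Vec<Vec<bool>> {
    // Position i may attend to positions j <= i (lower triangle).
    (0..n).map(|i| (0..n).map(|j| j <= i).collect()).collect()
}

fn main() {
    let a = xavier_uniform_bound(784, 256);
    assert!((a - (6.0f64 / 1040.0).sqrt()).abs() < 1e-12);
    assert!(kaiming_uniform_bound(784) > a); // same fan_in, smaller denominator
    let m = causal_mask(3);
    assert!(m[0][0] && !m[0][1]); // first token sees only itself
    assert!(m[2][0] && m[2][1] && m[2][2]); // last token sees everything
}
```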