§ SciRS2 Neural Networks
scirs2-neural provides PyTorch-style neural network building blocks for Rust, with automatic differentiation integration and production-ready training utilities.
§ Key Features
- Layer-based Architecture: Modular neural network layers (Dense, Conv2D, LSTM, etc.)
- Activation Functions: Common activations (ReLU, Sigmoid, Tanh, GELU, etc.)
- Loss Functions: Classification and regression losses
- Training Utilities: Training loops, callbacks, and metrics
- Autograd Integration: Automatic differentiation via scirs2-autograd
- Type Safety: Compile-time shape and type checking where possible
§ Module Overview
| Module | Description |
|---|---|
| activations_minimal | Activation functions (ReLU, Sigmoid, Tanh, GELU, etc.) |
| layers | Neural network layers (Dense, Conv2D, LSTM, Dropout, etc.) |
| losses | Loss functions (MSE, CrossEntropy, Focal, Contrastive, etc.) |
| training | Training loops and utilities |
| autograd | Automatic differentiation integration |
| error | Error types and handling |
| utils | Helper utilities |
§ Quick Start
§ Installation
Add to your Cargo.toml:
[dependencies]
scirs2-neural = "0.1.5"
§ Building a Simple Neural Network
use scirs2_neural::prelude::*;
use scirs2_core::ndarray::Array2;
use scirs2_core::random::rng;
let mut rng = rng();
// Build a 3-layer MLP for MNIST
let mut model = Sequential::<f32>::new();
model.add(Dense::new(784, 256, Some("relu"), &mut rng).expect("failed to create dense layer"));
model.add(Dense::new(256, 128, Some("relu"), &mut rng).expect("failed to create dense layer"));
model.add(Dense::new(128, 10, None, &mut rng).expect("failed to create dense layer"));
println!("Model created with {} layers", model.len());
assert_eq!(model.len(), 3);
§ Using Individual Layers
use scirs2_neural::prelude::*;
use scirs2_core::ndarray::Array2;
use scirs2_core::random::rng;
let mut rng = rng();
// Dense layer
let dense = Dense::<f32>::new(10, 5, None, &mut rng).expect("failed to create dense layer");
// Activation functions
let relu = ReLU::new();
let sigmoid = Sigmoid::new();
let tanh_act = Tanh::new();
let gelu = GELU::new();
// Normalization layers
let batch_norm = BatchNorm::<f32>::new(5, 0.1, 1e-5, &mut rng).expect("failed to create batch norm");
let layer_norm = LayerNorm::<f32>::new(5, 1e-5, &mut rng).expect("failed to create layer norm");
§ Convolutional Networks
use scirs2_neural::prelude::*;
use scirs2_core::random::rng;
let mut rng = rng();
// Build a simple CNN
let mut model = Sequential::<f32>::new();
// Conv layers (in_channels, out_channels, kernel_size, stride, activation)
model.add(Conv2D::new(1, 32, (3, 3), (1, 1), Some("relu")).expect("conv2d failed"));
model.add(Conv2D::new(32, 64, (3, 3), (1, 1), Some("relu")).expect("conv2d failed"));
// Flatten and classify
model.add(Dense::new(64 * 28 * 28, 10, None, &mut rng).expect("dense failed"));
assert_eq!(model.len(), 3);
§ Recurrent Networks (LSTM)
use scirs2_neural::prelude::*;
use scirs2_core::random::rng;
let mut rng = rng();
// Build an LSTM-based model
let mut model = Sequential::<f32>::new();
// LSTM (input_size, hidden_size, rng)
model.add(LSTM::new(100, 256, &mut rng).expect("lstm failed"));
model.add(Dense::new(256, 10, None, &mut rng).expect("dense failed"));
assert_eq!(model.len(), 2);
§ Loss Functions
use scirs2_neural::prelude::*;
// Mean Squared Error (regression)
let mse = MeanSquaredError::new();
// Cross Entropy (classification)
let ce = CrossEntropyLoss::new(1e-7);
// Focal Loss (imbalanced classes)
let focal = FocalLoss::new(2.0, None, 1e-7);
// Contrastive Loss (metric learning)
let contrastive = ContrastiveLoss::new(1.0);
// Triplet Loss (metric learning)
let triplet = TripletLoss::new(1.0);
§ Training a Model
use scirs2_neural::prelude::*;
use scirs2_core::random::rng;
let mut rng = rng();
// Build model
let mut model = Sequential::<f32>::new();
model.add(Dense::new(784, 128, Some("relu"), &mut rng).expect("dense failed"));
model.add(Dense::new(128, 10, None, &mut rng).expect("dense failed"));
// Training configuration
let config = TrainingConfig {
learning_rate: 0.001,
batch_size: 32,
epochs: 10,
validation: Some(ValidationSettings {
enabled: true,
validation_split: 0.2,
batch_size: 32,
num_workers: 0,
}),
..Default::default()
};
// Create training session
let session = TrainingSession::<f32>::new(config);
assert_eq!(model.len(), 2);
§ Available Layers
§ Core Layers
- Dense: Fully connected (linear) layer
- Conv2D: 2D convolutional layer
- LSTM: Long Short-Term Memory recurrent layer
§ Activation Layers
- ReLU: Rectified Linear Unit
- Sigmoid: Sigmoid activation
- Tanh: Hyperbolic tangent
- GELU: Gaussian Error Linear Unit
- Softmax: Softmax for classification
§ Normalization Layers
- BatchNorm: Batch normalization
- LayerNorm: Layer normalization
§ Regularization Layers
- Dropout: Random dropout for regularization
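A minimal sketch of adding Dropout to a model follows. The (drop probability, rng) constructor signature is an assumption mirroring the other layer constructors; check the layers module docs for the exact signature.
use scirs2_neural::prelude::*;
use scirs2_core::random::rng;
let mut rng = rng();
let mut model = Sequential::<f32>::new();
model.add(Dense::new(128, 64, Some("relu"), &mut rng).expect("dense failed"));
// Drop 50% of activations during training; the (probability, rng) signature is an assumption.
model.add(Dropout::new(0.5, &mut rng).expect("dropout failed"));
assert_eq!(model.len(), 2);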
§ Loss Functions
§ Regression
- MeanSquaredError: L2 loss for regression
§ Classification
- CrossEntropyLoss: Standard classification loss
- FocalLoss: For imbalanced classification
§ Metric Learning
- ContrastiveLoss: Pairwise similarity learning
- TripletLoss: Triplet-based metric learning
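Losses are applied to predictions and targets through the Loss trait. The sketch below assumes a forward(predictions, targets) method returning a scalar; consult the losses module docs for the exact trait surface.
use scirs2_neural::prelude::*;
use scirs2_core::ndarray::array;
let mse = MeanSquaredError::new();
let predictions = array![[0.9_f32, 0.1], [0.2, 0.8]];
let targets = array![[1.0_f32, 0.0], [0.0, 1.0]];
// `forward` on the Loss trait is assumed here; see the losses docs for the real method name.
let loss = mse.forward(&predictions.into_dyn(), &targets.into_dyn()).expect("loss failed");
println!("MSE loss: {loss:?}");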
§ Design Philosophy
scirs2-neural follows PyTorch's design philosophy:
- Layer-based: Composable building blocks
- Explicit: Clear forward/backward passes
- Flexible: Easy to extend with custom layers
- Type-safe: Leverage Rust's type system
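As a sketch of what a custom building block can look like, the struct below exposes a forward pass over a dynamic-dimension array. A real custom layer would implement the Layer trait (see the layers module and examples/custom_layer.rs); the trait's required methods are not reproduced here, and this standalone struct is purely illustrative.
use scirs2_core::ndarray::ArrayD;
use scirs2_neural::error::Result;
// A layer that scales its input by a constant factor.
struct Scale {
    factor: f32,
}
impl Scale {
    // Element-wise forward pass; a full Layer impl would also provide the
    // backward pass and parameter access required by the trait.
    fn forward(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>> {
        Ok(input * self.factor)
    }
}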
§ Integration with SciRS2 Ecosystem
- scirs2-autograd: Automatic differentiation support
- scirs2-linalg: Matrix operations and decompositions
- scirs2-metrics: Model evaluation metrics
- scirs2-datasets: Sample datasets for training
- scirs2-vision: Computer vision utilities
- scirs2-text: Text processing for NLP models
§ Performance
scirs2-neural provides multiple optimization paths:
- Pure Rust: Fast, safe implementations
- SIMD: Vectorized operations where applicable
- Parallel: Multi-threaded training
- GPU: CUDA/Metal support (via scirs2-core)
§ Comparison with PyTorch
| Feature | PyTorch | scirs2-neural |
|---|---|---|
| Layer-based API | ✅ | ✅ |
| Autograd | ✅ | ✅ (via scirs2-autograd) |
| GPU Support | ✅ | ✅ (limited) |
| Dynamic Graphs | ✅ | ✅ |
| JIT Compilation | ✅ | ⚠️ (planned) |
| Production Deployment | ⚠️ | ✅ (native Rust) |
| Type Safety | ❌ | ✅ |
§ Examples
See the examples/ directory for complete examples:
- mnist_mlp.rs - Multi-layer perceptron for MNIST
- cifar_cnn.rs - Convolutional network for CIFAR-10
- sentiment_lstm.rs - LSTM for sentiment analysis
- custom_layer.rs - Creating custom layers
§ Version
Current version: 0.1.5 (Released January 15, 2026)
§ Re-exports
pub use activations_minimal::{Activation, ReLU, Sigmoid, Softmax, Tanh, GELU};
pub use error::{Error, NeuralError, Result};
pub use layers::{BatchNorm, Conv2D, Dense, Dropout, Layer, LayerNorm, Sequential, LSTM};
pub use losses::{ContrastiveLoss, CrossEntropyLoss, FocalLoss, Loss, MeanSquaredError, TripletLoss};
pub use training::{TrainingConfig, TrainingSession, EarlyStoppingConfig, EnhancedTrainer, EnhancedTrainingConfig, GradientAccumulationSettings, LRWarmupConfig, OptimizedDataLoader, OptimizedLoaderConfig, ProfilingConfig, ProfilingResults, ProgressConfig, TrainingState, ValidationConfig, WarmupSchedule};
pub use serialization::{ExtractParameters, ModelDeserialize, ModelFormat, ModelMetadata, ModelSerialize, NamedParameters, SafeTensorsReader, SafeTensorsWriter, TensorInfo};
pub use training::{best_checkpoint, checkpoint_dir_name, latest_checkpoint, list_checkpoints, load_checkpoint, save_checkpoint, CheckpointMetadata, OptimizerStateMetadata, ParamGroupState};
pub use distillation::{DistanceMetric, DistillationConfig, DistillationMethod, DistillationResult, DistillationStatistics, DistillationTrainer, EnsembleAggregation, FeatureAdaptation};
pub use quantization::{DynamicQuantizer, MixedBitWidthQuantizer, PostTrainingQuantizer, QuantizationAwareTraining, QuantizationConfig, QuantizationMode, QuantizationParams, QuantizationScheme, QuantizedTensor};
pub use training::{find_optimal_lr, LRFinder, LRFinderConfig, LRFinderResult, LRFinderStatus, LRScheduleType, CompetenceSchedule, CurriculumConfig, CurriculumLearner, CurriculumStrategy, AggregationMethod, ClientSelectionStrategy, ClientUpdate, FederatedConfig, FederatedServer, Bottleneck, LayerProfile, ProfilePhase, ProfileSummary, TrainingProfiler, HParamSpace, HParamTuner, HParamValue, SearchStrategy, TrialResult};
§ Modules
- activations - Activation functions for neural networks
- activations_minimal - Minimal activation functions without Layer trait dependencies
- autograd - Automatic differentiation module for neural networks
- callbacks - Callback system for neural network training
- data - Data loading and processing utilities for neural networks
- distillation - Knowledge distillation utilities for neural networks
- error - Error types for the neural network module
- layers - Neural network layers implementation
- linalg - Neural network specific linear algebra operations
- losses - Loss functions for neural networks
- models - Neural network model implementations
- optimizers - Neural network optimizers
- prelude - Prelude module with core functionality
- quantization - Quantization support for neural networks
- serialization - Module for model serialization and deserialization
- tensor_ops - Tensor operations for neural network building blocks
- training - Training utilities and infrastructure
- transformer - Transformer models implementation
- utils - Utility functions for neural networks
- visualization - Visualization tools for neural networks