Hextral
A high-performance neural network library for Rust with a clean async-first API, advanced activation functions, multiple optimizers, early stopping, and checkpointing capabilities.
Features
Core Architecture
- Multi-layer perceptrons with configurable hidden layers
- Batch normalization for improved training stability and convergence
- Xavier weight initialization for stable gradient flow
- Flexible network topology - specify any number of hidden layers and neurons
- Clean async-first API with intelligent yielding for non-blocking operations
Activation Functions (10 Available)
- ReLU - Rectified Linear Unit (good for most cases)
- Sigmoid - Smooth activation for binary classification
- Tanh - Hyperbolic tangent for centered outputs
- Leaky ReLU - Prevents dying ReLU problem
- ELU - Exponential Linear Unit for smoother gradients
- Linear - For regression output layers
- Swish - Modern activation with smooth derivatives
- GELU - Gaussian Error Linear Unit used in transformers
- Mish - Self-regularizing activation function
- Quaternion - Quaternion-based normalization for 4D data
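For reference, Swish, GELU, and Mish follow standard formulations. The sketch below is illustrative math only, not the crate's source:

```rust
// Standard formulations of the modern activations listed above
// (illustrative only, not Hextral's implementation).
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

fn swish(x: f64) -> f64 {
    x * sigmoid(x) // Swish: x * sigmoid(x)
}

fn gelu(x: f64) -> f64 {
    // GELU, tanh approximation
    0.5 * x * (1.0 + ((2.0 / std::f64::consts::PI).sqrt() * (x + 0.044715 * x.powi(3))).tanh())
}

fn mish(x: f64) -> f64 {
    // Mish: x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x)
    x * x.exp().ln_1p().tanh()
}
```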
Loss Functions (5 Available)
- Mean Squared Error (MSE) - Standard regression loss
- Mean Absolute Error (MAE) - Robust to outliers
- Binary Cross-Entropy - Binary classification
- Categorical Cross-Entropy - Multi-class classification
- Huber Loss - Robust hybrid of MSE and MAE
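Huber loss blends the two regression losses: quadratic near zero like MSE, linear for large errors like MAE. A minimal standalone sketch (not the crate's code), with `delta` as the switch-over threshold:

```rust
// Illustrative Huber loss for a single error term (not Hextral's implementation).
// Quadratic for |error| <= delta, linear beyond it.
fn huber(error: f64, delta: f64) -> f64 {
    if error.abs() <= delta {
        0.5 * error * error
    } else {
        delta * (error.abs() - 0.5 * delta)
    }
}
```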
Optimization Algorithms (12 Available)
- Adam - Adaptive moment estimation (recommended for most cases)
- AdamW - Adam with decoupled weight decay
- NAdam - Nesterov-accelerated Adam
- AdaBelief - Adapting stepsizes by belief in observed gradients
- Lion - Evolved sign momentum optimizer
- SGD - Stochastic Gradient Descent (simple and reliable)
- SGD with Momentum - Accelerated gradient descent
- RMSprop - Root mean square propagation
- AdaGrad - Adaptive gradient algorithm
- AdaDelta - Extension of AdaGrad
- LBFGS - Limited-memory BFGS (quasi-Newton method)
- Ranger - Combination of RAdam and LookAhead
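For intuition, the difference between plain SGD and SGD with momentum comes down to two one-line update rules. The snippet below is a standalone illustration of those rules, not the crate's optimizer code:

```rust
// Illustrative parameter update rules (not Hextral's implementation):
// SGD:               w <- w - lr * grad
// SGD with momentum: v <- m * v + grad;  w <- w - lr * v
fn sgd_step(w: &mut f64, grad: f64, lr: f64) {
    *w -= lr * grad;
}

fn sgd_momentum_step(w: &mut f64, v: &mut f64, grad: f64, lr: f64, momentum: f64) {
    *v = momentum * *v + grad;
    *w -= lr * *v;
}
```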
Advanced Training Features
- Early Stopping - Automatic training termination based on validation loss
- Checkpointing - Save and restore model weights with bincode serialization
- Regularization - L1/L2 regularization and dropout support
- Batch Training - Configurable batch sizes for memory efficiency
- Training Progress Tracking - Loss history and validation monitoring
- Dual sync/async API for both blocking and non-blocking operations
Async/Concurrent Processing
- Async training methods with cooperative multitasking
- Parallel batch prediction using futures
- Intelligent yielding - only yields for large workloads (>1000 elements)
- Concurrent activation function processing
- Performance-optimized async implementation alongside synchronous methods
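The yielding strategy can be pictured with a small sketch. This is an assumed pattern, not the crate's actual code: work proceeds synchronously for small inputs and periodically hands control back to the Tokio executor for large ones.

```rust
use tokio::task;

// Assumed shape of the "intelligent yielding" pattern: yield to the executor
// only when the workload is large enough for the await overhead to pay off.
async fn sum_of_squares(values: &[f64]) -> f64 {
    let large = values.len() > 1000;
    let mut acc = 0.0;
    for (i, v) in values.iter().enumerate() {
        acc += v * v;
        if large && i > 0 && i % 1000 == 0 {
            task::yield_now().await; // cooperative multitasking point
        }
    }
    acc
}
```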
Quick Start
Add this to your Cargo.toml:
```toml
[dependencies]
hextral = "0.7.0"
nalgebra = "0.33"
tokio = { version = "1.0", features = ["full"] } # For async features
```
Basic Async Usage (Recommended)
The example below is a minimal sketch; the constructor arguments and training parameters shown are assumptions, so check the crate documentation for the exact signatures.

```rust
use hextral::{ActivationFunction, Hextral, Optimizer};
use nalgebra::DVector;

#[tokio::main]
async fn main() {
    // Constructor and method signatures are illustrative; consult the crate docs.
    let mut nn = Hextral::new(2, &[4, 3], 1, ActivationFunction::ReLU, Optimizer::Adam { learning_rate: 0.01 });

    let inputs = vec![
        DVector::from_vec(vec![0.0, 1.0]),
        DVector::from_vec(vec![1.0, 0.0]),
    ];
    let targets = vec![DVector::from_vec(vec![1.0]), DVector::from_vec(vec![0.0])];

    // Async training with cooperative yielding
    let loss_history = nn.train(&inputs, &targets, /* epochs */ 100, /* batch size */ 2).await;
    println!("recorded {} loss values", loss_history.len());

    // Async prediction
    let prediction = nn.predict(&inputs[0]).await;
    println!("prediction: {:?}", prediction);
}
```
Advanced Features
This sketch pulls several advanced features together; the configuration values, field names, and the `train_with` call at the end are assumptions standing in for the crate's actual signatures.

```rust
use hextral::*;
use nalgebra::DVector;

#[tokio::main]
async fn main() {
    // Signatures and field names are illustrative; see the crate docs.
    let mut nn = Hextral::new(2, &[8, 8], 1, ActivationFunction::GELU, Optimizer::Adam { learning_rate: 0.001 });

    // Regularization and batch normalization
    nn.set_regularization(Regularization::L2(0.01));
    nn.enable_batch_norm();

    // Early stopping (patience, minimum improvement -- argument order assumed)
    let early_stop = EarlyStopping::new(10, 1e-4);

    // Checkpointing (path is illustrative)
    let checkpoint = CheckpointConfig::new("model.ckpt")
        .save_every(10)
        .save_best(true);

    // `train_with` is a hypothetical name standing in for the crate's
    // training entry point that accepts these configurations.
    let inputs = vec![DVector::from_vec(vec![0.0, 1.0]), DVector::from_vec(vec![1.0, 0.0])];
    let targets = vec![DVector::from_vec(vec![1.0]), DVector::from_vec(vec![0.0])];
    let _history = nn
        .train_with(&inputs, &targets, 500, early_stop, checkpoint)
        .await;
}
```
- Scalable architecture - Ideal for web services and concurrent applications
- Parallel batch processing - Multiple predictions processed concurrently using futures
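The parallel batch processing described above can be approximated from the caller's side as follows. This is a sketch assuming `predict` is async and returns a `DVector<f64>`; `futures::future::join_all` drives the predictions concurrently.

```rust
use futures::future::join_all;
use hextral::Hextral;
use nalgebra::DVector;

// Sketch: fan out async predictions over a batch and await them together.
// Assumes an async `predict(&DVector<f64>)` method as described above.
async fn predict_batch_concurrently(nn: &Hextral, batch: &[DVector<f64>]) -> Vec<DVector<f64>> {
    join_all(batch.iter().map(|input| nn.predict(input))).await
}
```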
Loss Functions
Configure different loss functions for your specific task:
```rust
use hextral::{Hextral, LossFunction};

// Variant names below are illustrative; check the LossFunction enum for exact names.
let mut nn = Hextral::new(/* layers, activation, optimizer */);

// For regression tasks
nn.set_loss_function(LossFunction::MeanSquaredError);
nn.set_loss_function(LossFunction::MeanAbsoluteError);
nn.set_loss_function(LossFunction::Huber { delta: 1.0 });

// For classification tasks
nn.set_loss_function(LossFunction::BinaryCrossEntropy);
nn.set_loss_function(LossFunction::CategoricalCrossEntropy);
```
Batch Normalization
Enable batch normalization for improved training stability:
```rust
use hextral::Hextral;

// Signatures are illustrative; see the crate docs for the exact API.
let mut nn = Hextral::new(/* layers, activation, optimizer */);

// Enable batch normalization
nn.enable_batch_norm();

// Set training mode
nn.set_training_mode(true);

// Train your network...
let loss_history = nn.train(/* inputs, targets, epochs, batch size */).await;

// Switch to inference mode
nn.set_training_mode(false);

// Make predictions...
let prediction = nn.predict(/* input */).await;
```
Modern Activation Functions
Use state-of-the-art activation functions:
```rust
use hextral::{ActivationFunction, Hextral};

// Constructor arguments are elided; only the activation variants matter here.

// Swish activation (used in EfficientNet)
let mut nn = Hextral::new(/* layers */ ActivationFunction::Swish /* , optimizer */);

// GELU activation (used in BERT, GPT)
let mut nn = Hextral::new(/* layers */ ActivationFunction::GELU /* , optimizer */);

// Mish activation (self-regularizing)
let mut nn = Hextral::new(/* layers */ ActivationFunction::Mish /* , optimizer */);
```
Regularization
Prevent overfitting with built-in regularization techniques:
```rust
use hextral::{Hextral, Regularization};

// Variant shapes are illustrative; check the Regularization enum for exact forms.
let mut nn = Hextral::new(/* layers, activation, optimizer */);

// L2 regularization (Ridge)
nn.set_regularization(Regularization::L2(0.01));

// L1 regularization (Lasso)
nn.set_regularization(Regularization::L1(0.01));

// Dropout regularization
nn.set_regularization(Regularization::Dropout(0.5));
```
Different Optimizers
Choose the optimizer that works best for your problem:
```rust
use hextral::Optimizer;

// Field names and values are illustrative; see the Optimizer enum for exact variants.

// Adam: Good default choice, adaptive learning rates
let optimizer = Optimizer::Adam { learning_rate: 0.001 };

// SGD: Simple and interpretable
let optimizer = Optimizer::SGD { learning_rate: 0.01 };

// SGD with Momentum: Accelerated convergence
let optimizer = Optimizer::SGDMomentum { learning_rate: 0.01, momentum: 0.9 };
```
Network Introspection
Get insights into your network:
```rust
// Method names here are illustrative; see the crate docs for the exact API.

// Network architecture
println!("{:?}", nn.layer_sizes()); // [2, 4, 3, 1]

// Parameter count
println!("{}", nn.parameter_count()); // 25

// Save/load weights
let weights = nn.get_weights();
nn.set_weights(weights);
```
API Reference
Core Types
- Hextral - Main neural network struct with async-first API
- ActivationFunction - Enum for activation functions (10 available)
- Optimizer - Enum for optimization algorithms (12 available)
- Regularization - Enum for regularization techniques
- EarlyStopping - Configuration for automatic training termination
- CheckpointConfig - Configuration for model checkpointing
- LossFunction - Enum for loss functions (5 available)
Primary Methods (All Async)
```rust
// Argument lists are elided; see the crate docs for full signatures.

// Network creation
Hextral::new(/* layer sizes, activation function, optimizer */)

// Training with full feature set, plus evaluation and prediction (all async)
nn.train(/* inputs, targets, training options */).await
nn.evaluate(/* inputs, targets */).await
nn.predict(/* input */).await
nn.predict_batch(/* inputs */).await // parallel batch prediction; method name assumed
```
Configuration Methods
```rust
// Loss function and regularization
nn.set_loss_function(/* LossFunction variant */);
nn.set_regularization(/* Regularization variant */);

// Batch normalization
nn.enable_batch_norm();
nn.set_training_mode(/* true for training, false for inference */);
```
Early Stopping & Checkpointing
```rust
// Early stopping configuration (arguments assumed: patience, minimum improvement)
let early_stop = EarlyStopping::new(10, 1e-4);

// Checkpoint configuration (path is illustrative)
let checkpoint = CheckpointConfig::new("model.ckpt")
    .save_every(10)   // Save every N epochs
    .save_best(true); // Save best model based on validation loss
```
Performance Tips
- Use ReLU activation for hidden layers in most cases
- Start with Adam optimizer - it adapts learning rates automatically
- Apply L2 regularization if you see overfitting (test loss > train loss)
- Use dropout for large networks to prevent co-adaptation
- Normalize your input data to [0,1] or [-1,1] range for better training stability
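For the last tip, a minimal min-max scaling sketch using nalgebra (independent of Hextral's API):

```rust
use nalgebra::DVector;

// Scale a vector of raw values into the [0, 1] range (simple min-max normalization).
// The epsilon guard avoids division by zero for constant vectors.
fn normalize_to_unit_range(v: &DVector<f64>) -> DVector<f64> {
    let min = v.min();
    let max = v.max();
    let range = (max - min).max(f64::EPSILON);
    v.map(|x| (x - min) / range)
}
```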
Architecture Decisions
- Built on nalgebra for efficient linear algebra operations
- Xavier initialization for stable gradient flow from the start
- Proper error handling throughout the API
- Modular design allowing easy extension of activation functions and optimizers
- Zero-copy predictions where possible for performance
Contributing
We welcome contributions! For major changes, please open an issue first to discuss what you would like to change. Feel free to:
- Report bugs by opening an issue
- Suggest new features or improvements
- Submit pull requests with enhancements
- Improve documentation
- Add more test cases
Changelog
v0.7.0 (Latest)
- Removed Redundancy: Eliminated confusing duplicate methods and verbose naming patterns
- Better Performance: Streamlined async implementation with intelligent yielding
- Updated Documentation: All examples now use clean, consistent API
- All Tests Updated: Comprehensive test suite updated for new API patterns
v0.6.0
- Full Async/Await Support: Complete async API alongside synchronous methods
- Intelligent Yielding: Performance-optimized async with yielding only for large workloads (>1000 elements)
- Concurrent Processing: Parallel batch predictions using futures and join_all
- Async Training: Non-blocking training with cooperative multitasking
- Code Optimization: Removed verbose, AI-generated patterns for cleaner, more professional code
- Performance Improvements: Smart async yielding prevents unnecessary overhead
- Enhanced Documentation: Updated examples and API documentation
v0.5.1
- Improved Documentation: Enhanced README with comprehensive examples of all new features
- Better Crates.io Presentation: Updated documentation to properly showcase library capabilities
v0.5.0
- Major Feature Expansion: Added comprehensive loss functions, batch normalization, and modern activation functions
- 5 Loss Functions: MSE, MAE, Binary Cross-Entropy, Categorical Cross-Entropy, Huber Loss
- Batch Normalization: Full implementation with training/inference modes
- 3 New Activation Functions: Swish, GELU, Mish (total of 9 activation functions)
- Code Organization: Separated tests into dedicated files for cleaner library structure
- Enhanced API: Flexible loss function configuration and batch normalization controls
v0.4.0
- Complete rewrite with proper error handling and fixed implementations
- Implemented all documented features - train(), predict(), evaluate() methods
- Fixed critical bugs in batch normalization and backward pass
- Added regularization support - L1, L2, and Dropout
- Improved documentation with usage examples and API reference
License
This project is licensed under the MIT OR Apache-2.0 license.