Training scaffolds: loss wiring, schedules, callbacks.
Version: 0.1.0-beta.1 | Status: Production Ready
This crate provides comprehensive training infrastructure for Tensorlogic models:
- Loss functions (standard and logical constraint-based)
- Optimizer wrappers around SciRS2
- Training loops with callbacks
- Batch management
- Validation and metrics
- Regularization techniques
- Data augmentation
- Logging and monitoring
- Curriculum learning strategies
- Transfer learning utilities
- Hyperparameter optimization (grid search, random search)
- Cross-validation utilities
- Model ensembling
- Model pruning and compression
- Model quantization (int8, int4, int2)
- Mixed precision training (FP16, BF16)
- Advanced sampling strategies
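The pieces above (losses, optimizers, schedulers, callbacks) are exposed through traits such as `Loss`, `Optimizer`, `LrScheduler`, and `Callback` (see the Traits section below). The sketch that follows shows how such pieces typically wire together in a training loop. It is a minimal, self-contained illustration: the trait methods and the concrete types (`SquaredError`, `PlainSgd`, `PrintProgress`) are assumptions made for this example and are not this crate's actual signatures.

```rust
// A minimal wiring sketch. The trait methods and concrete types here are
// assumptions made for illustration; they mirror the crate's Loss/Optimizer/
// Callback concepts but are NOT its actual API.
trait Loss {
    fn compute(&self, predictions: &[f64], targets: &[f64]) -> f64;
}

trait Optimizer {
    fn step(&mut self, params: &mut [f64], grads: &[f64], lr: f64);
}

trait Callback {
    fn on_epoch_end(&mut self, epoch: usize, loss: f64);
}

struct SquaredError;
impl Loss for SquaredError {
    fn compute(&self, predictions: &[f64], targets: &[f64]) -> f64 {
        predictions
            .iter()
            .zip(targets)
            .map(|(p, t)| (p - t).powi(2))
            .sum::<f64>()
            / predictions.len() as f64
    }
}

struct PlainSgd;
impl Optimizer for PlainSgd {
    fn step(&mut self, params: &mut [f64], grads: &[f64], lr: f64) {
        for (p, g) in params.iter_mut().zip(grads) {
            *p -= lr * g;
        }
    }
}

struct PrintProgress;
impl Callback for PrintProgress {
    fn on_epoch_end(&mut self, epoch: usize, loss: f64) {
        println!("epoch {epoch}: loss = {loss:.4}");
    }
}

fn main() {
    // Toy problem: fit y = 2x with a single weight.
    let xs = [1.0, 2.0, 3.0];
    let ys = [2.0, 4.0, 6.0];
    let mut w = [0.0_f64];

    let loss_fn = SquaredError;
    let mut opt = PlainSgd;
    let mut callbacks: Vec<Box<dyn Callback>> = vec![Box::new(PrintProgress)];

    for epoch in 0..20 {
        // Forward pass and the analytic MSE gradient w.r.t. the single weight.
        let preds: Vec<f64> = xs.iter().map(|x| w[0] * x).collect();
        let loss = loss_fn.compute(&preds, &ys);
        let grad: f64 = xs
            .iter()
            .zip(&ys)
            .map(|(x, y)| 2.0 * (w[0] * x - y) * x)
            .sum::<f64>()
            / xs.len() as f64;

        opt.step(&mut w, &[grad], 0.05);
        for cb in callbacks.iter_mut() {
            cb.on_epoch_end(epoch, loss);
        }
    }
}
```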
Structs
- Accuracy - Accuracy metric for classification.
- AdaBeliefOptimizer - AdaBelief optimizer (NeurIPS 2020).
- AdaMaxOptimizer - AdaMax optimizer (variant of Adam with infinity norm).
- AdagradOptimizer - Adagrad optimizer (Adaptive Gradient).
- AdamOptimizer - Adam optimizer.
- AdamPOptimizer - AdamP optimizer with projection-based weight decay.
- AdamWOptimizer - AdamW optimizer (Adam with decoupled weight decay).
- AttentionTransferLoss - Attention transfer for distillation based on attention maps.
- AutocastContext - Automatic Mixed Precision (AMP) context manager.
- AveragingEnsemble - Averaging ensemble for regression.
- BCEWithLogitsLoss - Binary cross-entropy with logits loss (numerically stable).
- BaggingHelper - Bagging (Bootstrap Aggregating) utilities.
- BalancedAccuracy - Balanced accuracy metric. Average of recall per class, useful for imbalanced datasets.
- BatchCallback - Callback that logs batch progress.
- BatchConfig - Configuration for batch processing.
- BatchIterator - Iterator over batches of data.
- BatchReweighter - Batch reweighting based on sample importance.
- BayesianOptimization - Bayesian Optimization for hyperparameter tuning.
- CallbackList - List of callbacks to execute in order.
- CheckpointCallback - Callback for model checkpointing with auto-cleanup.
- ClassBalancedSampler - Class-balanced sampling for imbalanced datasets.
- CohensKappa - Cohen’s Kappa statistic. Measures inter-rater agreement, accounting for chance agreement. Ranges from -1 to +1, where 1 is perfect agreement and 0 is random chance.
- CompetenceCurriculum - Competence-based curriculum: adapts to the model’s current competence level.
- CompositeAugmenter - Composite augmenter that applies multiple augmentations sequentially.
- CompositeRegularization - Composite regularization that combines multiple regularizers.
- ConfusionMatrix - Confusion matrix for multi-class classification.
- ConsoleLogger - Console logger that outputs to stdout.
- ConstraintViolationLoss - Constraint violation loss; penalizes constraint violations.
- ContrastiveLoss - Contrastive loss for metric learning. Used to learn embeddings where similar pairs are close and dissimilar pairs are far apart.
- CosineAnnealingLrScheduler - Cosine annealing learning rate scheduler. Anneals the learning rate using a cosine schedule.
- CrossEntropyLoss - Cross-entropy loss for classification.
- CrossValidationResults - Cross-validation result aggregator.
- CsvLoader - CSV data loader.
- CsvLogger - CSV logger for easy data analysis.
- CurriculumManager - Manager for curriculum learning that tracks training progress.
- CurriculumSampler - Curriculum sampling for progressive difficulty.
- CutMixAugmenter - CutMix augmentation (ICCV 2019).
- CutOutAugmenter - CutOut augmentation.
- CyclicLrScheduler - Cyclic learning rate scheduler.
- DataPreprocessor - Data preprocessor for normalization and standardization.
- DataShuffler - Data shuffler for randomizing training data.
- Dataset - Dataset container for training data.
- DiceCoefficient - Dice Coefficient metric (F1 score variant for segmentation).
- DiceLoss - Dice loss for segmentation tasks.
- DiscriminativeFineTuning - Discriminative fine-tuning: use different learning rates for different layers.
- DistillationLoss - Knowledge distillation loss that combines student predictions with teacher soft targets.
- DropBlock - DropBlock regularization.
- DropPath - DropPath (Stochastic Depth) regularization.
- DynamicRangeCalibrator - Dynamic range calibration for post-training quantization.
- EarlyStoppingCallback - Callback for early stopping based on validation loss.
- ElasticNetRegularization - Elastic Net regularization (combination of L1 and L2).
- EpisodeSampler - Episode sampler for N-way K-shot tasks.
- EpochCallback - Callback that logs training progress.
- ExpectedCalibrationError - Expected Calibration Error (ECE) metric.
- ExponentialCurriculum - Exponential curriculum: exponentially increase the sample percentage.
- ExponentialLrScheduler - Exponential learning rate scheduler. Decreases the learning rate by a factor of gamma every epoch.
- ExponentialStochasticDepth - Exponential stochastic depth scheduler.
- F1Score - F1 score metric for classification.
- FeatureDistillationLoss - Feature-based distillation that matches intermediate layer representations.
- FeatureExtractorMode - Feature extraction mode: freeze the entire feature extractor.
- FewShotAccuracy - Few-shot accuracy evaluator.
- FileLogger - File logger that writes logs to a file.
- FocalLoss - Focal loss for addressing class imbalance. Reference: Lin et al., “Focal Loss for Dense Object Detection”. A worked sketch of the formula appears after this list.
- FocalSampler - Focal sampling strategy.
- GaussianProcess - Gaussian Process regressor for Bayesian Optimization.
- GcConfig - Configuration for gradient centralization.
- GcStats - Statistics for gradient centralization.
- GlobalPruner - Global pruning across multiple layers.
- GradientAccumulationCallback - Gradient accumulation callback with advanced features.
- GradientAccumulationStats - Statistics for gradient accumulation.
- GradientCentralization - Gradient Centralization optimizer wrapper.
- GradientCheckpointConfig - Gradient checkpointing configuration.
- GradientMonitor - Gradient flow monitor for tracking gradient statistics during training.
- GradientPruner - Gradient-based pruning (prune weights with the smallest gradients).
- GradientScaler - Gradient scaler for automatic mixed precision.
- GradientStats - Gradient statistics for monitoring gradient flow.
- GradientSummary - Summary of gradient statistics.
- GridSearch - Grid search strategy for hyperparameter optimization.
- GroupLassoRegularization - Group Lasso regularization.
- HardNegativeMiner - Hard negative mining for handling imbalanced datasets.
- HingeLoss - Hinge loss for maximum-margin classification (SVM-style).
- HistogramCallback - Callback for tracking weight histograms during training.
- HistogramStats - Weight histogram statistics for debugging and monitoring.
- HuberLoss - Huber loss for robust regression.
- HyperparamResult - Result of a hyperparameter evaluation.
- ImportanceSampler - Importance sampling based on sample scores.
- IoU - Intersection over Union (IoU) metric for segmentation tasks.
- JsonlLogger - JSONL (JSON Lines) logger for machine-readable output.
- KFold - K-fold cross-validation.
- KLDivergenceLoss - Kullback-Leibler Divergence loss. Measures how one probability distribution diverges from a reference distribution.
- L1Regularization - L1 regularization (Lasso).
- L2Regularization - L2 regularization (Ridge / Weight Decay).
- LabelEncoder - Label encoder for converting string labels to integers.
- LabelSmoothingLoss - Label smoothing cross-entropy loss.
- LambOptimizer - LAMB optimizer (Layer-wise Adaptive Moments optimizer for Batch training). Designed for large-batch training; uses layer-wise adaptation.
- LarsOptimizer - LARS optimizer (Layer-wise Adaptive Rate Scaling).
- LayerFreezingConfig - Layer freezing configuration for transfer learning.
- LayerPruningStats - Pruning statistics for a single layer.
- LearningRateFinder - Learning rate finder callback using the LR range test.
- LeaveOneOut - Leave-one-out cross-validation.
- LinearCurriculum - Linear curriculum: gradually increase the percentage of samples used.
- LinearDropBlockScheduler - Linear DropBlock scheduler.
- LinearModel - A simple linear model for testing and demonstration.
- LinearStochasticDepth - Linear stochastic depth scheduler.
- LionConfig - Lion optimizer configuration.
- LionOptimizer - Lion optimizer.
- LogicalLoss - Logical loss combining multiple objectives.
- LookaheadOptimizer - Lookahead optimizer (wrapper that uses slow and fast weights).
- LossConfig - Configuration for loss functions.
- LrRangeTestAnalyzer - Learning rate range test analyzer for finding optimal learning rates.
- MAML - MAML (Model-Agnostic Meta-Learning) implementation.
- MAMLConfig - MAML (Model-Agnostic Meta-Learning) configuration.
- MagnitudePruner - Magnitude-based pruning (prune the smallest weights).
- MatchingNetwork - Matching network for few-shot learning.
- MatthewsCorrelationCoefficient - Matthews Correlation Coefficient (MCC) metric. Ranges from -1 to +1, where +1 is perfect prediction, 0 is random, and -1 is total disagreement. Particularly useful for imbalanced datasets.
- MaxNormRegularization - MaxNorm constraint regularizer.
- MaximumCalibrationError - Maximum Calibration Error (MCE) metric.
- MeanAveragePrecision - Mean Average Precision (mAP) metric for object detection and retrieval.
- MeanIoU - Mean Intersection over Union (mIoU) metric for multi-class segmentation.
- MemoryBudgetManager - Memory budget manager for training.
- MemoryEfficientTraining - Memory-efficient training utilities.
- MemoryProfilerCallback - Memory profiler callback for tracking memory usage during training.
- MemorySettings - Recommended memory settings.
- MemoryStats - Memory statistics for a training session.
- MetaStats - Meta-learning statistics tracker.
- MetaTask - Meta-learning task representation.
- MetricTracker - Metric tracker for managing multiple metrics.
- MetricsLogger - Metrics logger that aggregates and logs training metrics.
- MixedPrecisionStats - Statistics for mixed precision training.
- MixedPrecisionTrainer - Mixed precision training manager.
- MixupAugmenter - Mixup augmentation.
- MixupLoss - Mixup data augmentation that mixes training examples and their labels.
- ModelEMACallback - Model EMA (Exponential Moving Average) callback.
- ModelSoup - Model Soup: weight-space averaging for improved generalization.
- ModelSummary - Model summary containing layer-wise parameter information.
- MseLoss - Mean squared error loss for regression.
- MultiStepLrScheduler - Multi-step learning rate scheduler.
- MultiTaskLoss - Multi-task loss that combines multiple losses with configurable weighting.
- NAdamOptimizer - NAdam optimizer (Nesterov-accelerated Adam).
- NoAugmentation - No augmentation (identity transformation).
- NoamScheduler - Noam scheduler (Transformer learning rate schedule).
- NoiseAugmenter - Gaussian noise augmentation.
- NormalizedDiscountedCumulativeGain - Normalized Discounted Cumulative Gain (NDCG) metric for ranking.
- OneCycleLrScheduler - One-cycle learning rate scheduler. Increases the LR from initial to max, then decreases to min.
- OneHotEncoder - One-hot encoder for categorical data.
- OnlineHardExampleMiner - Online hard example mining during training.
- OptimizerConfig - Configuration for optimizers.
- OrthogonalRegularization - Orthogonal regularization.
- PCGrad - PCGrad: project conflicting gradients for multi-task learning.
- ParameterDifference - Statistics about parameter differences between two models.
- ParameterStats - Model parameter statistics for a single layer or the entire model.
- PerClassMetrics - Per-class metrics report.
- PolyLoss - Poly Loss: polynomial expansion of the cross-entropy loss.
- PolynomialDecayLrScheduler - Polynomial decay learning rate scheduler.
- Precision - Precision metric for classification.
- ProdigyConfig - Configuration for the Prodigy optimizer.
- ProdigyOptimizer - Prodigy optimizer.
- ProfilingCallback - Callback for profiling training performance.
- ProfilingStats - Performance profiling statistics.
- ProgressiveUnfreezing - Progressive unfreezing strategy for transfer learning.
- PrototypicalDistance - Prototypical distance calculator for few-shot learning.
- PruningConfig - Configuration for pruning strategies.
- PruningStats - Statistics about a pruned model.
- QuantizationAwareTraining - Quantization-aware training (QAT) utilities.
- QuantizationConfig - Configuration for quantization.
- QuantizationParams - Quantization parameters (scale and zero-point); see the quantization sketch after this list.
- QuantizedTensor - Quantized tensor representation.
- Quantizer - Main quantizer for model compression.
- RAdamOptimizer - RAdam optimizer (Rectified Adam) with variance warmup (ICLR 2020).
- RMSpropOptimizer - RMSprop optimizer (Root Mean Square Propagation).
- RandomErasingAugmenter - Random Erasing augmentation.
- RandomSearch - Random search strategy for hyperparameter optimization.
- Recall - Recall metric for classification.
- ReduceLROnPlateauScheduler - Reduce learning rate on plateau (metric-based adaptive scheduler).
- ReduceLrOnPlateauCallback - Callback for learning rate reduction on plateau.
- Reptile - Reptile meta-learning algorithm.
- ReptileConfig - Reptile algorithm configuration.
- RocCurve - ROC curve and AUC computation utilities.
- RotationAugmenter - Rotation augmentation (placeholder for future implementation).
- RuleSatisfactionLoss - Rule satisfaction loss; measures how well rules are satisfied.
- SWACallback - SWA (Stochastic Weight Averaging) callback.
- SamOptimizer - SAM optimizer (Sharpness-Aware Minimization).
- ScaleAugmenter - Scale augmentation.
- ScheduleFreeAdamW - Schedule-free AdamW optimizer.
- ScheduleFreeConfig - Configuration for schedule-free optimizers.
- SelfPacedCurriculum - Self-paced learning: the model determines its own learning pace.
- SgdOptimizer - SGD optimizer with momentum.
- SgdrScheduler - SGDR: Stochastic Gradient Descent with Warm Restarts scheduler.
- SophiaConfig - Configuration for the Sophia optimizer with additional Sophia-specific parameters.
- SophiaOptimizer - Sophia optimizer: second-order optimizer with Hessian diagonal estimation.
- SpectralNormalization - Spectral Normalization regularizer.
- StackingEnsemble - Stacking ensemble with a meta-learner.
- StepLrScheduler - Step-based learning rate scheduler. Decreases the learning rate by a factor every `step_size` epochs; see the schedule sketch after this list.
- StratifiedKFold - Stratified K-fold cross-validation.
- StructuredPruner - Structured pruning (remove entire neurons/channels/filters).
- SupportSet - Support set for few-shot learning.
- TaskCurriculum - Task-level curriculum for multi-task learning.
- TensorBoardLogger - TensorBoard logger that writes real event files.
- TimeEstimator - Training time estimation based on iteration timing.
- TimeSeriesSplit - Time series split for temporal data.
- TopKAccuracy - Top-K accuracy metric. Measures whether the correct class is in the top K predictions.
- Trainer - Main trainer for model training.
- TrainerConfig - Configuration for training.
- TrainingCheckpoint - Comprehensive checkpoint data structure.
- TrainingHistory - Training history containing losses and metrics.
- TrainingState - Training state passed to callbacks.
- TransferLearningManager - Transfer learning strategy manager.
- TripletLoss - Triplet loss for metric learning. Learns embeddings where anchor-positive distance < anchor-negative distance + margin.
- TverskyLoss - Tversky loss (generalization of Dice loss). Useful for handling class imbalance in segmentation.
- ValidationCallback - Callback for validation during training.
- VotingEnsemble - Voting ensemble configuration.
- WarmupCosineLrScheduler - Warmup with cosine annealing scheduler.
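The FocalLoss entry above references Lin et al.; the binary form of the loss is FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the true class. The sketch below illustrates that formula only; the function name and parameters are chosen for the example and are not the crate's `FocalLoss` API.

```rust
// Illustrative sketch of the binary focal loss (Lin et al.); not the crate's
// `FocalLoss` implementation. `alpha` balances classes, `gamma` down-weights
// easy examples.
fn focal_loss(prob_positive: f64, target: f64, alpha: f64, gamma: f64) -> f64 {
    // p_t is the model's probability for the true class.
    let (p_t, alpha_t) = if target >= 0.5 {
        (prob_positive, alpha)
    } else {
        (1.0 - prob_positive, 1.0 - alpha)
    };
    let p_t = p_t.clamp(1e-12, 1.0); // numerical safety
    -alpha_t * (1.0 - p_t).powf(gamma) * p_t.ln()
}

fn main() {
    // An easy, well-classified example contributes almost nothing...
    println!("easy: {:.6}", focal_loss(0.95, 1.0, 0.25, 2.0));
    // ...while a hard, misclassified one dominates the loss.
    println!("hard: {:.6}", focal_loss(0.10, 1.0, 0.25, 2.0));
}
```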
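StepLrScheduler and ExponentialLrScheduler above describe their schedules in words; the closed-form updates are short enough to show directly. The helper functions below are illustrative stand-ins for those schedules, not the crate's scheduler API.

```rust
// Sketch of the step and exponential schedules described above; function names
// and signatures are illustrative, not the crate's API.

/// Step decay: multiply the base LR by `gamma` once every `step_size` epochs.
fn step_lr(base_lr: f64, gamma: f64, step_size: usize, epoch: usize) -> f64 {
    base_lr * gamma.powi((epoch / step_size) as i32)
}

/// Exponential decay: multiply the base LR by `gamma` every epoch.
fn exponential_lr(base_lr: f64, gamma: f64, epoch: usize) -> f64 {
    base_lr * gamma.powi(epoch as i32)
}

fn main() {
    for epoch in 0..10 {
        println!(
            "epoch {epoch}: step = {:.5}, exp = {:.5}",
            step_lr(0.1, 0.5, 3, epoch),
            exponential_lr(0.1, 0.9, epoch)
        );
    }
}
```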
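QuantizationParams above stores a scale and zero-point. The sketch below shows the textbook asymmetric (affine) int8 formulation of those two values and the resulting quantize/dequantize round trip; it is a conceptual illustration, not necessarily how `Quantizer` or `QuantizationParams` compute them in this crate.

```rust
// Sketch of standard asymmetric (affine) int8 quantization: the kind of
// scale/zero-point pair `QuantizationParams` describes. Textbook formulation,
// not necessarily this crate's exact implementation.
fn quantization_params(min: f32, max: f32) -> (f32, i32) {
    let (qmin, qmax) = (-128.0_f32, 127.0_f32);
    // Make sure zero is representable so that padding/zeros quantize exactly.
    let min = min.min(0.0);
    let max = max.max(0.0);
    // Guard against a degenerate all-zero range.
    let scale = ((max - min) / (qmax - qmin)).max(f32::EPSILON);
    let zero_point = (qmin - min / scale).round().clamp(qmin, qmax) as i32;
    (scale, zero_point)
}

fn quantize(x: f32, scale: f32, zero_point: i32) -> i8 {
    ((x / scale).round() as i32 + zero_point).clamp(-128, 127) as i8
}

fn dequantize(q: i8, scale: f32, zero_point: i32) -> f32 {
    (q as i32 - zero_point) as f32 * scale
}

fn main() {
    let (scale, zp) = quantization_params(-1.0, 3.0);
    let x = 0.7_f32;
    let q = quantize(x, scale, zp);
    println!("x = {x}, q = {q}, round-trip = {:.4}", dequantize(q, scale, zp));
}
```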
Enums
- AcquisitionFunction - Acquisition function type for Bayesian Optimization.
- BitWidth - Bit-width for quantization.
- CheckpointCompression - Compression method for checkpoints.
- CheckpointStrategy - Gradient checkpointing strategy.
- CyclicLrMode - Cyclic learning rate mode.
- DistanceMetric - Distance metric for few-shot learning.
- GcStrategy - Gradient centralization strategy.
- GpKernel - Gaussian Process kernel for Bayesian Optimization.
- GradClipMode - Gradient clipping mode.
- GradientScalingStrategy - Gradient scaling strategy for accumulation.
- Granularity - Quantization granularity (per-tensor or per-channel).
- HyperparamSpace - Hyperparameter space definition.
- HyperparamValue - Hyperparameter value type.
- LossScaler - Loss scaling strategy for mixed precision training; see the loss-scaling sketch after this list.
- MiningStrategy - Strategy for mining hard examples.
- PlateauMode - Mode for the ReduceLROnPlateau scheduler.
- PrecisionMode - Precision mode for mixed precision training.
- PreprocessingMethod - Preprocessing method.
- QuantizationMode - Quantization mode; determines the quantization strategy.
- ReweightingStrategy - Strategy for reweighting samples.
- ShotType - Type of shot configuration for few-shot learning.
- SophiaVariant - Variant of the Sophia optimizer to use.
- SoupRecipe - Recipe for creating model soups.
- StructuredPruningAxis - Axis for structured pruning.
- TaskWeightingStrategy - Strategy for weighting multiple tasks.
- TrainError - Errors that can occur during training.
- VotingMode - Voting mode for classification ensembles.
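LossScaler (together with the GradientScaler struct above) covers loss scaling for mixed precision training. The sketch below shows the common dynamic loss-scaling recipe: scale the loss before backprop, unscale the gradients, skip the step and back off on overflow, and grow the scale after a streak of clean steps. The struct, its fields, and the default constants are illustrative assumptions, not this crate's implementation.

```rust
// Sketch of dynamic loss scaling as commonly used in mixed precision training.
// Thresholds and growth factors are illustrative defaults, not values taken
// from this crate.
struct DynamicLossScaler {
    scale: f32,
    growth_factor: f32,
    backoff_factor: f32,
    growth_interval: u32,
    good_steps: u32,
}

impl DynamicLossScaler {
    fn new() -> Self {
        Self {
            scale: 65536.0,
            growth_factor: 2.0,
            backoff_factor: 0.5,
            growth_interval: 2000,
            good_steps: 0,
        }
    }

    /// Multiply the loss before backprop so small FP16 gradients don't underflow.
    fn scale_loss(&self, loss: f32) -> f32 {
        loss * self.scale
    }

    /// Unscale gradients and decide whether to apply the step.
    /// Returns `None` (skip the step) if any gradient overflowed.
    fn unscale_and_check(&mut self, grads: &mut [f32]) -> Option<()> {
        let overflow = grads.iter().any(|g| !g.is_finite());
        if overflow {
            self.scale *= self.backoff_factor; // back off and retry next step
            self.good_steps = 0;
            return None;
        }
        for g in grads.iter_mut() {
            *g /= self.scale;
        }
        self.good_steps += 1;
        if self.good_steps >= self.growth_interval {
            self.scale *= self.growth_factor; // grow after a streak of clean steps
            self.good_steps = 0;
        }
        Some(())
    }
}

fn main() {
    let mut scaler = DynamicLossScaler::new();
    let scaled = scaler.scale_loss(0.012);
    let mut grads = vec![scaled * 1e-4, scaled * -2e-4]; // toy scaled gradients
    if scaler.unscale_and_check(&mut grads).is_some() {
        println!("apply optimizer step with grads {:?}", grads);
    } else {
        println!("overflow detected, step skipped");
    }
}
```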
Traits
- AutodiffModel - Trait for models that support automatic differentiation via scirs2-autograd.
- Callback - Trait for training callbacks.
- CrossValidationSplit - Trait for cross-validation splitting strategies.
- CurriculumStrategy - Trait for curriculum learning strategies.
- DataAugmenter - Trait for data augmentation strategies.
- DynamicModel - Trait for models with dynamic computation graphs.
- Ensemble - Trait for ensemble methods.
- LoggingBackend - Trait for logging backends.
- Loss - Trait for loss functions.
- LrScheduler - Trait for learning rate schedulers.
- MetaLearner - Meta-learner trait for different meta-learning algorithms.
- Metric - Trait for metrics.
- Model - Trait for trainable models.
- Optimizer - Trait for optimizers.
- Pruner - Trait for pruning strategies.
- Regularizer - Trait for regularization strategies.
Functions
- compare_models - Compare two models and report differences in parameters.
- compute_gradient_stats - Compute gradient statistics for all layers in a gradient dictionary; see the sketch after this list.
- extract_batch - Extract batches from data arrays.
- format_duration - Format a duration in seconds to a human-readable string.
- print_gradient_report - Print a formatted report of gradient statistics.
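compute_gradient_stats takes a gradient dictionary and produces per-layer statistics (see also GradientStats and GradientMonitor above). The sketch below shows the kind of computation involved, using a plain `HashMap`; the field names and the function signature are assumptions for illustration, not the crate's API.

```rust
use std::collections::HashMap;

// Sketch of per-layer gradient statistics of the kind `compute_gradient_stats`
// reports; the struct and field names here are illustrative.
#[derive(Debug)]
struct LayerGradStats {
    mean: f64,
    std: f64,
    l2_norm: f64,
    max_abs: f64,
}

fn gradient_stats(grads: &HashMap<String, Vec<f64>>) -> HashMap<String, LayerGradStats> {
    grads
        .iter()
        .map(|(layer, g)| {
            let n = g.len().max(1) as f64;
            let mean = g.iter().sum::<f64>() / n;
            let var = g.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
            let l2_norm = g.iter().map(|x| x * x).sum::<f64>().sqrt();
            let max_abs = g.iter().fold(0.0_f64, |m, x| m.max(x.abs()));
            (layer.clone(), LayerGradStats { mean, std: var.sqrt(), l2_norm, max_abs })
        })
        .collect()
}

fn main() {
    let mut grads = HashMap::new();
    grads.insert("layer1".to_string(), vec![0.1, -0.2, 0.05]);
    grads.insert("layer2".to_string(), vec![1e-7, -3e-8, 2e-7]); // possibly vanishing
    for (layer, stats) in gradient_stats(&grads) {
        println!("{layer}: {stats:?}");
    }
}
```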
Type Aliases
- HyperparamConfig - Hyperparameter configuration (a single point in parameter space).
- PruningMask - Pruning mask indicating which weights are kept (1.0) or removed (0.0); see the sketch below.
- TrainResult - Result type for training operations.
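A PruningMask marks kept weights with 1.0 and removed weights with 0.0. The sketch below builds such a mask with simple magnitude-based pruning (the idea behind MagnitudePruner); the function is illustrative only, not the crate's implementation.

```rust
// Sketch of building a magnitude-based pruning mask (1.0 = keep, 0.0 = prune)
// for a target sparsity; illustrative only, not the crate's `MagnitudePruner`.
fn magnitude_pruning_mask(weights: &[f64], sparsity: f64) -> Vec<f64> {
    let mut magnitudes: Vec<f64> = weights.iter().map(|w| w.abs()).collect();
    magnitudes.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // Everything below the sparsity quantile of |w| gets pruned.
    let cutoff_index = ((weights.len() as f64) * sparsity).floor() as usize;
    let threshold = magnitudes
        .get(cutoff_index)
        .copied()
        .unwrap_or(f64::INFINITY);
    weights
        .iter()
        .map(|w| if w.abs() >= threshold { 1.0 } else { 0.0 })
        .collect()
}

fn main() {
    let weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02];
    let mask = magnitude_pruning_mask(&weights, 0.5); // prune ~50% of weights
    println!("{mask:?}"); // expected: [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
}
```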