Adam

Struct Adam 

pub struct Adam { /* private fields */ }

Adam optimizer for neural network parameter optimization

Implements the Adam optimization algorithm with a PyTorch-compatible interface. The optimizer maintains per-parameter state for first-moment (momentum) and second-moment (velocity) estimates, giving each parameter an adaptive learning rate that improves convergence across diverse architectures.

§Usage Pattern

The optimizer links parameters by tensor ID rather than holding references, which keeps parameter management flexible and thread-safe:

  • Parameters are linked to the optimizer via add_parameter or add_parameters
  • The step method takes mutable references to parameters, guaranteeing exclusive access during updates
  • Parameter states are maintained by tensor ID, allowing for dynamic parameter management
  • Supports serialization and deserialization with parameter re-linking

§Dynamic Parameter Management

Parameters can be added, removed, or re-linked at runtime:

  • add_parameter: Link a single parameter
  • add_parameters: Link multiple parameters at once
  • unlink_parameter: Remove parameter state by ID
  • clear_states: Remove all parameter states
  • is_parameter_linked: Check if a parameter is linked
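Internally this amounts to a table of per-parameter states keyed by tensor ID. A minimal self-contained sketch of that idea (the names `State`, `StateTable`, `link`, and `unlink` are hypothetical illustrations, not the crate's actual internals):

```rust
use std::collections::HashMap;

// Hypothetical per-parameter optimizer state.
#[derive(Default)]
struct State {
    m: Vec<f32>, // momentum (first-moment) buffer
    v: Vec<f32>, // velocity (second-moment) buffer
}

// States are keyed by tensor ID, so parameters can be linked,
// unlinked, and re-linked at runtime without holding references.
#[derive(Default)]
struct StateTable {
    states: HashMap<usize, State>,
}

impl StateTable {
    fn link(&mut self, id: usize, len: usize) {
        // Linking an already-linked ID keeps its accumulated state.
        self.states.entry(id).or_insert_with(|| State {
            m: vec![0.0; len],
            v: vec![0.0; len],
        });
    }
    fn unlink(&mut self, id: usize) -> bool {
        self.states.remove(&id).is_some()
    }
    fn is_linked(&self, id: usize) -> bool {
        self.states.contains_key(&id)
    }
    fn clear(&mut self) {
        self.states.clear();
    }
}

fn main() {
    let mut table = StateTable::default();
    table.link(42, 10);
    assert!(table.is_linked(42));
    assert!(table.unlink(42));
    assert!(!table.is_linked(42));
    table.clear();
}
```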

§Serialization Support

The optimizer supports full serialization and deserialization with state preservation:

  • Parameter states are saved with their shapes and insertion order for validation
  • After deserialization, use relink_parameters to restore saved states to new tensors
  • Parameters must be re-linked in the same order they were originally added
  • Shape validation ensures consistency between saved and current parameters
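The order and shape checks can be pictured as follows. `validate_relink` and its signature are hypothetical, illustrating the validation rules rather than the crate's actual relink_parameters implementation:

```rust
// Hypothetical sketch of shape/order validation during re-linking.
fn validate_relink(
    saved_shapes: &[Vec<usize>], // shapes recorded at save time, in insertion order
    new_shapes: &[Vec<usize>],   // shapes of tensors being re-linked, in link order
) -> Result<(), String> {
    if saved_shapes.len() != new_shapes.len() {
        return Err(format!(
            "expected {} parameters, got {}",
            saved_shapes.len(),
            new_shapes.len()
        ));
    }
    for (i, (saved, new)) in saved_shapes.iter().zip(new_shapes).enumerate() {
        if saved != new {
            return Err(format!(
                "shape mismatch at position {}: saved {:?}, new {:?}",
                i, saved, new
            ));
        }
    }
    Ok(())
}

fn main() {
    let saved = vec![vec![10, 5], vec![5]];
    // Same shapes in the same order: re-linking succeeds.
    assert!(validate_relink(&saved, &[vec![10, 5], vec![5]]).is_ok());
    // Swapped order fails shape validation, as the bullets above require.
    assert!(validate_relink(&saved, &[vec![5], vec![10, 5]]).is_err());
}
```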

§Features

  • ID-Based Parameter Linking: Dynamic parameter management via tensor IDs
  • Thread-Safe Step Method: Takes mutable references, guaranteeing exclusive access during updates
  • Per-Parameter State: Each parameter maintains its own momentum and velocity buffers
  • Bias Correction: Automatically corrects initialization bias in moment estimates
  • Weight Decay: Optional L2 regularization with efficient implementation
  • AMSGrad Support: Optional AMSGrad variant for improved convergence stability
  • SIMD Optimization: AVX2-optimized updates where the hardware supports them
  • Full Serialization: Complete state persistence and restoration
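For reference, the bias-corrected update that the momentum and velocity buffers feed into can be sketched per element as plain Rust (an illustration of the standard Adam algorithm with hypothetical names, not the crate's SIMD implementation):

```rust
// One Adam step over a flat parameter slice, with bias correction.
fn adam_step(
    param: &mut [f32],
    grad: &[f32],
    m: &mut [f32], // momentum (first-moment) buffer
    v: &mut [f32], // velocity (second-moment) buffer
    t: u32,        // 1-based step count
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
) {
    // Bias-correction factors approach 1 as t grows, offsetting the
    // zero initialization of the moment buffers.
    let bc1 = 1.0 - beta1.powi(t as i32);
    let bc2 = 1.0 - beta2.powi(t as i32);
    for i in 0..param.len() {
        m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
        v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
        let m_hat = m[i] / bc1;
        let v_hat = v[i] / bc2;
        param[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}

fn main() {
    let mut p = [1.0f32];
    let (mut m, mut v) = ([0.0f32], [0.0f32]);
    adam_step(&mut p, &[0.5], &mut m, &mut v, 1, 0.001, 0.9, 0.999, 1e-8);
    // Thanks to bias correction, the very first step moves by roughly
    // the learning rate, regardless of gradient scale.
    assert!((p[0] - 0.999).abs() < 1e-4);
}
```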

§Thread Safety

This type is thread-safe and can be shared between threads. The step method takes mutable references to parameters, ensuring exclusive access during updates.

Implementations§


impl Adam


pub fn saved_parameter_count(&self) -> usize

Get the number of saved parameter states for checkpoint validation

This method returns the count of parameter states currently stored in the optimizer, which is essential for validating checkpoint integrity and ensuring proper parameter re-linking after deserialization. The count includes all parameters that have been linked to the optimizer and have accumulated optimization state.

§Returns

Number of parameter states currently stored in the optimizer

§Usage Patterns
§Checkpoint Validation

After deserializing an optimizer, this method helps verify that the expected number of parameter states was saved and can guide the re-linking process.

§Training Resumption

When resuming training, compare this count with the number of parameters in your model to ensure checkpoint compatibility.

§State Management

Use this method to monitor optimizer state growth and memory usage during training with dynamic parameter addition.

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![10, 5]).with_requires_grad();
let bias = Tensor::zeros(vec![5]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
optimizer.add_parameter(&bias);

// Check parameter count before serialization
assert_eq!(optimizer.saved_parameter_count(), 2);

// Serialize and deserialize
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();

// Verify parameter count is preserved
assert_eq!(loaded_optimizer.saved_parameter_count(), 2);

§Performance
  • Time Complexity: O(1) - Direct access to internal state count
  • Memory Usage: No additional memory allocation
  • Thread Safety: Safe to call from multiple threads concurrently

impl Adam


pub fn new() -> Self

Create a new Adam optimizer with default configuration

Initializes an Adam optimizer with PyTorch-compatible default hyperparameters. Parameters must be linked separately using add_parameter or add_parameters.

§Returns

A new Adam optimizer instance with default hyperparameters

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 67)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims(),
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims(),
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
More examples
examples/optimizers/adam_configurations.rs (line 96)
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85    println!("--- Default Adam Configuration ---");
86
87    // Create a simple regression problem: y = 2*x + 1
88    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91    // Create model parameters
92    let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95    // Create Adam optimizer with default configuration
96    let mut optimizer = Adam::new();
97    optimizer.add_parameter(&weight);
98    optimizer.add_parameter(&bias);
99
100    println!("Default Adam configuration:");
101    println!("  Learning rate: {}", optimizer.learning_rate());
102    println!("  Initial weight: {:.6}", weight.value());
103    println!("  Initial bias: {:.6}", bias.value());
104
105    // Training loop
106    let num_epochs = 50;
107    let mut losses = Vec::new();
108
109    for epoch in 0..num_epochs {
110        // Forward pass
111        let y_pred = x_data.matmul(&weight) + &bias;
112        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114        // Backward pass
115        loss.backward(None);
116
117        // Optimizer step
118        optimizer.step(&mut [&mut weight, &mut bias]);
119        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121        losses.push(loss.value());
122
123        if epoch % 10 == 0 || epoch == num_epochs - 1 {
124            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125        }
126    }
127
128    // Evaluate final model
129    let _final_predictions = x_data.matmul(&weight) + &bias;
130    println!("\nFinal model:");
131    println!("  Learned weight: {:.6} (target: 2.0)", weight.value());
132    println!("  Learned bias: {:.6} (target: 1.0)", bias.value());
133    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
134
135    Ok(())
136}

pub fn with_config(config: AdamConfig) -> Self

Create a new Adam optimizer with custom configuration

Allows full control over all Adam hyperparameters for specialized training scenarios such as fine-tuning, transfer learning, or research applications. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • config - Adam configuration with custom hyperparameters
§Returns

A new Adam optimizer instance with the specified configuration

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 91)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims(),
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims(),
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
More examples
examples/getting_started/serialization_basics.rs (line 125)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}
examples/optimizers/adam_configurations.rs (line 336)
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318    // Create training data
319    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322    // Create model parameters
323    let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326    // Create optimizer with custom configuration
327    let adam_config = AdamConfig {
328        learning_rate: config.learning_rate,
329        beta1: config.beta1,
330        beta2: config.beta2,
331        eps: 1e-8,
332        weight_decay: config.weight_decay,
333        amsgrad: false,
334    };
335
336    let mut optimizer = Adam::with_config(adam_config);
337    optimizer.add_parameter(&weight);
338    optimizer.add_parameter(&bias);
339
340    // Training loop
341    let mut losses = Vec::new();
342    let mut convergence_epoch = config.epochs;
343
344    for epoch in 0..config.epochs {
345        // Forward pass
346        let y_pred = x_data.matmul(&weight) + &bias;
347        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349        // Backward pass
350        loss.backward(None);
351
352        // Optimizer step
353        optimizer.step(&mut [&mut weight, &mut bias]);
354        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356        let loss_value = loss.value();
357        losses.push(loss_value);
358
359        // Check for convergence (loss < 0.01)
360        if loss_value < 0.01 && convergence_epoch == config.epochs {
361            convergence_epoch = epoch;
362        }
363    }
364
365    Ok(TrainingStats {
366        config,
367        final_loss: losses[losses.len() - 1],
368        loss_history: losses,
369        convergence_epoch,
370        weight_norm: weight.norm().value(),
371    })
372}
examples/neural_networks/basic_linear_layer.rs (line 253)
218fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
219    println!("\n--- Training Loop ---");
220
221    // Create layer and training data
222    let mut layer = LinearLayer::new(2, 1, Some(45));
223
224    // Simple regression task: y = 2*x1 + 3*x2 + 1
225    let x_data = Tensor::from_slice(
226        &[
227            1.0, 1.0, // x1=1, x2=1 -> y=6
228            2.0, 1.0, // x1=2, x2=1 -> y=8
229            1.0, 2.0, // x1=1, x2=2 -> y=9
230            2.0, 2.0, // x1=2, x2=2 -> y=11
231        ],
232        vec![4, 2],
233    )
234    .unwrap();
235
236    let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
237
238    println!("Training data:");
239    println!("  X shape: {:?}", x_data.shape().dims());
240    println!("  Y shape: {:?}", y_true.shape().dims());
241    println!("  Target function: y = 2*x1 + 3*x2 + 1");
242
243    // Create optimizer
244    let config = AdamConfig {
245        learning_rate: 0.01,
246        beta1: 0.9,
247        beta2: 0.999,
248        eps: 1e-8,
249        weight_decay: 0.0,
250        amsgrad: false,
251    };
252
253    let mut optimizer = Adam::with_config(config);
254    let params = layer.parameters();
255    for param in &params {
256        optimizer.add_parameter(param);
257    }
258
259    println!("Optimizer setup complete. Starting training...");
260
261    // Training loop
262    let num_epochs = 100;
263    let mut losses = Vec::new();
264
265    for epoch in 0..num_epochs {
266        // Forward pass
267        let y_pred = layer.forward(&x_data);
268
269        // Compute loss: MSE
270        let diff = y_pred.sub_tensor(&y_true);
271        let mut loss = diff.pow_scalar(2.0).mean();
272
273        // Backward pass
274        loss.backward(None);
275
276        // Optimizer step
277        let mut params = layer.parameters();
278        optimizer.step(&mut params);
279        optimizer.zero_grad(&mut params);
280
281        losses.push(loss.value());
282
283        // Print progress
284        if epoch % 20 == 0 || epoch == num_epochs - 1 {
285            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
286        }
287    }
288
289    // Evaluate final model
290    let final_predictions = layer.forward_no_grad(&x_data);
291
292    println!("\nFinal model evaluation:");
293    println!("  Learned weights: {:?}", layer.weight.data());
294    println!("  Learned bias: {:?}", layer.bias.data());
295    println!("  Target weights: [2.0, 3.0]");
296    println!("  Target bias: [1.0]");
297
298    println!("  Predictions vs True:");
299    for i in 0..4 {
300        let pred = final_predictions.data()[i];
301        let true_val = y_true.data()[i];
302        println!(
303            "    Sample {}: pred={:.3}, true={:.1}, error={:.3}",
304            i + 1,
305            pred,
306            true_val,
307            (pred - true_val).abs()
308        );
309    }
310
311    // Training analysis
312    let initial_loss = losses[0];
313    let final_loss = losses[losses.len() - 1];
314    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
315
316    println!("\nTraining Analysis:");
317    println!("  Initial loss: {:.6}", initial_loss);
318    println!("  Final loss: {:.6}", final_loss);
319    println!("  Loss reduction: {:.1}%", loss_reduction);
320
321    Ok(())
322}

pub fn with_learning_rate(learning_rate: f32) -> Self

Create a new Adam optimizer with custom learning rate

A convenience constructor that allows setting only the learning rate while using default values for all other hyperparameters. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • learning_rate - Learning rate for optimization
§Returns

A new Adam optimizer instance with the specified learning rate and default values for all other hyperparameters

Examples found in repository
examples/optimizers/learning_rate_scheduling.rs (line 332)
319fn train_with_scheduler(
320    scheduler: &mut dyn LearningRateScheduler,
321    num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323    // Create training data: y = 2*x + 1
324    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327    // Create model parameters
328    let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331    // Create optimizer with initial learning rate
332    let mut optimizer = Adam::with_learning_rate(0.05);
333    optimizer.add_parameter(&weight);
334    optimizer.add_parameter(&bias);
335
336    // Training loop
337    let mut losses = Vec::new();
338    let mut lr_history = Vec::new();
339    let mut convergence_epoch = num_epochs;
340
341    for epoch in 0..num_epochs {
342        // Forward pass
343        let y_pred = x_data.matmul(&weight) + &bias;
344        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346        // Backward pass
347        loss.backward(None);
348
349        // Update learning rate using scheduler
350        let current_lr = optimizer.learning_rate();
351        let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353        if (new_lr - current_lr).abs() > 1e-8 {
354            optimizer.set_learning_rate(new_lr);
355        }
356
357        // Optimizer step
358        optimizer.step(&mut [&mut weight, &mut bias]);
359        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361        let loss_value = loss.value();
362        losses.push(loss_value);
363        lr_history.push(new_lr);
364
365        // Check for convergence
366        if loss_value < 0.01 && convergence_epoch == num_epochs {
367            convergence_epoch = epoch;
368        }
369    }
370
371    Ok(TrainingStats {
372        scheduler_name: scheduler.name().to_string(),
373        final_loss: losses[losses.len() - 1],
374        lr_history,
375        loss_history: losses,
376        convergence_epoch,
377    })
378}
More examples
examples/neural_networks/basic_linear_layer.rs (line 381)
371fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
372    println!("\n--- Serialization ---");
373
374    // Create and train a simple layer
375    let mut original_layer = LinearLayer::new(2, 1, Some(47));
376
377    // Simple training data
378    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
379    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
380
381    let mut optimizer = Adam::with_learning_rate(0.01);
382    let params = original_layer.parameters();
383    for param in &params {
384        optimizer.add_parameter(param);
385    }
386
387    // Train for a few epochs
388    for _ in 0..10 {
389        let y_pred = original_layer.forward(&x_data);
390        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
391        loss.backward(None);
392
393        let mut params = original_layer.parameters();
394        optimizer.step(&mut params);
395        optimizer.zero_grad(&mut params);
396    }
397
398    println!("Original layer trained");
399    println!("  Weight: {:?}", original_layer.weight.data());
400    println!("  Bias: {:?}", original_layer.bias.data());
401
402    // Save layer
403    original_layer.save_json("temp_linear_layer")?;
404
405    // Load layer
406    let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
407
408    println!("Loaded layer");
409    println!("  Weight: {:?}", loaded_layer.weight.data());
410    println!("  Bias: {:?}", loaded_layer.bias.data());
411
412    // Verify consistency
413    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
414    let original_output = original_layer.forward_no_grad(&test_input);
415    let loaded_output = loaded_layer.forward_no_grad(&test_input);
416
417    println!("Consistency check:");
418    println!("  Original output: {:?}", original_output.data());
419    println!("  Loaded output: {:?}", loaded_output.data());
420    println!(
421        "  Match: {}",
422        original_output
423            .data()
424            .iter()
425            .zip(loaded_output.data().iter())
426            .all(|(a, b)| (a - b).abs() < 1e-6)
427    );
428
429    println!("Serialization verification: PASSED");
430
431    Ok(())
432}
examples/getting_started/serialization_basics.rs (line 212)
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205    println!("\n--- Model Checkpointing ---");
206
207    // Create a simple model (weights and bias)
208    let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209    let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211    // Create optimizer
212    let mut optimizer = Adam::with_learning_rate(0.01);
213    optimizer.add_parameter(&weights);
214    optimizer.add_parameter(&bias);
215
216    println!("Initial weights: {:?}", weights.data());
217    println!("Initial bias: {:?}", bias.data());
218
219    // Simulate training
220    for epoch in 0..5 {
221        let mut loss = weights.sum() + bias.sum();
222        loss.backward(None);
223        optimizer.step(&mut [&mut weights, &mut bias]);
224        optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226        if epoch % 2 == 0 {
227            // Save checkpoint
228            let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229            fs::create_dir_all(&checkpoint_dir)?;
230
231            weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232            bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233            optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235            println!("Saved checkpoint for epoch {}", epoch);
236        }
237    }
238
239    // Load from checkpoint
240    let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241    let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242    let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244    println!("Loaded weights: {:?}", loaded_weights.data());
245    println!("Loaded bias: {:?}", loaded_bias.data());
246    println!(
247        "Loaded optimizer learning rate: {}",
248        loaded_optimizer.learning_rate()
249    );
250
251    // Verify checkpoint integrity
252    assert_eq!(weights.shape().dims(), loaded_weights.shape().dims());
253    assert_eq!(bias.shape().dims(), loaded_bias.shape().dims());
254    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256    println!("Checkpointing verification: PASSED");
257
258    Ok(())
259}
examples/getting_started/optimizer_basics.rs (line 113)
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106    println!("\n--- Linear Regression Training ---");
107
108    // Create model parameters
109    let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112    // Create optimizer
113    let mut optimizer = Adam::with_learning_rate(0.01);
114    optimizer.add_parameter(&weight);
115    optimizer.add_parameter(&bias);
116
117    // Create simple training data: y = 2*x + 1
118    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121    println!("Training data:");
122    println!("  X: {:?}", x_data.data());
123    println!("  Y: {:?}", y_true.data());
124    println!("  Target: y = 2*x + 1");
125
126    // Training loop
127    let num_epochs = 100;
128    let mut losses = Vec::new();
129
130    for epoch in 0..num_epochs {
131        // Forward pass: y_pred = x * weight + bias
132        let y_pred = x_data.matmul(&weight) + &bias;
133
134        // Compute loss: MSE
135        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137        // Backward pass
138        loss.backward(None);
139
140        // Optimizer step
141        optimizer.step(&mut [&mut weight, &mut bias]);
142        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144        losses.push(loss.value());
145
146        // Print progress every 20 epochs
147        if epoch % 20 == 0 || epoch == num_epochs - 1 {
148            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149        }
150    }
151
152    // Evaluate final model
153    let final_predictions = x_data.matmul(&weight) + &bias;
154    println!("\nFinal model evaluation:");
155    println!("  Learned weight: {:.6}", weight.value());
156    println!("  Learned bias: {:.6}", bias.value());
157    println!("  Predictions vs True:");
158
159    for i in 0..5 {
160        let x1 = x_data.data()[i];
161        let pred = final_predictions.data()[i];
162        let true_val = y_true.data()[i];
163        println!(
164            "    x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165            x1,
166            pred,
167            true_val,
168            (pred - true_val).abs()
169        );
170    }
171
172    Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177    println!("\n--- Advanced Training Patterns ---");
178
179    // Create a more complex model
180    let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181    let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183    // Create optimizer with different learning rate
184    let mut optimizer = Adam::with_learning_rate(0.005);
185    optimizer.add_parameter(&weight);
186    optimizer.add_parameter(&bias);
187
188    // Create training data: y = 2*x + [1, 3]
189    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190    let y_true = Tensor::from_slice(
191        &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192        vec![5, 2],
193    )
194    .unwrap();
195
196    println!("Advanced training with monitoring:");
197    println!("  Initial learning rate: {}", optimizer.learning_rate());
198
199    // Training loop with monitoring
200    let num_epochs = 50;
201    let mut losses = Vec::new();
202    let mut weight_norms = Vec::new();
203    let mut gradient_norms = Vec::new();
204
205    for epoch in 0..num_epochs {
206        // Forward pass
207        let y_pred = x_data.matmul(&weight) + &bias;
208        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210        // Backward pass
211        loss.backward(None);
212
213        // Compute gradient norm before optimizer step
214        let gradient_norm = weight.grad_owned().unwrap().norm();
215
216        // Optimizer step
217        optimizer.step(&mut [&mut weight, &mut bias]);
218        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220        // Learning rate scheduling: reduce every 10 epochs
221        if epoch > 0 && epoch % 10 == 0 {
222            let current_lr = optimizer.learning_rate();
223            let new_lr = current_lr * 0.5;
224            optimizer.set_learning_rate(new_lr);
225            println!(
226                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227                epoch, current_lr, new_lr
228            );
229        }
230
231        // Record metrics
232        losses.push(loss.value());
233        weight_norms.push(weight.norm().value());
234        gradient_norms.push(gradient_norm.value());
235
236        // Print detailed progress
237        if epoch % 10 == 0 || epoch == num_epochs - 1 {
238            println!(
239                "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240                epoch,
241                loss.value(),
242                weight.norm().value(),
243                gradient_norm.value()
244            );
245        }
246    }
247
248    println!("Final learning rate: {}", optimizer.learning_rate());
249
250    // Analyze training progression
251    let initial_loss = losses[0];
252    let final_loss = losses[losses.len() - 1];
253    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255    println!("\nTraining Analysis:");
256    println!("  Initial loss: {:.6}", initial_loss);
257    println!("  Final loss: {:.6}", final_loss);
258    println!("  Loss reduction: {:.1}%", loss_reduction);
259    println!("  Final weight norm: {:.6}", weight.norm().value());
260    println!("  Final bias: {:?}", bias.data());
261
262    Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267    println!("\n--- Learning Rate Scheduling ---");
268
269    // Create simple model
270    let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273    // Create optimizer with high initial learning rate
274    let mut optimizer = Adam::with_learning_rate(0.1);
275    optimizer.add_parameter(&weight);
276    optimizer.add_parameter(&bias);
277
278    // Simple data
279    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280    let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282    println!("Initial learning rate: {}", optimizer.learning_rate());
283
284    // Training loop with learning rate scheduling
285    let num_epochs = 50;
286    let mut losses = Vec::new();
287
288    for epoch in 0..num_epochs {
289        // Forward pass
290        let y_pred = x_data.matmul(&weight) + &bias;
291        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293        // Backward pass
294        loss.backward(None);
295
296        // Optimizer step
297        optimizer.step(&mut [&mut weight, &mut bias]);
298        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300        // Learning rate scheduling: reduce every 10 epochs
301        if epoch > 0 && epoch % 10 == 0 {
302            let current_lr = optimizer.learning_rate();
303            let new_lr = current_lr * 0.5;
304            optimizer.set_learning_rate(new_lr);
305            println!(
306                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307                epoch, current_lr, new_lr
308            );
309        }
310
311        losses.push(loss.value());
312
313        // Print progress
314        if epoch % 10 == 0 || epoch == num_epochs - 1 {
315            println!(
316                "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317                epoch,
318                loss.value(),
319                optimizer.learning_rate()
320            );
321        }
322    }
323
324    println!("Final learning rate: {}", optimizer.learning_rate());
325
326    Ok(())
327}
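The halving schedule in the loop above can also be written in closed form. The following is a small standalone sketch (plain Rust, not part of this crate's API) that computes the learning rate the loop would reach at a given epoch:

```rust
/// Closed-form step decay: `initial` is multiplied by `factor` once for
/// every completed `every`-epoch interval. Illustrative helper only.
fn scheduled_lr(initial: f64, epoch: u32, every: u32, factor: f64) -> f64 {
    initial * factor.powi((epoch / every) as i32)
}

fn main() {
    // Matches the loop above: lr starts at 0.1 and halves at epochs 10, 20, ...
    assert!((scheduled_lr(0.1, 25, 10, 0.5) - 0.025).abs() < 1e-12);
    println!("lr at epoch 25: {}", scheduled_lr(0.1, 25, 10, 0.5));
}
```

Applying the decay via `set_learning_rate` inside the loop (as above) and computing it in closed form are equivalent; the closed form is convenient when resuming training from a checkpoint at an arbitrary epoch.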
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331    println!("\n--- Training Monitoring ---");
332
333    // Create model
334    let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337    // Create optimizer
338    let mut optimizer = Adam::with_learning_rate(0.01);
339    optimizer.add_parameter(&weight);
340    optimizer.add_parameter(&bias);
341
342    // Training data
343    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346    // Training loop with comprehensive monitoring
347    let num_epochs = 30;
348    let mut losses = Vec::new();
349    let mut weight_history = Vec::new();
350    let mut bias_history = Vec::new();
351
352    for epoch in 0..num_epochs {
353        // Forward pass
354        let y_pred = x_data.matmul(&weight) + &bias;
355        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357        // Backward pass
358        loss.backward(None);
359
360        // Optimizer step
361        optimizer.step(&mut [&mut weight, &mut bias]);
362        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364        // Record history
365        losses.push(loss.value());
366        weight_history.push(weight.value());
367        bias_history.push(bias.value());
368
369        // Print detailed monitoring
370        if epoch % 5 == 0 || epoch == num_epochs - 1 {
371            println!(
372                "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373                epoch,
374                loss.value(),
375                weight.value(),
376                bias.value()
377            );
378        }
379    }
380
381    // Analyze training progression
382    println!("\nTraining Analysis:");
383    println!("  Initial loss: {:.6}", losses[0]);
384    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
385    println!(
386        "  Loss reduction: {:.1}%",
387        (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388    );
389
390    // Compute statistics
391    let loss_mean = compute_mean(&losses);
392    let loss_std = compute_std(&losses);
393    let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394    let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396    println!("  Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397    println!("  Weight change: {:.6}", weight_change);
398    println!("  Bias change: {:.6}", bias_change);
399    println!("  Final weight norm: {:.6}", weight.norm().value());
400    println!("  Final bias: {:.6}", bias.value());
401
402    Ok(())
403}
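The `compute_mean` and `compute_std` helpers called above are defined elsewhere in the example file and not shown in this snippet; a straightforward implementation consistent with their use here might look like:

```rust
/// Arithmetic mean of a slice (returns 0.0 for an empty slice).
fn compute_mean(values: &[f32]) -> f32 {
    if values.is_empty() {
        return 0.0;
    }
    values.iter().sum::<f32>() / values.len() as f32
}

/// Population standard deviation around the mean.
fn compute_std(values: &[f32]) -> f32 {
    if values.is_empty() {
        return 0.0;
    }
    let mean = compute_mean(values);
    let var = values.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / values.len() as f32;
    var.sqrt()
}

fn main() {
    let xs = [1.0f32, 2.0, 3.0, 4.0];
    assert!((compute_mean(&xs) - 2.5).abs() < 1e-6);
    assert!((compute_std(&xs) - 1.118_034).abs() < 1e-5);
}
```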
examples/neural_networks/feedforward_network.rs (line 464)
431fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
432    println!("\n--- Training Workflow ---");
433
434    // Create a simple classification network
435    let config = FeedForwardConfig {
436        input_size: 2,
437        hidden_sizes: vec![4, 3],
438        output_size: 1,
439        use_bias: true,
440    };
441    let mut network = FeedForwardNetwork::new(config, Some(46));
442
443    println!("Training network: 2 -> [4, 3] -> 1");
444
445    // Create simple binary classification data: XOR problem
446    let x_data = Tensor::from_slice(
447        &[
448            0.0, 0.0, // -> 0
449            0.0, 1.0, // -> 1
450            1.0, 0.0, // -> 1
451            1.0, 1.0, // -> 0
452        ],
453        vec![4, 2],
454    )
455    .unwrap();
456
457    let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
458
459    println!("Training on XOR problem:");
460    println!("  Input shape: {:?}", x_data.shape().dims());
461    println!("  Target shape: {:?}", y_true.shape().dims());
462
463    // Create optimizer
464    let mut optimizer = Adam::with_learning_rate(0.1);
465    let params = network.parameters();
466    for param in &params {
467        optimizer.add_parameter(param);
468    }
469
470    // Training loop
471    let num_epochs = 50;
472    let mut losses = Vec::new();
473
474    for epoch in 0..num_epochs {
475        // Forward pass
476        let y_pred = network.forward(&x_data);
477
478        // Compute loss: MSE
479        let diff = y_pred.sub_tensor(&y_true);
480        let mut loss = diff.pow_scalar(2.0).mean();
481
482        // Backward pass
483        loss.backward(None);
484
485        // Optimizer step and zero grad
486        let mut params = network.parameters();
487        optimizer.step(&mut params);
488        optimizer.zero_grad(&mut params);
489
490        losses.push(loss.value());
491
492        // Print progress
493        if epoch % 10 == 0 || epoch == num_epochs - 1 {
494            println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
495        }
496    }
497
498    // Test final model
499    let final_predictions = network.forward_no_grad(&x_data);
500    println!("\nFinal predictions vs targets:");
501    for i in 0..4 {
502        let pred = final_predictions.data()[i];
503        let target = y_true.data()[i];
504        let input_x = x_data.data()[i * 2];
505        let input_y = x_data.data()[i * 2 + 1];
506        println!(
507            "  [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
508            input_x,
509            input_y,
510            pred,
511            target,
512            (pred - target).abs()
513        );
514    }
515
516    Ok(())
517}
518
519/// Demonstrate comprehensive training with 100+ steps
520fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
521    println!("\n--- Comprehensive Training (100+ Steps) ---");
522
523    // Create a regression network
524    let config = FeedForwardConfig {
525        input_size: 3,
526        hidden_sizes: vec![8, 6, 4],
527        output_size: 2,
528        use_bias: true,
529    };
530    let mut network = FeedForwardNetwork::new(config, Some(47));
531
532    println!("Network architecture: 3 -> [8, 6, 4] -> 2");
533    println!("Total parameters: {}", network.parameter_count());
534
535    // Create synthetic regression data
536    // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
537    let num_samples = 32;
538    let mut x_vec = Vec::new();
539    let mut y_vec = Vec::new();
540
541    for i in 0..num_samples {
542        let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
543        let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
544        let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
545
546        let y1 = x1 + 2.0 * x2 - x3;
547        let y2 = x1 * x2 + x3;
548
549        x_vec.extend_from_slice(&[x1, x2, x3]);
550        y_vec.extend_from_slice(&[y1, y2]);
551    }
552
553    let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
554    let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
555
556    println!("Training data:");
557    println!("  {} samples", num_samples);
558    println!("  Input shape: {:?}", x_data.shape().dims());
559    println!("  Target shape: {:?}", y_true.shape().dims());
560
561    // Create optimizer with learning rate scheduling
562    let mut optimizer = Adam::with_learning_rate(0.01);
563    let params = network.parameters();
564    for param in &params {
565        optimizer.add_parameter(param);
566    }
567
568    // Comprehensive training loop (150 epochs)
569    let num_epochs = 150;
570    let mut losses = Vec::new();
571    let mut best_loss = f32::INFINITY;
572    let mut patience_counter = 0;
573    let patience = 20;
574
575    println!("Starting comprehensive training...");
576
577    for epoch in 0..num_epochs {
578        // Forward pass
579        let y_pred = network.forward(&x_data);
580
581        // Compute loss: MSE
582        let diff = y_pred.sub_tensor(&y_true);
583        let mut loss = diff.pow_scalar(2.0).mean();
584
585        // Backward pass
586        loss.backward(None);
587
588        // Optimizer step and zero grad
589        let mut params = network.parameters();
590        optimizer.step(&mut params);
591        optimizer.zero_grad(&mut params);
592
593        let current_loss = loss.value();
594        losses.push(current_loss);
595
596        // Learning rate scheduling
597        if epoch > 0 && epoch % 30 == 0 {
598            let new_lr = optimizer.learning_rate() * 0.8;
599            optimizer.set_learning_rate(new_lr);
600            println!("  Reduced learning rate to {:.4}", new_lr);
601        }
602
603        // Early stopping logic
604        if current_loss < best_loss {
605            best_loss = current_loss;
606            patience_counter = 0;
607        } else {
608            patience_counter += 1;
609        }
610
611        // Print progress
612        if epoch % 25 == 0 || epoch == num_epochs - 1 {
613            println!(
614                "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
615                epoch,
616                current_loss,
617                optimizer.learning_rate(),
618                best_loss
619            );
620        }
621
622        // Early stopping
623        if patience_counter >= patience && epoch > 50 {
624            println!("Early stopping at epoch {} (patience exceeded)", epoch);
625            break;
626        }
627    }
628
629    // Final evaluation
630    let final_predictions = network.forward_no_grad(&x_data);
631
632    // Compute final metrics
633    let final_loss = losses[losses.len() - 1];
634    let initial_loss = losses[0];
635    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
636
637    println!("\nTraining completed!");
638    println!("  Initial loss: {:.6}", initial_loss);
639    println!("  Final loss: {:.6}", final_loss);
640    println!("  Best loss: {:.6}", best_loss);
641    println!("  Loss reduction: {:.1}%", loss_reduction);
642    println!("  Final learning rate: {:.4}", optimizer.learning_rate());
643
644    // Sample predictions analysis
645    println!("\nSample predictions (first 5):");
646    for i in 0..5.min(num_samples) {
647        let pred1 = final_predictions.data()[i * 2];
648        let pred2 = final_predictions.data()[i * 2 + 1];
649        let true1 = y_true.data()[i * 2];
650        let true2 = y_true.data()[i * 2 + 1];
651
652        println!(
653            "  Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
654            i + 1,
655            pred1,
656            pred2,
657            true1,
658            true2,
659            (pred1 - true1).abs(),
660            (pred2 - true2).abs()
661        );
662    }
663
664    Ok(())
665}
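The patience bookkeeping in the loop above can be factored into a small helper. Below is an illustrative standalone version of that logic (hypothetical; the crate does not provide an early-stopping type):

```rust
/// Tracks the best loss seen so far and how many epochs have passed
/// without improvement. Illustrative sketch of the patience logic above.
struct EarlyStopper {
    best: f32,
    patience: usize,
    counter: usize,
}

impl EarlyStopper {
    fn new(patience: usize) -> Self {
        Self { best: f32::INFINITY, patience, counter: 0 }
    }

    /// Record one epoch's loss; returns true once `patience` consecutive
    /// epochs have failed to improve on the best loss.
    fn should_stop(&mut self, loss: f32) -> bool {
        if loss < self.best {
            self.best = loss;
            self.counter = 0;
        } else {
            self.counter += 1;
        }
        self.counter >= self.patience
    }
}

fn main() {
    let mut stopper = EarlyStopper::new(3);
    for &loss in &[1.0f32, 0.8, 0.9, 0.85, 0.81] {
        if stopper.should_stop(loss) {
            println!("stopping (best = {})", stopper.best);
        }
    }
}
```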
666
667/// Demonstrate network serialization
668fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
669    println!("\n--- Network Serialization ---");
670
671    // Create and train a network
672    let config = FeedForwardConfig {
673        input_size: 2,
674        hidden_sizes: vec![4, 2],
675        output_size: 1,
676        use_bias: true,
677    };
678    let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
679
680    // Quick training
681    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
682    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
683
684    let mut optimizer = Adam::with_learning_rate(0.01);
685    let params = original_network.parameters();
686    for param in &params {
687        optimizer.add_parameter(param);
688    }
689
690    for _ in 0..20 {
691        let y_pred = original_network.forward(&x_data);
692        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
693        loss.backward(None);
694
695        let mut params = original_network.parameters();
696        optimizer.step(&mut params);
697        optimizer.zero_grad(&mut params);
698    }
699
700    // Test original network
701    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
702    let original_output = original_network.forward_no_grad(&test_input);
703
704    println!("Original network output: {:?}", original_output.data());
705
706    // Save network
707    original_network.save_json("temp_feedforward_network")?;
708
709    // Load network
710    let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
711    let loaded_output = loaded_network.forward_no_grad(&test_input);
712
713    println!("Loaded network output: {:?}", loaded_output.data());
714
715    // Verify consistency
716    let match_check = original_output
717        .data()
718        .iter()
719        .zip(loaded_output.data().iter())
720        .all(|(a, b)| (a - b).abs() < 1e-6);
721
722    println!(
723        "Serialization verification: {}",
724        if match_check { "PASSED" } else { "FAILED" }
725    );
726
727    Ok(())
728}
Source

pub fn add_parameter(&mut self, parameter: &Tensor)

Add a single parameter to the optimizer

Links a parameter to the optimizer by creating a new parameter state indexed by the tensor’s ID. The parameter must have requires_grad set to true.

§Arguments
  • parameter - Reference to the tensor to link
§Panics

Panics if the parameter does not have requires_grad set to true
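Conceptually, linking creates a fresh zeroed state entry keyed by the tensor's ID. A minimal sketch of that bookkeeping follows; the names and state layout here are hypothetical and do not reflect the crate's internals:

```rust
use std::collections::HashMap;

/// Per-parameter Adam state: moment buffers and a step count.
/// Hypothetical layout, for illustration only.
#[derive(Debug, PartialEq)]
struct ParamState {
    m: Vec<f32>, // first-moment (momentum) estimate
    v: Vec<f32>, // second-moment (velocity) estimate
    step: u64,   // number of updates applied so far
}

/// Link a parameter: create a zeroed state for its ID if none exists.
fn link_parameter(states: &mut HashMap<u64, ParamState>, id: u64, numel: usize) {
    states.entry(id).or_insert_with(|| ParamState {
        m: vec![0.0; numel],
        v: vec![0.0; numel],
        step: 0,
    });
}

fn main() {
    let mut states = HashMap::new();
    link_parameter(&mut states, 7, 6); // e.g. a 3x2 weight tensor
    link_parameter(&mut states, 7, 6); // re-linking the same ID is a no-op
    assert_eq!(states.len(), 1);
}
```

Keying state by tensor ID rather than by position is what allows parameters to be added, removed, or re-linked dynamically, as described in the struct-level docs.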

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 74)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims(),
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims(),
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
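For reference, the `beta1`, `beta2`, and `eps` values in the config above drive the standard Adam moment estimates. One scalar update with bias correction can be sketched in plain Rust as follows (an illustrative sketch of the algorithm; the crate's actual `step` implementation is not shown here):

```rust
/// One Adam update on a single scalar parameter (illustrative sketch).
/// `m` and `v` are the running first/second moment estimates; `t` is the
/// 1-based step count.
fn adam_step(
    param: f32, grad: f32, m: &mut f32, v: &mut f32, t: i32,
    lr: f32, beta1: f32, beta2: f32, eps: f32,
) -> f32 {
    *m = beta1 * *m + (1.0 - beta1) * grad;
    *v = beta2 * *v + (1.0 - beta2) * grad * grad;
    // Bias correction: undo the bias from zero-initializing m and v.
    let m_hat = *m / (1.0 - beta1.powi(t));
    let v_hat = *v / (1.0 - beta2.powi(t));
    param - lr * m_hat / (v_hat.sqrt() + eps)
}

fn main() {
    let (mut m, mut v) = (0.0f32, 0.0f32);
    // First step: with bias correction, m_hat = grad and v_hat = grad^2,
    // so the update magnitude is roughly lr regardless of gradient scale.
    let p = adam_step(1.0, 0.5, &mut m, &mut v, 1, 0.01, 0.9, 0.999, 1e-8);
    assert!((p - 0.99).abs() < 1e-4);
}
```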
103
104/// Demonstrate simple linear regression training
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106    println!("\n--- Linear Regression Training ---");
107
108    // Create model parameters
109    let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112    // Create optimizer
113    let mut optimizer = Adam::with_learning_rate(0.01);
114    optimizer.add_parameter(&weight);
115    optimizer.add_parameter(&bias);
116
117    // Create simple training data: y = 2*x + 1
118    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121    println!("Training data:");
122    println!("  X: {:?}", x_data.data());
123    println!("  Y: {:?}", y_true.data());
124    println!("  Target: y = 2*x + 1");
125
126    // Training loop
127    let num_epochs = 100;
128    let mut losses = Vec::new();
129
130    for epoch in 0..num_epochs {
131        // Forward pass: y_pred = x * weight + bias
132        let y_pred = x_data.matmul(&weight) + &bias;
133
134        // Compute loss: MSE
135        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137        // Backward pass
138        loss.backward(None);
139
140        // Optimizer step
141        optimizer.step(&mut [&mut weight, &mut bias]);
142        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144        losses.push(loss.value());
145
146        // Print progress every 20 epochs
147        if epoch % 20 == 0 || epoch == num_epochs - 1 {
148            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149        }
150    }
151
152    // Evaluate final model
153    let final_predictions = x_data.matmul(&weight) + &bias;
154    println!("\nFinal model evaluation:");
155    println!("  Learned weight: {:.6}", weight.value());
156    println!("  Learned bias: {:.6}", bias.value());
157    println!("  Predictions vs True:");
158
159    for i in 0..5 {
160        let x1 = x_data.data()[i];
161        let pred = final_predictions.data()[i];
162        let true_val = y_true.data()[i];
163        println!(
164            "    x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165            x1,
166            pred,
167            true_val,
168            (pred - true_val).abs()
169        );
170    }
171
172    Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177    println!("\n--- Advanced Training Patterns ---");
178
179    // Create a more complex model
180    let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181    let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183    // Create optimizer with different learning rate
184    let mut optimizer = Adam::with_learning_rate(0.005);
185    optimizer.add_parameter(&weight);
186    optimizer.add_parameter(&bias);
187
188    // Create training data with two output targets per input
189    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190    let y_true = Tensor::from_slice(
191        &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192        vec![5, 2],
193    )
194    .unwrap();
195
196    println!("Advanced training with monitoring:");
197    println!("  Initial learning rate: {}", optimizer.learning_rate());
198
199    // Training loop with monitoring
200    let num_epochs = 50;
201    let mut losses = Vec::new();
202    let mut weight_norms = Vec::new();
203    let mut gradient_norms = Vec::new();
204
205    for epoch in 0..num_epochs {
206        // Forward pass
207        let y_pred = x_data.matmul(&weight) + &bias;
208        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210        // Backward pass
211        loss.backward(None);
212
213        // Compute gradient norm before optimizer step
214        let gradient_norm = weight.grad_owned().unwrap().norm();
215
216        // Optimizer step
217        optimizer.step(&mut [&mut weight, &mut bias]);
218        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220        // Learning rate scheduling: reduce every 10 epochs
221        if epoch > 0 && epoch % 10 == 0 {
222            let current_lr = optimizer.learning_rate();
223            let new_lr = current_lr * 0.5;
224            optimizer.set_learning_rate(new_lr);
225            println!(
226                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227                epoch, current_lr, new_lr
228            );
229        }
230
231        // Record metrics
232        losses.push(loss.value());
233        weight_norms.push(weight.norm().value());
234        gradient_norms.push(gradient_norm.value());
235
236        // Print detailed progress
237        if epoch % 10 == 0 || epoch == num_epochs - 1 {
238            println!(
239                "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240                epoch,
241                loss.value(),
242                weight.norm().value(),
243                gradient_norm.value()
244            );
245        }
246    }
247
248    println!("Final learning rate: {}", optimizer.learning_rate());
249
250    // Analyze training progression
251    let initial_loss = losses[0];
252    let final_loss = losses[losses.len() - 1];
253    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255    println!("\nTraining Analysis:");
256    println!("  Initial loss: {:.6}", initial_loss);
257    println!("  Final loss: {:.6}", final_loss);
258    println!("  Loss reduction: {:.1}%", loss_reduction);
259    println!("  Final weight norm: {:.6}", weight.norm().value());
260    println!("  Final bias: {:?}", bias.data());
261
262    Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267    println!("\n--- Learning Rate Scheduling ---");
268
269    // Create simple model
270    let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273    // Create optimizer with high initial learning rate
274    let mut optimizer = Adam::with_learning_rate(0.1);
275    optimizer.add_parameter(&weight);
276    optimizer.add_parameter(&bias);
277
278    // Simple data
279    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280    let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282    println!("Initial learning rate: {}", optimizer.learning_rate());
283
284    // Training loop with learning rate scheduling
285    let num_epochs = 50;
286    let mut losses = Vec::new();
287
288    for epoch in 0..num_epochs {
289        // Forward pass
290        let y_pred = x_data.matmul(&weight) + &bias;
291        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293        // Backward pass
294        loss.backward(None);
295
296        // Optimizer step
297        optimizer.step(&mut [&mut weight, &mut bias]);
298        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300        // Learning rate scheduling: reduce every 10 epochs
301        if epoch > 0 && epoch % 10 == 0 {
302            let current_lr = optimizer.learning_rate();
303            let new_lr = current_lr * 0.5;
304            optimizer.set_learning_rate(new_lr);
305            println!(
306                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307                epoch, current_lr, new_lr
308            );
309        }
310
311        losses.push(loss.value());
312
313        // Print progress
314        if epoch % 10 == 0 || epoch == num_epochs - 1 {
315            println!(
316                "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317                epoch,
318                loss.value(),
319                optimizer.learning_rate()
320            );
321        }
322    }
323
324    println!("Final learning rate: {}", optimizer.learning_rate());
325
326    Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331    println!("\n--- Training Monitoring ---");
332
333    // Create model
334    let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337    // Create optimizer
338    let mut optimizer = Adam::with_learning_rate(0.01);
339    optimizer.add_parameter(&weight);
340    optimizer.add_parameter(&bias);
341
342    // Training data
343    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346    // Training loop with comprehensive monitoring
347    let num_epochs = 30;
348    let mut losses = Vec::new();
349    let mut weight_history = Vec::new();
350    let mut bias_history = Vec::new();
351
352    for epoch in 0..num_epochs {
353        // Forward pass
354        let y_pred = x_data.matmul(&weight) + &bias;
355        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357        // Backward pass
358        loss.backward(None);
359
360        // Optimizer step
361        optimizer.step(&mut [&mut weight, &mut bias]);
362        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364        // Record history
365        losses.push(loss.value());
366        weight_history.push(weight.value());
367        bias_history.push(bias.value());
368
369        // Print detailed monitoring
370        if epoch % 5 == 0 || epoch == num_epochs - 1 {
371            println!(
372                "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373                epoch,
374                loss.value(),
375                weight.value(),
376                bias.value()
377            );
378        }
379    }
380
381    // Analyze training progression
382    println!("\nTraining Analysis:");
383    println!("  Initial loss: {:.6}", losses[0]);
384    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
385    println!(
386        "  Loss reduction: {:.1}%",
387        (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388    );
389
390    // Compute statistics
391    let loss_mean = compute_mean(&losses);
392    let loss_std = compute_std(&losses);
393    let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394    let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396    println!("  Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397    println!("  Weight change: {:.6}", weight_change);
398    println!("  Bias change: {:.6}", bias_change);
399    println!("  Final weight norm: {:.6}", weight.norm().value());
400    println!("  Final bias: {:.6}", bias.value());
401
402    Ok(())
403}
More examples
examples/getting_started/serialization_basics.rs (line 126)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}
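As the struct-level docs note, saved parameter states must be re-linked after deserialization in the same order they were originally added, with matching shapes. The validation that implies can be sketched like this (a hypothetical helper, not the crate's `relink_parameters`):

```rust
/// Saved per-parameter record: insertion order is the slice order, and each
/// entry carries the shape it was saved with. Hypothetical layout.
struct SavedState {
    shape: Vec<usize>,
}

/// Validate that new parameters arrive in the saved order with matching
/// shapes; returns Err describing the first mismatch.
fn relink(saved: &[SavedState], new_shapes: &[Vec<usize>]) -> Result<(), String> {
    if saved.len() != new_shapes.len() {
        return Err(format!(
            "expected {} parameters, got {}",
            saved.len(),
            new_shapes.len()
        ));
    }
    for (i, (s, n)) in saved.iter().zip(new_shapes).enumerate() {
        if &s.shape != n {
            return Err(format!(
                "parameter {}: saved shape {:?} != new shape {:?}",
                i, s.shape, n
            ));
        }
    }
    Ok(())
}

fn main() {
    let saved = [SavedState { shape: vec![2, 2] }, SavedState { shape: vec![2] }];
    assert!(relink(&saved, &[vec![2, 2], vec![2]]).is_ok());
    // Re-linking in the wrong order fails shape validation.
    assert!(relink(&saved, &[vec![2], vec![2, 2]]).is_err());
}
```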
166
167/// Demonstrate format comparison and performance characteristics
168fn demonstrate_format_comparison() -> Result<(), Box<dyn std::error::Error>> {
169    println!("\n--- Format Comparison ---");
170
171    // Create a larger tensor for comparison
172    let tensor = Tensor::randn(vec![10, 10], Some(44));
173
174    // Save in both formats
175    tensor.save_json("temp_comparison.json")?;
176    tensor.save_binary("temp_comparison.bin")?;
177
178    // Compare file sizes
179    let json_size = fs::metadata("temp_comparison.json")?.len();
180    let binary_size = fs::metadata("temp_comparison.bin")?.len();
181
182    println!("JSON file size: {} bytes", json_size);
183    println!("Binary file size: {} bytes", binary_size);
184    println!(
185        "Compression ratio: {:.2}x",
186        json_size as f64 / binary_size as f64
187    );
188
189    // Load and verify both formats
190    let json_tensor = Tensor::load_json("temp_comparison.json")?;
191    let binary_tensor = Tensor::load_binary("temp_comparison.bin")?;
192
193    assert_eq!(tensor.shape().dims(), json_tensor.shape().dims());
194    assert_eq!(tensor.shape().dims(), binary_tensor.shape().dims());
195    assert_eq!(tensor.data(), json_tensor.data());
196    assert_eq!(tensor.data(), binary_tensor.data());
197
198    println!("Format comparison verification: PASSED");
199
200    Ok(())
201}
202
203/// Demonstrate a basic model checkpointing workflow
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205    println!("\n--- Model Checkpointing ---");
206
207    // Create a simple model (weights and bias)
208    let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209    let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211    // Create optimizer
212    let mut optimizer = Adam::with_learning_rate(0.01);
213    optimizer.add_parameter(&weights);
214    optimizer.add_parameter(&bias);
215
216    println!("Initial weights: {:?}", weights.data());
217    println!("Initial bias: {:?}", bias.data());
218
219    // Simulate training
220    for epoch in 0..5 {
221        let mut loss = weights.sum() + bias.sum();
222        loss.backward(None);
223        optimizer.step(&mut [&mut weights, &mut bias]);
224        optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226        if epoch % 2 == 0 {
227            // Save checkpoint
228            let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229            fs::create_dir_all(&checkpoint_dir)?;
230
231            weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232            bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233            optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235            println!("Saved checkpoint for epoch {}", epoch);
236        }
237    }
238
239    // Load from checkpoint
240    let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241    let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242    let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244    println!("Loaded weights: {:?}", loaded_weights.data());
245    println!("Loaded bias: {:?}", loaded_bias.data());
246    println!(
247        "Loaded optimizer learning rate: {}",
248        loaded_optimizer.learning_rate()
249    );
250
251    // Verify checkpoint integrity
252    assert_eq!(weights.shape().dims(), loaded_weights.shape().dims());
253    assert_eq!(bias.shape().dims(), loaded_bias.shape().dims());
254    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256    println!("Checkpointing verification: PASSED");
257
258    Ok(())
259}
examples/optimizers/adam_configurations.rs (line 97)
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85    println!("--- Default Adam Configuration ---");
86
87    // Create a simple regression problem: y = 2*x + 1
88    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91    // Create model parameters
92    let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95    // Create Adam optimizer with default configuration
96    let mut optimizer = Adam::new();
97    optimizer.add_parameter(&weight);
98    optimizer.add_parameter(&bias);
99
100    println!("Default Adam configuration:");
101    println!("  Learning rate: {}", optimizer.learning_rate());
102    println!("  Initial weight: {:.6}", weight.value());
103    println!("  Initial bias: {:.6}", bias.value());
104
105    // Training loop
106    let num_epochs = 50;
107    let mut losses = Vec::new();
108
109    for epoch in 0..num_epochs {
110        // Forward pass
111        let y_pred = x_data.matmul(&weight) + &bias;
112        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114        // Backward pass
115        loss.backward(None);
116
117        // Optimizer step
118        optimizer.step(&mut [&mut weight, &mut bias]);
119        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121        losses.push(loss.value());
122
123        if epoch % 10 == 0 || epoch == num_epochs - 1 {
124            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125        }
126    }
127
128    // Evaluate final model
129    let _final_predictions = x_data.matmul(&weight) + &bias;
130    println!("\nFinal model:");
131    println!("  Learned weight: {:.6} (target: 2.0)", weight.value());
132    println!("  Learned bias: {:.6} (target: 1.0)", bias.value());
133    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
134
135    Ok(())
136}
137
138/// Demonstrate learning rate comparison
139fn demonstrate_learning_rate_comparison() -> Result<(), Box<dyn std::error::Error>> {
140    println!("\n--- Learning Rate Comparison ---");
141
142    let learning_rates = [0.001, 0.01, 0.1];
143    let mut results = Vec::new();
144
145    for &lr in &learning_rates {
146        println!("\nTesting learning rate: {}", lr);
147
148        let stats = train_with_config(TrainingConfig {
149            learning_rate: lr,
150            ..Default::default()
151        })?;
152
153        results.push((lr, stats.clone()));
154
155        println!("  Final loss: {:.6}", stats.final_loss);
156        println!("  Convergence epoch: {}", stats.convergence_epoch);
157    }
158
159    // Compare results
160    println!("\nLearning Rate Comparison Summary:");
161    for (lr, stats) in &results {
162        println!(
163            "  LR={:6}: Loss={:.6}, Converged@{}",
164            lr, stats.final_loss, stats.convergence_epoch
165        );
166    }
167
168    Ok(())
169}
170
171/// Demonstrate weight decay comparison
172fn demonstrate_weight_decay_comparison() -> Result<(), Box<dyn std::error::Error>> {
173    println!("\n--- Weight Decay Comparison ---");
174
175    let weight_decays = [0.0, 0.001, 0.01];
176    let mut results = Vec::new();
177
178    for &wd in &weight_decays {
179        println!("\nTesting weight decay: {}", wd);
180
181        let stats = train_with_config(TrainingConfig {
182            weight_decay: wd,
183            ..Default::default()
184        })?;
185
186        results.push((wd, stats.clone()));
187
188        println!("  Final loss: {:.6}", stats.final_loss);
189        println!("  Final weight norm: {:.6}", stats.weight_norm);
190    }
191
192    // Compare results
193    println!("\nWeight Decay Comparison Summary:");
194    for (wd, stats) in &results {
195        println!(
196            "  WD={:6}: Loss={:.6}, Weight Norm={:.6}",
197            wd, stats.final_loss, stats.weight_norm
198        );
199    }
200
201    Ok(())
202}
203
204/// Demonstrate beta parameter tuning
205fn demonstrate_beta_parameter_tuning() -> Result<(), Box<dyn std::error::Error>> {
206    println!("\n--- Beta Parameter Tuning ---");
207
208    let beta_configs = [
209        (0.9, 0.999),  // Default
210        (0.8, 0.999),  // More aggressive momentum
211        (0.95, 0.999), // Less aggressive momentum
212        (0.9, 0.99),   // Faster second moment decay
213    ];
214
215    let mut results = Vec::new();
216
217    for (i, (beta1, beta2)) in beta_configs.iter().enumerate() {
218        println!(
219            "\nTesting beta configuration {}: beta1={}, beta2={}",
220            i + 1,
221            beta1,
222            beta2
223        );
224
225        let config = TrainingConfig {
226            beta1: *beta1,
227            beta2: *beta2,
228            ..Default::default()
229        };
230
231        let stats = train_with_config(config)?;
232        results.push(((*beta1, *beta2), stats.clone()));
233
234        println!("  Final loss: {:.6}", stats.final_loss);
235        println!("  Convergence epoch: {}", stats.convergence_epoch);
236    }
237
238    // Compare results
239    println!("\nBeta Parameter Comparison Summary:");
240    for ((beta1, beta2), stats) in &results {
241        println!(
242            "  B1={:4}, B2={:5}: Loss={:.6}, Converged@{}",
243            beta1, beta2, stats.final_loss, stats.convergence_epoch
244        );
245    }
246
247    Ok(())
248}
249
250/// Demonstrate configuration benchmarking
251fn demonstrate_configuration_benchmarking() -> Result<(), Box<dyn std::error::Error>> {
252    println!("\n--- Configuration Benchmarking ---");
253
254    // Define configurations to benchmark
255    let configs = vec![
256        (
257            "Conservative",
258            TrainingConfig {
259                learning_rate: 0.001,
260                weight_decay: 0.001,
261                beta1: 0.95,
262                ..Default::default()
263            },
264        ),
265        (
266            "Balanced",
267            TrainingConfig {
268                learning_rate: 0.01,
269                weight_decay: 0.0,
270                beta1: 0.9,
271                ..Default::default()
272            },
273        ),
274        (
275            "Aggressive",
276            TrainingConfig {
277                learning_rate: 0.1,
278                weight_decay: 0.0,
279                beta1: 0.8,
280                ..Default::default()
281            },
282        ),
283    ];
284
285    let mut benchmark_results = Vec::new();
286
287    for (name, config) in configs {
288        println!("\nBenchmarking {} configuration:", name);
289
290        let start_time = std::time::Instant::now();
291        let stats = train_with_config(config.clone())?;
292        let elapsed = start_time.elapsed();
293
294        println!("  Training time: {:.2}ms", elapsed.as_secs_f64() * 1000.0);
295        println!("  Final loss: {:.6}", stats.final_loss);
296        println!("  Convergence: {} epochs", stats.convergence_epoch);
297
298        benchmark_results.push((name.to_string(), stats, elapsed));
299    }
300
301    // Summary
302    println!("\nBenchmarking Summary:");
303    for (name, stats, elapsed) in &benchmark_results {
304        println!(
305            "  {:12}: Loss={:.6}, Time={:4}ms, Converged@{}",
306            name,
307            stats.final_loss,
308            elapsed.as_millis(),
309            stats.convergence_epoch
310        );
311    }
312
313    Ok(())
314}
315
316/// Helper function to train with specific configuration
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318    // Create training data
319    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322    // Create model parameters
323    let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326    // Create optimizer with custom configuration
327    let adam_config = AdamConfig {
328        learning_rate: config.learning_rate,
329        beta1: config.beta1,
330        beta2: config.beta2,
331        eps: 1e-8,
332        weight_decay: config.weight_decay,
333        amsgrad: false,
334    };
335
336    let mut optimizer = Adam::with_config(adam_config);
337    optimizer.add_parameter(&weight);
338    optimizer.add_parameter(&bias);
339
340    // Training loop
341    let mut losses = Vec::new();
342    let mut convergence_epoch = config.epochs;
343
344    for epoch in 0..config.epochs {
345        // Forward pass
346        let y_pred = x_data.matmul(&weight) + &bias;
347        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349        // Backward pass
350        loss.backward(None);
351
352        // Optimizer step
353        optimizer.step(&mut [&mut weight, &mut bias]);
354        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356        let loss_value = loss.value();
357        losses.push(loss_value);
358
359        // Check for convergence (loss < 0.01)
360        if loss_value < 0.01 && convergence_epoch == config.epochs {
361            convergence_epoch = epoch;
362        }
363    }
364
365    Ok(TrainingStats {
366        config,
367        final_loss: losses[losses.len() - 1],
368        loss_history: losses,
369        convergence_epoch,
370        weight_norm: weight.norm().value(),
371    })
372}
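
The `AdamConfig` fields used in `train_with_config` map directly onto the Adam update rule. As a reference, here is a minimal self-contained sketch of one bias-corrected Adam step on plain `f32` buffers (it ignores `weight_decay` and `amsgrad`, and illustrates the math only; it is not the crate's implementation):

```rust
// One bias-corrected Adam update over a parameter slice.
// m/v are the per-parameter momentum and velocity buffers the
// optimizer maintains; t is the 1-based step count.
fn adam_step(
    param: &mut [f32],
    grad: &[f32],
    m: &mut [f32],
    v: &mut [f32],
    t: u32,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
) {
    // Bias-correction terms: 1 - beta^t
    let bc1 = 1.0 - beta1.powi(t as i32);
    let bc2 = 1.0 - beta2.powi(t as i32);
    for i in 0..param.len() {
        // Exponential moving averages of the gradient and its square
        m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
        v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
        // Bias-corrected estimates
        let m_hat = m[i] / bc1;
        let v_hat = v[i] / bc2;
        // Parameter update
        param[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}

fn main() {
    let mut p = [1.0f32, -2.0];
    let g = [0.5f32, -0.25];
    let mut m = [0.0f32; 2];
    let mut v = [0.0f32; 2];
    for t in 1..=3 {
        adam_step(&mut p, &g, &mut m, &mut v, t, 0.001, 0.9, 0.999, 1e-8);
    }
    println!("params after 3 steps: {:?}", p);
}
```

Because of bias correction, the very first step moves each parameter by roughly `lr * sign(grad)` regardless of gradient magnitude, which is why Adam's early progress is insensitive to gradient scale.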
examples/optimizers/learning_rate_scheduling.rs (line 333)
319fn train_with_scheduler(
320    scheduler: &mut dyn LearningRateScheduler,
321    num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323    // Create training data: y = 2*x + 1
324    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327    // Create model parameters
328    let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331    // Create optimizer with initial learning rate
332    let mut optimizer = Adam::with_learning_rate(0.05);
333    optimizer.add_parameter(&weight);
334    optimizer.add_parameter(&bias);
335
336    // Training loop
337    let mut losses = Vec::new();
338    let mut lr_history = Vec::new();
339    let mut convergence_epoch = num_epochs;
340
341    for epoch in 0..num_epochs {
342        // Forward pass
343        let y_pred = x_data.matmul(&weight) + &bias;
344        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346        // Backward pass
347        loss.backward(None);
348
349        // Update learning rate using scheduler
350        let current_lr = optimizer.learning_rate();
351        let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353        if (new_lr - current_lr).abs() > 1e-8 {
354            optimizer.set_learning_rate(new_lr);
355        }
356
357        // Optimizer step
358        optimizer.step(&mut [&mut weight, &mut bias]);
359        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361        let loss_value = loss.value();
362        losses.push(loss_value);
363        lr_history.push(new_lr);
364
365        // Check for convergence
366        if loss_value < 0.01 && convergence_epoch == num_epochs {
367            convergence_epoch = epoch;
368        }
369    }
370
371    Ok(TrainingStats {
372        scheduler_name: scheduler.name().to_string(),
373        final_loss: losses[losses.len() - 1],
374        lr_history,
375        loss_history: losses,
376        convergence_epoch,
377    })
378}
examples/neural_networks/feedforward_network.rs (line 467)
431fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
432    println!("\n--- Training Workflow ---");
433
434    // Create a simple classification network
435    let config = FeedForwardConfig {
436        input_size: 2,
437        hidden_sizes: vec![4, 3],
438        output_size: 1,
439        use_bias: true,
440    };
441    let mut network = FeedForwardNetwork::new(config, Some(46));
442
443    println!("Training network: 2 -> [4, 3] -> 1");
444
445    // Create simple binary classification data: XOR problem
446    let x_data = Tensor::from_slice(
447        &[
448            0.0, 0.0, // -> 0
449            0.0, 1.0, // -> 1
450            1.0, 0.0, // -> 1
451            1.0, 1.0, // -> 0
452        ],
453        vec![4, 2],
454    )
455    .unwrap();
456
457    let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
458
459    println!("Training on XOR problem:");
460    println!("  Input shape: {:?}", x_data.shape().dims());
461    println!("  Target shape: {:?}", y_true.shape().dims());
462
463    // Create optimizer
464    let mut optimizer = Adam::with_learning_rate(0.1);
465    let params = network.parameters();
466    for param in &params {
467        optimizer.add_parameter(param);
468    }
469
470    // Training loop
471    let num_epochs = 50;
472    let mut losses = Vec::new();
473
474    for epoch in 0..num_epochs {
475        // Forward pass
476        let y_pred = network.forward(&x_data);
477
478        // Compute loss: MSE
479        let diff = y_pred.sub_tensor(&y_true);
480        let mut loss = diff.pow_scalar(2.0).mean();
481
482        // Backward pass
483        loss.backward(None);
484
485        // Optimizer step and zero grad
486        let mut params = network.parameters();
487        optimizer.step(&mut params);
488        optimizer.zero_grad(&mut params);
489
490        losses.push(loss.value());
491
492        // Print progress
493        if epoch % 10 == 0 || epoch == num_epochs - 1 {
494            println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
495        }
496    }
497
498    // Test final model
499    let final_predictions = network.forward_no_grad(&x_data);
500    println!("\nFinal predictions vs targets:");
501    for i in 0..4 {
502        let pred = final_predictions.data()[i];
503        let target = y_true.data()[i];
504        let input_x = x_data.data()[i * 2];
505        let input_y = x_data.data()[i * 2 + 1];
506        println!(
507            "  [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
508            input_x,
509            input_y,
510            pred,
511            target,
512            (pred - target).abs()
513        );
514    }
515
516    Ok(())
517}
518
519/// Demonstrate comprehensive training with 100+ steps
520fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
521    println!("\n--- Comprehensive Training (100+ Steps) ---");
522
523    // Create a regression network
524    let config = FeedForwardConfig {
525        input_size: 3,
526        hidden_sizes: vec![8, 6, 4],
527        output_size: 2,
528        use_bias: true,
529    };
530    let mut network = FeedForwardNetwork::new(config, Some(47));
531
532    println!("Network architecture: 3 -> [8, 6, 4] -> 2");
533    println!("Total parameters: {}", network.parameter_count());
534
535    // Create synthetic regression data
536    // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
537    let num_samples = 32;
538    let mut x_vec = Vec::new();
539    let mut y_vec = Vec::new();
540
541    for i in 0..num_samples {
542        let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
543        let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
544        let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
545
546        let y1 = x1 + 2.0 * x2 - x3;
547        let y2 = x1 * x2 + x3;
548
549        x_vec.extend_from_slice(&[x1, x2, x3]);
550        y_vec.extend_from_slice(&[y1, y2]);
551    }
552
553    let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
554    let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
555
556    println!("Training data:");
557    println!("  {} samples", num_samples);
558    println!("  Input shape: {:?}", x_data.shape().dims());
559    println!("  Target shape: {:?}", y_true.shape().dims());
560
561    // Create optimizer with learning rate scheduling
562    let mut optimizer = Adam::with_learning_rate(0.01);
563    let params = network.parameters();
564    for param in &params {
565        optimizer.add_parameter(param);
566    }
567
568    // Comprehensive training loop (150 epochs)
569    let num_epochs = 150;
570    let mut losses = Vec::new();
571    let mut best_loss = f32::INFINITY;
572    let mut patience_counter = 0;
573    let patience = 20;
574
575    println!("Starting comprehensive training...");
576
577    for epoch in 0..num_epochs {
578        // Forward pass
579        let y_pred = network.forward(&x_data);
580
581        // Compute loss: MSE
582        let diff = y_pred.sub_tensor(&y_true);
583        let mut loss = diff.pow_scalar(2.0).mean();
584
585        // Backward pass
586        loss.backward(None);
587
588        // Optimizer step and zero grad
589        let mut params = network.parameters();
590        optimizer.step(&mut params);
591        optimizer.zero_grad(&mut params);
592
593        let current_loss = loss.value();
594        losses.push(current_loss);
595
596        // Learning rate scheduling
597        if epoch > 0 && epoch % 30 == 0 {
598            let new_lr = optimizer.learning_rate() * 0.8;
599            optimizer.set_learning_rate(new_lr);
600            println!("  Reduced learning rate to {:.4}", new_lr);
601        }
602
603        // Early stopping logic
604        if current_loss < best_loss {
605            best_loss = current_loss;
606            patience_counter = 0;
607        } else {
608            patience_counter += 1;
609        }
610
611        // Print progress
612        if epoch % 25 == 0 || epoch == num_epochs - 1 {
613            println!(
614                "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
615                epoch,
616                current_loss,
617                optimizer.learning_rate(),
618                best_loss
619            );
620        }
621
622        // Early stopping
623        if patience_counter >= patience && epoch > 50 {
624            println!("Early stopping at epoch {} (patience exceeded)", epoch);
625            break;
626        }
627    }
628
629    // Final evaluation
630    let final_predictions = network.forward_no_grad(&x_data);
631
632    // Compute final metrics
633    let final_loss = losses[losses.len() - 1];
634    let initial_loss = losses[0];
635    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
636
637    println!("\nTraining completed!");
638    println!("  Initial loss: {:.6}", initial_loss);
639    println!("  Final loss: {:.6}", final_loss);
640    println!("  Best loss: {:.6}", best_loss);
641    println!("  Loss reduction: {:.1}%", loss_reduction);
642    println!("  Final learning rate: {:.4}", optimizer.learning_rate());
643
644    // Sample predictions analysis
645    println!("\nSample predictions (first 5):");
646    for i in 0..5.min(num_samples) {
647        let pred1 = final_predictions.data()[i * 2];
648        let pred2 = final_predictions.data()[i * 2 + 1];
649        let true1 = y_true.data()[i * 2];
650        let true2 = y_true.data()[i * 2 + 1];
651
652        println!(
653            "  Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
654            i + 1,
655            pred1,
656            pred2,
657            true1,
658            true2,
659            (pred1 - true1).abs(),
660            (pred2 - true2).abs()
661        );
662    }
663
664    Ok(())
665}
666
667/// Demonstrate network serialization
668fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
669    println!("\n--- Network Serialization ---");
670
671    // Create and train a network
672    let config = FeedForwardConfig {
673        input_size: 2,
674        hidden_sizes: vec![4, 2],
675        output_size: 1,
676        use_bias: true,
677    };
678    let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
679
680    // Quick training
681    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
682    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
683
684    let mut optimizer = Adam::with_learning_rate(0.01);
685    let params = original_network.parameters();
686    for param in &params {
687        optimizer.add_parameter(param);
688    }
689
690    for _ in 0..20 {
691        let y_pred = original_network.forward(&x_data);
692        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
693        loss.backward(None);
694
695        let mut params = original_network.parameters();
696        optimizer.step(&mut params);
697        optimizer.zero_grad(&mut params);
698    }
699
700    // Test original network
701    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
702    let original_output = original_network.forward_no_grad(&test_input);
703
704    println!("Original network output: {:?}", original_output.data());
705
706    // Save network
707    original_network.save_json("temp_feedforward_network")?;
708
709    // Load network
710    let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
711    let loaded_output = loaded_network.forward_no_grad(&test_input);
712
713    println!("Loaded network output: {:?}", loaded_output.data());
714
715    // Verify consistency
716    let match_check = original_output
717        .data()
718        .iter()
719        .zip(loaded_output.data().iter())
720        .all(|(a, b)| (a - b).abs() < 1e-6);
721
722    println!(
723        "Serialization verification: {}",
724        if match_check { "PASSED" } else { "FAILED" }
725    );
726
727    Ok(())
728}
examples/neural_networks/basic_linear_layer.rs (line 256)
218fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
219    println!("\n--- Training Loop ---");
220
221    // Create layer and training data
222    let mut layer = LinearLayer::new(2, 1, Some(45));
223
224    // Simple regression task: y = 2*x1 + 3*x2 + 1
225    let x_data = Tensor::from_slice(
226        &[
227            1.0, 1.0, // x1=1, x2=1 -> y=6
228            2.0, 1.0, // x1=2, x2=1 -> y=8
229            1.0, 2.0, // x1=1, x2=2 -> y=9
230            2.0, 2.0, // x1=2, x2=2 -> y=11
231        ],
232        vec![4, 2],
233    )
234    .unwrap();
235
236    let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
237
238    println!("Training data:");
239    println!("  X shape: {:?}", x_data.shape().dims());
240    println!("  Y shape: {:?}", y_true.shape().dims());
241    println!("  Target function: y = 2*x1 + 3*x2 + 1");
242
243    // Create optimizer
244    let config = AdamConfig {
245        learning_rate: 0.01,
246        beta1: 0.9,
247        beta2: 0.999,
248        eps: 1e-8,
249        weight_decay: 0.0,
250        amsgrad: false,
251    };
252
253    let mut optimizer = Adam::with_config(config);
254    let params = layer.parameters();
255    for param in &params {
256        optimizer.add_parameter(param);
257    }
258
259    println!("Optimizer setup complete. Starting training...");
260
261    // Training loop
262    let num_epochs = 100;
263    let mut losses = Vec::new();
264
265    for epoch in 0..num_epochs {
266        // Forward pass
267        let y_pred = layer.forward(&x_data);
268
269        // Compute loss: MSE
270        let diff = y_pred.sub_tensor(&y_true);
271        let mut loss = diff.pow_scalar(2.0).mean();
272
273        // Backward pass
274        loss.backward(None);
275
276        // Optimizer step
277        let mut params = layer.parameters();
278        optimizer.step(&mut params);
279        optimizer.zero_grad(&mut params);
280
281        losses.push(loss.value());
282
283        // Print progress
284        if epoch % 20 == 0 || epoch == num_epochs - 1 {
285            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
286        }
287    }
288
289    // Evaluate final model
290    let final_predictions = layer.forward_no_grad(&x_data);
291
292    println!("\nFinal model evaluation:");
293    println!("  Learned weights: {:?}", layer.weight.data());
294    println!("  Learned bias: {:?}", layer.bias.data());
295    println!("  Target weights: [2.0, 3.0]");
296    println!("  Target bias: [1.0]");
297
298    println!("  Predictions vs True:");
299    for i in 0..4 {
300        let pred = final_predictions.data()[i];
301        let true_val = y_true.data()[i];
302        println!(
303            "    Sample {}: pred={:.3}, true={:.1}, error={:.3}",
304            i + 1,
305            pred,
306            true_val,
307            (pred - true_val).abs()
308        );
309    }
310
311    // Training analysis
312    let initial_loss = losses[0];
313    let final_loss = losses[losses.len() - 1];
314    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
315
316    println!("\nTraining Analysis:");
317    println!("  Initial loss: {:.6}", initial_loss);
318    println!("  Final loss: {:.6}", final_loss);
319    println!("  Loss reduction: {:.1}%", loss_reduction);
320
321    Ok(())
322}
323
324/// Demonstrate single vs batch inference
325fn demonstrate_single_vs_batch_inference() {
326    println!("\n--- Single vs Batch Inference ---");
327
328    let layer = LinearLayer::new(4, 3, Some(46));
329
330    // Single inference
331    println!("Single inference:");
332    let single_input = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![1, 4]).unwrap();
333    let single_output = layer.forward_no_grad(&single_input);
334    println!("  Input shape: {:?}", single_input.shape().dims());
335    println!("  Output shape: {:?}", single_output.shape().dims());
336    println!("  Output: {:?}", single_output.data());
337
338    // Batch inference
339    println!("Batch inference:");
340    let batch_input = Tensor::from_slice(
341        &[
342            1.0, 2.0, 3.0, 4.0, // Sample 1
343            5.0, 6.0, 7.0, 8.0, // Sample 2
344            9.0, 10.0, 11.0, 12.0, // Sample 3
345        ],
346        vec![3, 4],
347    )
348    .unwrap();
349    let batch_output = layer.forward_no_grad(&batch_input);
350    println!("  Input shape: {:?}", batch_input.shape().dims());
351    println!("  Output shape: {:?}", batch_output.shape().dims());
352
353    // Verify batch consistency - first sample should match single inference
354    let _first_batch_sample = batch_output.view(vec![3, 3]); // Same [3, 3] shape; unused, data is read directly below
355    let first_sample_data = &batch_output.data()[0..3]; // First 3 elements
356    let single_sample_data = single_output.data();
357
358    println!("Consistency check:");
359    println!("  Single output: {:?}", single_sample_data);
360    println!("  First batch sample: {:?}", first_sample_data);
361    println!(
362        "  Match: {}",
363        single_sample_data
364            .iter()
365            .zip(first_sample_data.iter())
366            .all(|(a, b)| (a - b).abs() < 1e-6)
367    );
368}
369
370/// Demonstrate serialization and loading
371fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
372    println!("\n--- Serialization ---");
373
374    // Create and train a simple layer
375    let mut original_layer = LinearLayer::new(2, 1, Some(47));
376
377    // Simple training data
378    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
379    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
380
381    let mut optimizer = Adam::with_learning_rate(0.01);
382    let params = original_layer.parameters();
383    for param in &params {
384        optimizer.add_parameter(param);
385    }
386
387    // Train for a few epochs
388    for _ in 0..10 {
389        let y_pred = original_layer.forward(&x_data);
390        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
391        loss.backward(None);
392
393        let mut params = original_layer.parameters();
394        optimizer.step(&mut params);
395        optimizer.zero_grad(&mut params);
396    }
397
398    println!("Original layer trained");
399    println!("  Weight: {:?}", original_layer.weight.data());
400    println!("  Bias: {:?}", original_layer.bias.data());
401
402    // Save layer
403    original_layer.save_json("temp_linear_layer")?;
404
405    // Load layer
406    let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
407
408    println!("Loaded layer");
409    println!("  Weight: {:?}", loaded_layer.weight.data());
410    println!("  Bias: {:?}", loaded_layer.bias.data());
411
412    // Verify consistency
413    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
414    let original_output = original_layer.forward_no_grad(&test_input);
415    let loaded_output = loaded_layer.forward_no_grad(&test_input);
416
417    println!("Consistency check:");
418    println!("  Original output: {:?}", original_output.data());
419    println!("  Loaded output: {:?}", loaded_output.data());
420    println!(
421        "  Match: {}",
422        original_output
423            .data()
424            .iter()
425            .zip(loaded_output.data().iter())
426            .all(|(a, b)| (a - b).abs() < 1e-6)
427    );
428
429    println!("Serialization verification: PASSED");
430
431    Ok(())
432}
Source

pub fn add_parameters(&mut self, parameters: &[&Tensor])

Add multiple parameters to the optimizer

Links multiple parameters to the optimizer by creating parameter states indexed by each tensor’s ID. All parameters must have requires_grad set to true.

§Arguments
  • parameters - Slice of references to tensors to link
§Panics

Panics if any parameter does not have requires_grad set to true

Source

pub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool

Remove a parameter from the optimizer

Unlinks a parameter by removing its state from the optimizer. The parameter ID is used for identification.

§Arguments
  • parameter - Reference to the tensor to unlink
§Returns

True if the parameter was linked and removed, false if it was not linked

Source

pub fn clear_states(&mut self)

Remove all parameter states from the optimizer

Clears all parameter states, effectively unlinking all parameters. This is useful for resetting the optimizer or preparing for parameter re-linking.

Source

pub fn is_parameter_linked(&self, parameter: &Tensor) -> bool

Check if a parameter is linked to the optimizer

Returns true if the parameter has an associated state in the optimizer.

§Arguments
  • parameter - Reference to the tensor to check
§Returns

True if the parameter is linked, false otherwise

Source

pub fn parameter_count(&self) -> usize

Get the number of linked parameters

Returns the count of parameters currently linked to the optimizer.

§Returns

Number of linked parameters

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 78)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims(),
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims(),
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
examples/getting_started/serialization_basics.rs (line 131)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}

Source

pub fn relink_parameters(&mut self, parameters: &[&Tensor]) -> SerializationResult<()>

Re-link parameters to saved optimizer states in chronological order

After deserializing an optimizer, use this method to restore saved parameter states to new tensors. Parameters must be provided in the same chronological order they were originally added to the optimizer. Shape validation ensures parameter compatibility.

§Arguments
  • parameters - Slice of parameter references in chronological order
§Returns

Result indicating success or failure with detailed error message

§Panics

Panics if any parameter does not have requires_grad set to true

Source

pub fn config(&self) -> &AdamConfig

Get the current optimizer configuration

Returns a reference to the current configuration, allowing inspection of all hyperparameters without modification.

§Returns

Reference to the current Adam configuration

Trait Implementations§

Source§

impl Default for Adam

Source§

fn default() -> Self

Returns the “default value” for a type. Read more
Source§

impl FromFieldValue for Adam

Source§

fn from_field_value( value: FieldValue, field_name: &str, ) -> SerializationResult<Self>

Create Adam from FieldValue

§Arguments
  • value - FieldValue containing optimizer data
  • field_name - Name of the field being deserialized (for error messages)
§Returns

Reconstructed Adam instance or error if deserialization fails

Source§

impl Optimizer for Adam

Source§

fn step(&mut self, parameters: &mut [&mut Tensor])

Perform a single optimization step

Updates all provided parameters based on their accumulated gradients using the Adam algorithm. Each parameter is updated according to the Adam update rule with bias correction, and with the AMSGrad variant when enabled. All parameters must be linked to the optimizer before calling this method.

§Arguments
  • parameters - Mutable slice of parameter references to update
§Thread Safety

This method is thread-safe as it takes mutable references to parameters, ensuring exclusive access during updates.

§Performance
  • Uses SIMD optimization (AVX2) when available for 8-wide f32 vectorization
  • Processes parameters in sequence for optimal cache usage
  • Maintains per-parameter state for momentum and velocity estimates
§Panics

Panics if any parameter is not linked to the optimizer

Source§

fn zero_grad(&mut self, parameters: &mut [&mut Tensor])

Zero out all parameter gradients

Clears accumulated gradients for all provided parameters. This should be called before each backward pass to prevent gradient accumulation across multiple forward/backward passes. Also clears the global autograd gradient map.

§Arguments
  • parameters - Mutable slice of parameter references to clear gradients for
§Performance
  • Efficiently clears gradients using optimized tensor operations
  • Clears both per-tensor gradients and global autograd state
  • Thread-safe as it takes mutable references to parameters
Source§

fn learning_rate(&self) -> f32

Get the current learning rate

Returns the current learning rate used for parameter updates.

§Returns

Current learning rate as f32

Source§

fn set_learning_rate(&mut self, lr: f32)

Set the learning rate for all parameters

Updates the learning rate for all parameters in the optimizer. This allows dynamic learning rate scheduling during training.

§Arguments
  • lr - New learning rate value
Source§

impl Serializable for Adam

Source§

fn to_json(&self) -> SerializationResult<String>

Serialize the Adam optimizer to JSON format

This method converts the Adam optimizer into a human-readable JSON string representation that includes all optimizer state, configuration, parameter states, and step counts. The JSON format is suitable for debugging, configuration files, and cross-language interoperability.

§Returns

JSON string representation of the optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let json = optimizer.to_json().unwrap();
assert!(!json.is_empty());
Source§

fn from_json(json: &str) -> SerializationResult<Self>

Deserialize an Adam optimizer from JSON format

This method parses a JSON string and reconstructs an Adam optimizer with all saved state. Parameters must be re-linked after deserialization using add_parameter or relink_parameters.

§Arguments
  • json - JSON string containing serialized optimizer
§Returns

The deserialized optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§

fn to_binary(&self) -> SerializationResult<Vec<u8>>

Serialize the Adam optimizer to binary format

This method converts the optimizer into a compact binary representation optimized for storage and transmission. The binary format provides maximum performance and minimal file sizes compared to JSON.

§Returns

Binary representation of the optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let binary = optimizer.to_binary().unwrap();
assert!(!binary.is_empty());
Source§

fn from_binary(data: &[u8]) -> SerializationResult<Self>

Deserialize an Adam optimizer from binary format

This method parses binary data and reconstructs an Adam optimizer with all saved state. Parameters must be re-linked after deserialization using add_parameter or relink_parameters.

§Arguments
  • data - Binary data containing serialized optimizer
§Returns

The deserialized optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let binary = optimizer.to_binary().unwrap();
let loaded_optimizer = Adam::from_binary(&binary).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§

fn save<P: AsRef<Path>>( &self, path: P, format: Format, ) -> SerializationResult<()>

Save the object to a file in the specified format Read more
Source§

fn save_to_writer<W: Write>( &self, writer: &mut W, format: Format, ) -> SerializationResult<()>

Save the object to a writer in the specified format Read more
Source§

fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>

Load an object from a file in the specified format Read more
Source§

fn load_from_reader<R: Read>( reader: &mut R, format: Format, ) -> SerializationResult<Self>

Load an object from a reader in the specified format Read more
Source§

impl StructSerializable for Adam

Source§

fn to_serializer(&self) -> StructSerializer

Convert Adam to StructSerializer for serialization

Serializes all optimizer state including configuration, parameter states, and global step count. Parameter linking is not serialized and must be done after deserialization.

§Returns

StructSerializer containing all serializable optimizer state

Source§

fn from_deserializer( deserializer: &mut StructDeserializer, ) -> SerializationResult<Self>

Create Adam from StructDeserializer

Reconstructs Adam optimizer from serialized state. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • deserializer - StructDeserializer containing optimizer data
§Returns

Reconstructed Adam instance without parameter links, or error if deserialization fails

Source§

fn save_json<P: AsRef<Path>>(&self, path: P) -> SerializationResult<()>

Saves the struct to a JSON file Read more
Source§

fn save_binary<P: AsRef<Path>>(&self, path: P) -> SerializationResult<()>

Saves the struct to a binary file Read more
Source§

fn load_json<P: AsRef<Path>>(path: P) -> SerializationResult<Self>

Loads the struct from a JSON file Read more
Source§

fn load_binary<P: AsRef<Path>>(path: P) -> SerializationResult<Self>

Loads the struct from a binary file Read more
Source§

fn to_json(&self) -> SerializationResult<String>

Converts the struct to a JSON string Read more
Source§

fn to_binary(&self) -> SerializationResult<Vec<u8>>

Converts the struct to binary data Read more
Source§

fn from_json(json: &str) -> SerializationResult<Self>

Creates the struct from a JSON string Read more
Source§

fn from_binary(data: &[u8]) -> SerializationResult<Self>

Creates the struct from binary data Read more

Auto Trait Implementations§

§

impl Freeze for Adam

§

impl RefUnwindSafe for Adam

§

impl Send for Adam

§

impl Sync for Adam

§

impl Unpin for Adam

§

impl UnwindSafe for Adam

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToFieldValue for T
where T: Serializable,

Source§

fn to_field_value(&self) -> FieldValue

Converts the value to a FieldValue for serialization Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.