Adam

Struct Adam

pub struct Adam { /* private fields */ }

Adam optimizer for neural network parameter optimization

Implements the Adam optimization algorithm with a PyTorch-compatible interface. The optimizer maintains per-parameter state for momentum and velocity estimates, giving each parameter an adaptive learning rate that improves convergence across diverse architectures and enables efficient neural network training.
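For reference, the textbook Adam update this optimizer follows can be sketched per parameter θ with gradient g_t at step t (β1, β2, α, and ε correspond to the beta1, beta2, learning_rate, and eps fields of AdamConfig; the exact placement of weight decay is an implementation detail to check against the source):

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t            % momentum (first moment)
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2          % velocity (second moment)
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}  % bias correction
\theta_t = \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```

With AMSGrad enabled, the denominator uses the running maximum of past second-moment estimates instead of the current one, which stabilizes convergence.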

§Usage Pattern

The optimizer uses ID-based parameter linking for maximum flexibility and thread safety:

  • Parameters are linked to the optimizer via add_parameter or add_parameters
  • The step method takes mutable references to parameters, guaranteeing exclusive access during updates
  • Parameter states are maintained by tensor ID, allowing for dynamic parameter management
  • Supports serialization and deserialization with parameter re-linking

§Dynamic Parameter Management

Parameters can be added, removed, or re-linked at runtime:

  • add_parameter: Link a single parameter
  • add_parameters: Link multiple parameters at once
  • unlink_parameter: Remove parameter state by ID
  • clear_states: Remove all parameter states
  • is_parameter_linked: Check if a parameter is linked

§Serialization Support

The optimizer supports full serialization and deserialization with state preservation:

  • Parameter states are saved with their shapes and insertion order for validation
  • After deserialization, use relink_parameters to restore saved states to new tensors
  • Parameters must be re-linked in the same chronological order they were originally added
  • Shape validation ensures consistency between saved and current parameters

§Features

  • ID-Based Parameter Linking: Dynamic parameter management via tensor IDs
  • Thread-Safe Step Method: Takes mutable references to parameters, ensuring exclusive access during updates
  • Per-Parameter State: Each parameter maintains its own momentum and velocity buffers
  • Bias Correction: Automatically corrects initialization bias in moment estimates
  • Weight Decay: Optional L2 regularization with efficient implementation
  • AMSGrad Support: Optional AMSGrad variant for improved convergence stability
  • SIMD Optimization: AVX2-accelerated parameter updates on supported x86-64 hardware
  • Full Serialization: Complete state persistence and restoration

§Thread Safety

This type is thread-safe and can be shared between threads. The step method takes mutable references to parameters, ensuring exclusive access during updates.

Implementations

impl Adam

pub fn saved_parameter_count(&self) -> usize

Get the number of saved parameter states for checkpoint validation

This method returns the count of parameter states currently stored in the optimizer, which is essential for validating checkpoint integrity and ensuring proper parameter re-linking after deserialization. The count includes all parameters that have been linked to the optimizer and have accumulated optimization state.

§Returns

Number of parameter states currently stored in the optimizer

§Usage Patterns
§Checkpoint Validation

After deserializing an optimizer, this method helps verify that the expected number of parameter states was saved and can guide the re-linking process.

§Training Resumption

When resuming training, compare this count with the number of parameters in your model to ensure checkpoint compatibility.

§State Management

Use this method to monitor optimizer state growth and memory usage during training with dynamic parameter addition.

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![10, 5]).with_requires_grad();
let bias = Tensor::zeros(vec![5]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
optimizer.add_parameter(&bias);

// Check parameter count before serialization
assert_eq!(optimizer.saved_parameter_count(), 2);

// Serialize and deserialize
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();

// Verify parameter count is preserved
assert_eq!(loaded_optimizer.saved_parameter_count(), 2);
§Performance
  • Time Complexity: O(1) - Direct access to internal state count
  • Memory Usage: No additional memory allocation
  • Thread Safety: Safe to call from multiple threads concurrently
impl Adam

pub fn new() -> Self

Create a new Adam optimizer with default configuration

Initializes an Adam optimizer with PyTorch-compatible default hyperparameters. Parameters must be linked separately using add_parameter or add_parameters.

§Returns

A new Adam optimizer instance with default hyperparameters

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 67)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims,
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims,
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
examples/optimizers/adam_configurations.rs (line 96)
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85    println!("--- Default Adam Configuration ---");
86
87    // Create a simple regression problem: y = 2*x + 1
88    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91    // Create model parameters
92    let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95    // Create Adam optimizer with default configuration
96    let mut optimizer = Adam::new();
97    optimizer.add_parameter(&weight);
98    optimizer.add_parameter(&bias);
99
100    println!("Default Adam configuration:");
101    println!("  Learning rate: {}", optimizer.learning_rate());
102    println!("  Initial weight: {:.6}", weight.value());
103    println!("  Initial bias: {:.6}", bias.value());
104
105    // Training loop
106    let num_epochs = 50;
107    let mut losses = Vec::new();
108
109    for epoch in 0..num_epochs {
110        // Forward pass
111        let y_pred = x_data.matmul(&weight) + &bias;
112        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114        // Backward pass
115        loss.backward(None);
116
117        // Optimizer step
118        optimizer.step(&mut [&mut weight, &mut bias]);
119        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121        losses.push(loss.value());
122
123        if epoch % 10 == 0 || epoch == num_epochs - 1 {
124            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125        }
126    }
127
128    // Evaluate final model
129    let _final_predictions = x_data.matmul(&weight) + &bias;
130    println!("\nFinal model:");
131    println!("  Learned weight: {:.6} (target: 2.0)", weight.value());
132    println!("  Learned bias: {:.6} (target: 1.0)", bias.value());
133    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
134
135    Ok(())
136}

pub fn with_config(config: AdamConfig) -> Self

Create a new Adam optimizer with custom configuration

Allows full control over all Adam hyperparameters for specialized training scenarios such as fine-tuning, transfer learning, or research applications. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • config - Adam configuration with custom hyperparameters
§Returns

A new Adam optimizer instance with the specified configuration

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 91)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims,
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims,
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
examples/getting_started/serialization_basics.rs (line 125)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}
examples/optimizers/adam_configurations.rs (line 336)
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318    // Create training data
319    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322    // Create model parameters
323    let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326    // Create optimizer with custom configuration
327    let adam_config = AdamConfig {
328        learning_rate: config.learning_rate,
329        beta1: config.beta1,
330        beta2: config.beta2,
331        eps: 1e-8,
332        weight_decay: config.weight_decay,
333        amsgrad: false,
334    };
335
336    let mut optimizer = Adam::with_config(adam_config);
337    optimizer.add_parameter(&weight);
338    optimizer.add_parameter(&bias);
339
340    // Training loop
341    let mut losses = Vec::new();
342    let mut convergence_epoch = config.epochs;
343
344    for epoch in 0..config.epochs {
345        // Forward pass
346        let y_pred = x_data.matmul(&weight) + &bias;
347        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349        // Backward pass
350        loss.backward(None);
351
352        // Optimizer step
353        optimizer.step(&mut [&mut weight, &mut bias]);
354        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356        let loss_value = loss.value();
357        losses.push(loss_value);
358
359        // Check for convergence (loss < 0.01)
360        if loss_value < 0.01 && convergence_epoch == config.epochs {
361            convergence_epoch = epoch;
362        }
363    }
364
365    Ok(TrainingStats {
366        config,
367        final_loss: losses[losses.len() - 1],
368        loss_history: losses,
369        convergence_epoch,
370        weight_norm: weight.norm().value(),
371    })
372}
examples/neural_networks/basic_linear_layer.rs (line 252)
217fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
218    println!("\n--- Training Loop ---");
219
220    // Create layer and training data
221    let mut layer = LinearLayer::new(2, 1, Some(45));
222
223    // Simple regression task: y = 2*x1 + 3*x2 + 1
224    let x_data = Tensor::from_slice(
225        &[
226            1.0, 1.0, // x1=1, x2=1 -> y=6
227            2.0, 1.0, // x1=2, x2=1 -> y=8
228            1.0, 2.0, // x1=1, x2=2 -> y=9
229            2.0, 2.0, // x1=2, x2=2 -> y=11
230        ],
231        vec![4, 2],
232    )
233    .unwrap();
234
235    let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
236
237    println!("Training data:");
238    println!("  X shape: {:?}", x_data.shape().dims);
239    println!("  Y shape: {:?}", y_true.shape().dims);
240    println!("  Target function: y = 2*x1 + 3*x2 + 1");
241
242    // Create optimizer
243    let config = AdamConfig {
244        learning_rate: 0.01,
245        beta1: 0.9,
246        beta2: 0.999,
247        eps: 1e-8,
248        weight_decay: 0.0,
249        amsgrad: false,
250    };
251
252    let mut optimizer = Adam::with_config(config);
253    let params = layer.parameters();
254    for param in &params {
255        optimizer.add_parameter(param);
256    }
257
258    println!("Optimizer setup complete. Starting training...");
259
260    // Training loop
261    let num_epochs = 100;
262    let mut losses = Vec::new();
263
264    for epoch in 0..num_epochs {
265        // Forward pass
266        let y_pred = layer.forward(&x_data);
267
268        // Compute loss: MSE
269        let diff = y_pred.sub_tensor(&y_true);
270        let mut loss = diff.pow_scalar(2.0).mean();
271
272        // Backward pass
273        loss.backward(None);
274
275        // Optimizer step
276        let mut params = layer.parameters();
277        optimizer.step(&mut params);
278        optimizer.zero_grad(&mut params);
279
280        losses.push(loss.value());
281
282        // Print progress
283        if epoch % 20 == 0 || epoch == num_epochs - 1 {
284            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
285        }
286    }
287
288    // Evaluate final model
289    let final_predictions = layer.forward_no_grad(&x_data);
290
291    println!("\nFinal model evaluation:");
292    println!("  Learned weights: {:?}", layer.weight.data());
293    println!("  Learned bias: {:?}", layer.bias.data());
294    println!("  Target weights: [2.0, 3.0]");
295    println!("  Target bias: [1.0]");
296
297    println!("  Predictions vs True:");
298    for i in 0..4 {
299        let pred = final_predictions.data()[i];
300        let true_val = y_true.data()[i];
301        println!(
302            "    Sample {}: pred={:.3}, true={:.1}, error={:.3}",
303            i + 1,
304            pred,
305            true_val,
306            (pred - true_val).abs()
307        );
308    }
309
310    // Training analysis
311    let initial_loss = losses[0];
312    let final_loss = losses[losses.len() - 1];
313    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
314
315    println!("\nTraining Analysis:");
316    println!("  Initial loss: {:.6}", initial_loss);
317    println!("  Final loss: {:.6}", final_loss);
318    println!("  Loss reduction: {:.1}%", loss_reduction);
319
320    Ok(())
321}

pub fn with_learning_rate(learning_rate: f32) -> Self

Create a new Adam optimizer with custom learning rate

A convenience constructor that allows setting only the learning rate while using default values for all other hyperparameters. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • learning_rate - Learning rate for optimization
§Returns

A new Adam optimizer instance with the specified learning rate and default values for all other hyperparameters

Examples found in repository
examples/optimizers/learning_rate_scheduling.rs (line 332)
319fn train_with_scheduler(
320    scheduler: &mut dyn LearningRateScheduler,
321    num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323    // Create training data: y = 2*x + 1
324    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327    // Create model parameters
328    let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331    // Create optimizer with initial learning rate
332    let mut optimizer = Adam::with_learning_rate(0.05);
333    optimizer.add_parameter(&weight);
334    optimizer.add_parameter(&bias);
335
336    // Training loop
337    let mut losses = Vec::new();
338    let mut lr_history = Vec::new();
339    let mut convergence_epoch = num_epochs;
340
341    for epoch in 0..num_epochs {
342        // Forward pass
343        let y_pred = x_data.matmul(&weight) + &bias;
344        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346        // Backward pass
347        loss.backward(None);
348
349        // Update learning rate using scheduler
350        let current_lr = optimizer.learning_rate();
351        let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353        if (new_lr - current_lr).abs() > 1e-8 {
354            optimizer.set_learning_rate(new_lr);
355        }
356
357        // Optimizer step
358        optimizer.step(&mut [&mut weight, &mut bias]);
359        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361        let loss_value = loss.value();
362        losses.push(loss_value);
363        lr_history.push(new_lr);
364
365        // Check for convergence
366        if loss_value < 0.01 && convergence_epoch == num_epochs {
367            convergence_epoch = epoch;
368        }
369    }
370
371    Ok(TrainingStats {
372        scheduler_name: scheduler.name().to_string(),
373        final_loss: losses[losses.len() - 1],
374        lr_history,
375        loss_history: losses,
376        convergence_epoch,
377    })
378}
examples/neural_networks/basic_linear_layer.rs (line 380)
370fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
371    println!("\n--- Serialization ---");
372
373    // Create and train a simple layer
374    let mut original_layer = LinearLayer::new(2, 1, Some(47));
375
376    // Simple training data
377    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
378    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
379
380    let mut optimizer = Adam::with_learning_rate(0.01);
381    let params = original_layer.parameters();
382    for param in &params {
383        optimizer.add_parameter(param);
384    }
385
386    // Train for a few epochs
387    for _ in 0..10 {
388        let y_pred = original_layer.forward(&x_data);
389        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
390        loss.backward(None);
391
392        let mut params = original_layer.parameters();
393        optimizer.step(&mut params);
394        optimizer.zero_grad(&mut params);
395    }
396
397    println!("Original layer trained");
398    println!("  Weight: {:?}", original_layer.weight.data());
399    println!("  Bias: {:?}", original_layer.bias.data());
400
401    // Save layer
402    original_layer.save_json("temp_linear_layer")?;
403
404    // Load layer
405    let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
406
407    println!("Loaded layer");
408    println!("  Weight: {:?}", loaded_layer.weight.data());
409    println!("  Bias: {:?}", loaded_layer.bias.data());
410
411    // Verify consistency
412    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
413    let original_output = original_layer.forward_no_grad(&test_input);
414    let loaded_output = loaded_layer.forward_no_grad(&test_input);
415
416    println!("Consistency check:");
417    println!("  Original output: {:?}", original_output.data());
418    println!("  Loaded output: {:?}", loaded_output.data());
419    println!(
420        "  Match: {}",
421        original_output
422            .data()
423            .iter()
424            .zip(loaded_output.data().iter())
425            .all(|(a, b)| (a - b).abs() < 1e-6)
426    );
427
428    println!("Serialization verification: PASSED");
429
430    Ok(())
431}
examples/getting_started/serialization_basics.rs (line 212)
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205    println!("\n--- Model Checkpointing ---");
206
207    // Create a simple model (weights and bias)
208    let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209    let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211    // Create optimizer
212    let mut optimizer = Adam::with_learning_rate(0.01);
213    optimizer.add_parameter(&weights);
214    optimizer.add_parameter(&bias);
215
216    println!("Initial weights: {:?}", weights.data());
217    println!("Initial bias: {:?}", bias.data());
218
219    // Simulate training
220    for epoch in 0..5 {
221        let mut loss = weights.sum() + bias.sum();
222        loss.backward(None);
223        optimizer.step(&mut [&mut weights, &mut bias]);
224        optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226        if epoch % 2 == 0 {
227            // Save checkpoint
228            let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229            fs::create_dir_all(&checkpoint_dir)?;
230
231            weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232            bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233            optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235            println!("Saved checkpoint for epoch {}", epoch);
236        }
237    }
238
239    // Load from checkpoint
240    let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241    let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242    let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244    println!("Loaded weights: {:?}", loaded_weights.data());
245    println!("Loaded bias: {:?}", loaded_bias.data());
246    println!(
247        "Loaded optimizer learning rate: {}",
248        loaded_optimizer.learning_rate()
249    );
250
251    // Verify checkpoint integrity
252    assert_eq!(weights.shape().dims, loaded_weights.shape().dims);
253    assert_eq!(bias.shape().dims, loaded_bias.shape().dims);
254    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256    println!("Checkpointing verification: PASSED");
257
258    Ok(())
259}
examples/getting_started/optimizer_basics.rs (line 113)
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106    println!("\n--- Linear Regression Training ---");
107
108    // Create model parameters
109    let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112    // Create optimizer
113    let mut optimizer = Adam::with_learning_rate(0.01);
114    optimizer.add_parameter(&weight);
115    optimizer.add_parameter(&bias);
116
117    // Create simple training data: y = 2*x + 1
118    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121    println!("Training data:");
122    println!("  X: {:?}", x_data.data());
123    println!("  Y: {:?}", y_true.data());
124    println!("  Target: y = 2*x + 1");
125
126    // Training loop
127    let num_epochs = 100;
128    let mut losses = Vec::new();
129
130    for epoch in 0..num_epochs {
131        // Forward pass: y_pred = x * weight + bias
132        let y_pred = x_data.matmul(&weight) + &bias;
133
134        // Compute loss: MSE
135        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137        // Backward pass
138        loss.backward(None);
139
140        // Optimizer step
141        optimizer.step(&mut [&mut weight, &mut bias]);
142        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144        losses.push(loss.value());
145
146        // Print progress every 20 epochs
147        if epoch % 20 == 0 || epoch == num_epochs - 1 {
148            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149        }
150    }
151
152    // Evaluate final model
153    let final_predictions = x_data.matmul(&weight) + &bias;
154    println!("\nFinal model evaluation:");
155    println!("  Learned weight: {:.6}", weight.value());
156    println!("  Learned bias: {:.6}", bias.value());
157    println!("  Predictions vs True:");
158
159    for i in 0..5 {
160        let x1 = x_data.data()[i];
161        let pred = final_predictions.data()[i];
162        let true_val = y_true.data()[i];
163        println!(
164            "    x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165            x1,
166            pred,
167            true_val,
168            (pred - true_val).abs()
169        );
170    }
171
172    Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177    println!("\n--- Advanced Training Patterns ---");
178
179    // Create a more complex model
180    let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181    let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183    // Create optimizer with different learning rate
184    let mut optimizer = Adam::with_learning_rate(0.005);
185    optimizer.add_parameter(&weight);
186    optimizer.add_parameter(&bias);
187
188    // Create training data: y = 2*x + [1, 3]
189    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190    let y_true = Tensor::from_slice(
191        &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192        vec![5, 2],
193    )
194    .unwrap();
195
196    println!("Advanced training with monitoring:");
197    println!("  Initial learning rate: {}", optimizer.learning_rate());
198
199    // Training loop with monitoring
200    let num_epochs = 50;
201    let mut losses = Vec::new();
202    let mut weight_norms = Vec::new();
203    let mut gradient_norms = Vec::new();
204
205    for epoch in 0..num_epochs {
206        // Forward pass
207        let y_pred = x_data.matmul(&weight) + &bias;
208        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210        // Backward pass
211        loss.backward(None);
212
213        // Compute gradient norm before optimizer step
214        let gradient_norm = weight.grad_by_value().unwrap().norm();
215
216        // Optimizer step
217        optimizer.step(&mut [&mut weight, &mut bias]);
218        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220        // Learning rate scheduling: reduce every 10 epochs
221        if epoch > 0 && epoch % 10 == 0 {
222            let current_lr = optimizer.learning_rate();
223            let new_lr = current_lr * 0.5;
224            optimizer.set_learning_rate(new_lr);
225            println!(
226                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227                epoch, current_lr, new_lr
228            );
229        }
230
231        // Record metrics
232        losses.push(loss.value());
233        weight_norms.push(weight.norm().value());
234        gradient_norms.push(gradient_norm.value());
235
236        // Print detailed progress
237        if epoch % 10 == 0 || epoch == num_epochs - 1 {
238            println!(
239                "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240                epoch,
241                loss.value(),
242                weight.norm().value(),
243                gradient_norm.value()
244            );
245        }
246    }
247
248    println!("Final learning rate: {}", optimizer.learning_rate());
249
250    // Analyze training progression
251    let initial_loss = losses[0];
252    let final_loss = losses[losses.len() - 1];
253    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255    println!("\nTraining Analysis:");
256    println!("  Initial loss: {:.6}", initial_loss);
257    println!("  Final loss: {:.6}", final_loss);
258    println!("  Loss reduction: {:.1}%", loss_reduction);
259    println!("  Final weight norm: {:.6}", weight.norm().value());
260    println!("  Final bias: {:?}", bias.data());
261
262    Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267    println!("\n--- Learning Rate Scheduling ---");
268
269    // Create simple model
270    let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273    // Create optimizer with high initial learning rate
274    let mut optimizer = Adam::with_learning_rate(0.1);
275    optimizer.add_parameter(&weight);
276    optimizer.add_parameter(&bias);
277
278    // Simple data
279    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280    let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282    println!("Initial learning rate: {}", optimizer.learning_rate());
283
284    // Training loop with learning rate scheduling
285    let num_epochs = 50;
286    let mut losses = Vec::new();
287
288    for epoch in 0..num_epochs {
289        // Forward pass
290        let y_pred = x_data.matmul(&weight) + &bias;
291        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293        // Backward pass
294        loss.backward(None);
295
296        // Optimizer step
297        optimizer.step(&mut [&mut weight, &mut bias]);
298        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300        // Learning rate scheduling: reduce every 10 epochs
301        if epoch > 0 && epoch % 10 == 0 {
302            let current_lr = optimizer.learning_rate();
303            let new_lr = current_lr * 0.5;
304            optimizer.set_learning_rate(new_lr);
305            println!(
306                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307                epoch, current_lr, new_lr
308            );
309        }
310
311        losses.push(loss.value());
312
313        // Print progress
314        if epoch % 10 == 0 || epoch == num_epochs - 1 {
315            println!(
316                "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317                epoch,
318                loss.value(),
319                optimizer.learning_rate()
320            );
321        }
322    }
323
324    println!("Final learning rate: {}", optimizer.learning_rate());
325
326    Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331    println!("\n--- Training Monitoring ---");
332
333    // Create model
334    let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337    // Create optimizer
338    let mut optimizer = Adam::with_learning_rate(0.01);
339    optimizer.add_parameter(&weight);
340    optimizer.add_parameter(&bias);
341
342    // Training data
343    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346    // Training loop with comprehensive monitoring
347    let num_epochs = 30;
348    let mut losses = Vec::new();
349    let mut weight_history = Vec::new();
350    let mut bias_history = Vec::new();
351
352    for epoch in 0..num_epochs {
353        // Forward pass
354        let y_pred = x_data.matmul(&weight) + &bias;
355        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357        // Backward pass
358        loss.backward(None);
359
360        // Optimizer step
361        optimizer.step(&mut [&mut weight, &mut bias]);
362        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364        // Record history
365        losses.push(loss.value());
366        weight_history.push(weight.value());
367        bias_history.push(bias.value());
368
369        // Print detailed monitoring
370        if epoch % 5 == 0 || epoch == num_epochs - 1 {
371            println!(
372                "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373                epoch,
374                loss.value(),
375                weight.value(),
376                bias.value()
377            );
378        }
379    }
380
381    // Analyze training progression
382    println!("\nTraining Analysis:");
383    println!("  Initial loss: {:.6}", losses[0]);
384    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
385    println!(
386        "  Loss reduction: {:.1}%",
387        (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388    );
389
390    // Compute statistics
391    let loss_mean = compute_mean(&losses);
392    let loss_std = compute_std(&losses);
393    let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394    let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396    println!("  Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397    println!("  Weight change: {:.6}", weight_change);
398    println!("  Bias change: {:.6}", bias_change);
399    println!("  Final weight norm: {:.6}", weight.norm().value());
400    println!("  Final bias: {:.6}", bias.value());
401
402    Ok(())
403}
examples/neural_networks/feedforward_network.rs (line 463)
430fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
431    println!("\n--- Training Workflow ---");
432
433    // Create a simple classification network
434    let config = FeedForwardConfig {
435        input_size: 2,
436        hidden_sizes: vec![4, 3],
437        output_size: 1,
438        use_bias: true,
439    };
440    let mut network = FeedForwardNetwork::new(config, Some(46));
441
442    println!("Training network: 2 -> [4, 3] -> 1");
443
444    // Create simple binary classification data: XOR problem
445    let x_data = Tensor::from_slice(
446        &[
447            0.0, 0.0, // -> 0
448            0.0, 1.0, // -> 1
449            1.0, 0.0, // -> 1
450            1.0, 1.0, // -> 0
451        ],
452        vec![4, 2],
453    )
454    .unwrap();
455
456    let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
457
458    println!("Training on XOR problem:");
459    println!("  Input shape: {:?}", x_data.shape().dims);
460    println!("  Target shape: {:?}", y_true.shape().dims);
461
462    // Create optimizer
463    let mut optimizer = Adam::with_learning_rate(0.1);
464    let params = network.parameters();
465    for param in &params {
466        optimizer.add_parameter(param);
467    }
468
469    // Training loop
470    let num_epochs = 50;
471    let mut losses = Vec::new();
472
473    for epoch in 0..num_epochs {
474        // Forward pass
475        let y_pred = network.forward(&x_data);
476
477        // Compute loss: MSE
478        let diff = y_pred.sub_tensor(&y_true);
479        let mut loss = diff.pow_scalar(2.0).mean();
480
481        // Backward pass
482        loss.backward(None);
483
484        // Optimizer step and zero grad
485        let mut params = network.parameters();
486        optimizer.step(&mut params);
487        optimizer.zero_grad(&mut params);
488
489        losses.push(loss.value());
490
491        // Print progress
492        if epoch % 10 == 0 || epoch == num_epochs - 1 {
493            println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
494        }
495    }
496
497    // Test final model
498    let final_predictions = network.forward_no_grad(&x_data);
499    println!("\nFinal predictions vs targets:");
500    for i in 0..4 {
501        let pred = final_predictions.data()[i];
502        let target = y_true.data()[i];
503        let input_x = x_data.data()[i * 2];
504        let input_y = x_data.data()[i * 2 + 1];
505        println!(
506            "  [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
507            input_x,
508            input_y,
509            pred,
510            target,
511            (pred - target).abs()
512        );
513    }
514
515    Ok(())
516}
517
518/// Demonstrate comprehensive training with 100+ steps
519fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
520    println!("\n--- Comprehensive Training (100+ Steps) ---");
521
522    // Create a regression network
523    let config = FeedForwardConfig {
524        input_size: 3,
525        hidden_sizes: vec![8, 6, 4],
526        output_size: 2,
527        use_bias: true,
528    };
529    let mut network = FeedForwardNetwork::new(config, Some(47));
530
531    println!("Network architecture: 3 -> [8, 6, 4] -> 2");
532    println!("Total parameters: {}", network.parameter_count());
533
534    // Create synthetic regression data
535    // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
536    let num_samples = 32;
537    let mut x_vec = Vec::new();
538    let mut y_vec = Vec::new();
539
540    for i in 0..num_samples {
541        let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
542        let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
543        let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
544
545        let y1 = x1 + 2.0 * x2 - x3;
546        let y2 = x1 * x2 + x3;
547
548        x_vec.extend_from_slice(&[x1, x2, x3]);
549        y_vec.extend_from_slice(&[y1, y2]);
550    }
551
552    let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
553    let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
554
555    println!("Training data:");
556    println!("  {} samples", num_samples);
557    println!("  Input shape: {:?}", x_data.shape().dims);
558    println!("  Target shape: {:?}", y_true.shape().dims);
559
560    // Create optimizer with learning rate scheduling
561    let mut optimizer = Adam::with_learning_rate(0.01);
562    let params = network.parameters();
563    for param in &params {
564        optimizer.add_parameter(param);
565    }
566
567    // Comprehensive training loop (150 epochs)
568    let num_epochs = 150;
569    let mut losses = Vec::new();
570    let mut best_loss = f32::INFINITY;
571    let mut patience_counter = 0;
572    let patience = 20;
573
574    println!("Starting comprehensive training...");
575
576    for epoch in 0..num_epochs {
577        // Forward pass
578        let y_pred = network.forward(&x_data);
579
580        // Compute loss: MSE
581        let diff = y_pred.sub_tensor(&y_true);
582        let mut loss = diff.pow_scalar(2.0).mean();
583
584        // Backward pass
585        loss.backward(None);
586
587        // Optimizer step and zero grad
588        let mut params = network.parameters();
589        optimizer.step(&mut params);
590        optimizer.zero_grad(&mut params);
591
592        let current_loss = loss.value();
593        losses.push(current_loss);
594
595        // Learning rate scheduling
596        if epoch > 0 && epoch % 30 == 0 {
597            let new_lr = optimizer.learning_rate() * 0.8;
598            optimizer.set_learning_rate(new_lr);
599            println!("  Reduced learning rate to {:.4}", new_lr);
600        }
601
602        // Early stopping logic
603        if current_loss < best_loss {
604            best_loss = current_loss;
605            patience_counter = 0;
606        } else {
607            patience_counter += 1;
608        }
609
610        // Print progress
611        if epoch % 25 == 0 || epoch == num_epochs - 1 {
612            println!(
613                "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
614                epoch,
615                current_loss,
616                optimizer.learning_rate(),
617                best_loss
618            );
619        }
620
621        // Early stopping
622        if patience_counter >= patience && epoch > 50 {
623            println!("Early stopping at epoch {} (patience exceeded)", epoch);
624            break;
625        }
626    }
627
628    // Final evaluation
629    let final_predictions = network.forward_no_grad(&x_data);
630
631    // Compute final metrics
632    let final_loss = losses[losses.len() - 1];
633    let initial_loss = losses[0];
634    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
635
636    println!("\nTraining completed!");
637    println!("  Initial loss: {:.6}", initial_loss);
638    println!("  Final loss: {:.6}", final_loss);
639    println!("  Best loss: {:.6}", best_loss);
640    println!("  Loss reduction: {:.1}%", loss_reduction);
641    println!("  Final learning rate: {:.4}", optimizer.learning_rate());
642
643    // Sample predictions analysis
644    println!("\nSample predictions (first 5):");
645    for i in 0..5.min(num_samples) {
646        let pred1 = final_predictions.data()[i * 2];
647        let pred2 = final_predictions.data()[i * 2 + 1];
648        let true1 = y_true.data()[i * 2];
649        let true2 = y_true.data()[i * 2 + 1];
650
651        println!(
652            "  Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
653            i + 1,
654            pred1,
655            pred2,
656            true1,
657            true2,
658            (pred1 - true1).abs(),
659            (pred2 - true2).abs()
660        );
661    }
662
663    Ok(())
664}
665
666/// Demonstrate network serialization
667fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
668    println!("\n--- Network Serialization ---");
669
670    // Create and train a network
671    let config = FeedForwardConfig {
672        input_size: 2,
673        hidden_sizes: vec![4, 2],
674        output_size: 1,
675        use_bias: true,
676    };
677    let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
678
679    // Quick training
680    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
681    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
682
683    let mut optimizer = Adam::with_learning_rate(0.01);
684    let params = original_network.parameters();
685    for param in &params {
686        optimizer.add_parameter(param);
687    }
688
689    for _ in 0..20 {
690        let y_pred = original_network.forward(&x_data);
691        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
692        loss.backward(None);
693
694        let mut params = original_network.parameters();
695        optimizer.step(&mut params);
696        optimizer.zero_grad(&mut params);
697    }
698
699    // Test original network
700    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
701    let original_output = original_network.forward_no_grad(&test_input);
702
703    println!("Original network output: {:?}", original_output.data());
704
705    // Save network
706    original_network.save_json("temp_feedforward_network")?;
707
708    // Load network
709    let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
710    let loaded_output = loaded_network.forward_no_grad(&test_input);
711
712    println!("Loaded network output: {:?}", loaded_output.data());
713
714    // Verify consistency
715    let match_check = original_output
716        .data()
717        .iter()
718        .zip(loaded_output.data().iter())
719        .all(|(a, b)| (a - b).abs() < 1e-6);
720
721    println!(
722        "Serialization verification: {}",
723        if match_check { "PASSED" } else { "FAILED" }
724    );
725
726    Ok(())
727}
Source

pub fn add_parameter(&mut self, parameter: &Tensor)

Add a single parameter to the optimizer

Links a parameter to the optimizer by creating a new parameter state indexed by the tensor’s ID. The parameter must have requires_grad set to true.

§Arguments
  • parameter - Reference to the tensor to link
§Panics

Panics if the parameter does not have requires_grad set to true
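The state created by add_parameter is what later drives the per-parameter Adam update in step. As a reference for what that state holds, here is a sketch of the standard Adam update rule (Kingma & Ba) for a single scalar parameter in plain Rust. This is not this crate's implementation, just the textbook algorithm; the hyperparameter names mirror the fields of AdamConfig, and the struct and function names are illustrative only:

```rust
// Per-parameter state: first moment m, second moment v, and step count t.
// add_parameter conceptually allocates one such state per tensor ID.
struct AdamState {
    m: f32, // first-moment (momentum) estimate
    v: f32, // second-moment (velocity) estimate
    t: u32, // number of optimizer steps taken for this parameter
}

fn adam_step(
    param: &mut f32,
    grad: f32,
    s: &mut AdamState,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
) {
    s.t += 1;
    // Exponential moving averages of the gradient and squared gradient.
    s.m = beta1 * s.m + (1.0 - beta1) * grad;
    s.v = beta2 * s.v + (1.0 - beta2) * grad * grad;
    // Bias correction compensates for m and v being initialized to zero.
    let m_hat = s.m / (1.0 - beta1.powi(s.t as i32));
    let v_hat = s.v / (1.0 - beta2.powi(s.t as i32));
    *param -= lr * m_hat / (v_hat.sqrt() + eps);
}

fn main() {
    let mut w = 1.0_f32;
    let mut state = AdamState { m: 0.0, v: 0.0, t: 0 };
    // With a constant gradient of 1.0, bias correction makes the very
    // first step move the parameter by almost exactly the learning rate.
    adam_step(&mut w, 1.0, &mut state, 0.001, 0.9, 0.999, 1e-8);
    println!("{:.6}", w); // ≈ 0.999000
}
```

The bias-correction terms explain why Adam takes well-scaled steps from the first iteration even though both moment buffers start at zero.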

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 74)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims,
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims,
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
103
104/// Demonstrate simple linear regression training
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106    println!("\n--- Linear Regression Training ---");
107
108    // Create model parameters
109    let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112    // Create optimizer
113    let mut optimizer = Adam::with_learning_rate(0.01);
114    optimizer.add_parameter(&weight);
115    optimizer.add_parameter(&bias);
116
117    // Create simple training data: y = 2*x + 1
118    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121    println!("Training data:");
122    println!("  X: {:?}", x_data.data());
123    println!("  Y: {:?}", y_true.data());
124    println!("  Target: y = 2*x + 1");
125
126    // Training loop
127    let num_epochs = 100;
128    let mut losses = Vec::new();
129
130    for epoch in 0..num_epochs {
131        // Forward pass: y_pred = x * weight + bias
132        let y_pred = x_data.matmul(&weight) + &bias;
133
134        // Compute loss: MSE
135        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137        // Backward pass
138        loss.backward(None);
139
140        // Optimizer step
141        optimizer.step(&mut [&mut weight, &mut bias]);
142        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144        losses.push(loss.value());
145
146        // Print progress every 20 epochs
147        if epoch % 20 == 0 || epoch == num_epochs - 1 {
148            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149        }
150    }
151
152    // Evaluate final model
153    let final_predictions = x_data.matmul(&weight) + &bias;
154    println!("\nFinal model evaluation:");
155    println!("  Learned weight: {:.6}", weight.value());
156    println!("  Learned bias: {:.6}", bias.value());
157    println!("  Predictions vs True:");
158
159    for i in 0..5 {
160        let x1 = x_data.data()[i];
161        let pred = final_predictions.data()[i];
162        let true_val = y_true.data()[i];
163        println!(
164            "    x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165            x1,
166            pred,
167            true_val,
168            (pred - true_val).abs()
169        );
170    }
171
172    Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177    println!("\n--- Advanced Training Patterns ---");
178
179    // Create a more complex model
180    let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181    let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183    // Create optimizer with different learning rate
184    let mut optimizer = Adam::with_learning_rate(0.005);
185    optimizer.add_parameter(&weight);
186    optimizer.add_parameter(&bias);
187
188    // Create training data: y = 2*x + [1, 3]
189    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190    let y_true = Tensor::from_slice(
191        &[3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0, 11.0, 11.0, 13.0],
192        vec![5, 2],
193    )
194    .unwrap();
195
196    println!("Advanced training with monitoring:");
197    println!("  Initial learning rate: {}", optimizer.learning_rate());
198
199    // Training loop with monitoring
200    let num_epochs = 50;
201    let mut losses = Vec::new();
202    let mut weight_norms = Vec::new();
203    let mut gradient_norms = Vec::new();
204
205    for epoch in 0..num_epochs {
206        // Forward pass
207        let y_pred = x_data.matmul(&weight) + &bias;
208        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210        // Backward pass
211        loss.backward(None);
212
213        // Compute gradient norm before optimizer step
214        let gradient_norm = weight.grad_by_value().unwrap().norm();
215
216        // Optimizer step
217        optimizer.step(&mut [&mut weight, &mut bias]);
218        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220        // Learning rate scheduling: reduce every 10 epochs
221        if epoch > 0 && epoch % 10 == 0 {
222            let current_lr = optimizer.learning_rate();
223            let new_lr = current_lr * 0.5;
224            optimizer.set_learning_rate(new_lr);
225            println!(
226                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227                epoch, current_lr, new_lr
228            );
229        }
230
231        // Record metrics
232        losses.push(loss.value());
233        weight_norms.push(weight.norm().value());
234        gradient_norms.push(gradient_norm.value());
235
236        // Print detailed progress
237        if epoch % 10 == 0 || epoch == num_epochs - 1 {
238            println!(
239                "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240                epoch,
241                loss.value(),
242                weight.norm().value(),
243                gradient_norm.value()
244            );
245        }
246    }
247
248    println!("Final learning rate: {}", optimizer.learning_rate());
249
250    // Analyze training progression
251    let initial_loss = losses[0];
252    let final_loss = losses[losses.len() - 1];
253    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255    println!("\nTraining Analysis:");
256    println!("  Initial loss: {:.6}", initial_loss);
257    println!("  Final loss: {:.6}", final_loss);
258    println!("  Loss reduction: {:.1}%", loss_reduction);
259    println!("  Final weight norm: {:.6}", weight.norm().value());
260    println!("  Final bias: {:?}", bias.data());
261
262    Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267    println!("\n--- Learning Rate Scheduling ---");
268
269    // Create simple model
270    let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273    // Create optimizer with high initial learning rate
274    let mut optimizer = Adam::with_learning_rate(0.1);
275    optimizer.add_parameter(&weight);
276    optimizer.add_parameter(&bias);
277
278    // Simple data
279    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280    let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282    println!("Initial learning rate: {}", optimizer.learning_rate());
283
284    // Training loop with learning rate scheduling
285    let num_epochs = 50;
286    let mut losses = Vec::new();
287
288    for epoch in 0..num_epochs {
289        // Forward pass
290        let y_pred = x_data.matmul(&weight) + &bias;
291        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293        // Backward pass
294        loss.backward(None);
295
296        // Optimizer step
297        optimizer.step(&mut [&mut weight, &mut bias]);
298        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300        // Learning rate scheduling: reduce every 10 epochs
301        if epoch > 0 && epoch % 10 == 0 {
302            let current_lr = optimizer.learning_rate();
303            let new_lr = current_lr * 0.5;
304            optimizer.set_learning_rate(new_lr);
305            println!(
306                "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307                epoch, current_lr, new_lr
308            );
309        }
310
311        losses.push(loss.value());
312
313        // Print progress
314        if epoch % 10 == 0 || epoch == num_epochs - 1 {
315            println!(
316                "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317                epoch,
318                loss.value(),
319                optimizer.learning_rate()
320            );
321        }
322    }
323
324    println!("Final learning rate: {}", optimizer.learning_rate());
325
326    Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331    println!("\n--- Training Monitoring ---");
332
333    // Create model
334    let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337    // Create optimizer
338    let mut optimizer = Adam::with_learning_rate(0.01);
339    optimizer.add_parameter(&weight);
340    optimizer.add_parameter(&bias);
341
342    // Training data
343    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346    // Training loop with comprehensive monitoring
347    let num_epochs = 30;
348    let mut losses = Vec::new();
349    let mut weight_history = Vec::new();
350    let mut bias_history = Vec::new();
351
352    for epoch in 0..num_epochs {
353        // Forward pass
354        let y_pred = x_data.matmul(&weight) + &bias;
355        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357        // Backward pass
358        loss.backward(None);
359
360        // Optimizer step
361        optimizer.step(&mut [&mut weight, &mut bias]);
362        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364        // Record history
365        losses.push(loss.value());
366        weight_history.push(weight.value());
367        bias_history.push(bias.value());
368
369        // Print detailed monitoring
370        if epoch % 5 == 0 || epoch == num_epochs - 1 {
371            println!(
372                "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373                epoch,
374                loss.value(),
375                weight.value(),
376                bias.value()
377            );
378        }
379    }
380
381    // Analyze training progression
382    println!("\nTraining Analysis:");
383    println!("  Initial loss: {:.6}", losses[0]);
384    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
385    println!(
386        "  Loss reduction: {:.1}%",
387        (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388    );
389
390    // Compute statistics
391    let loss_mean = compute_mean(&losses);
392    let loss_std = compute_std(&losses);
393    let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394    let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396    println!("  Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397    println!("  Weight change: {:.6}", weight_change);
398    println!("  Bias change: {:.6}", bias_change);
399    println!("  Final weight norm: {:.6}", weight.norm().value());
400    println!("  Final bias: {:.6}", bias.value());
401
402    Ok(())
403}
More examples
examples/getting_started/serialization_basics.rs (line 126)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}
166
167/// Demonstrate format comparison and performance characteristics
168fn demonstrate_format_comparison() -> Result<(), Box<dyn std::error::Error>> {
169    println!("\n--- Format Comparison ---");
170
171    // Create a larger tensor for comparison
172    let tensor = Tensor::randn(vec![10, 10], Some(44));
173
174    // Save in both formats
175    tensor.save_json("temp_comparison.json")?;
176    tensor.save_binary("temp_comparison.bin")?;
177
178    // Compare file sizes
179    let json_size = fs::metadata("temp_comparison.json")?.len();
180    let binary_size = fs::metadata("temp_comparison.bin")?.len();
181
182    println!("JSON file size: {} bytes", json_size);
183    println!("Binary file size: {} bytes", binary_size);
184    println!(
185        "Compression ratio: {:.2}x",
186        json_size as f64 / binary_size as f64
187    );
188
189    // Load and verify both formats
190    let json_tensor = Tensor::load_json("temp_comparison.json")?;
191    let binary_tensor = Tensor::load_binary("temp_comparison.bin")?;
192
193    assert_eq!(tensor.shape().dims, json_tensor.shape().dims);
194    assert_eq!(tensor.shape().dims, binary_tensor.shape().dims);
195    assert_eq!(tensor.data(), json_tensor.data());
196    assert_eq!(tensor.data(), binary_tensor.data());
197
198    println!("Format comparison verification: PASSED");
199
200    Ok(())
201}
202
203/// Demonstrate a basic model checkpointing workflow
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205    println!("\n--- Model Checkpointing ---");
206
207    // Create a simple model (weights and bias)
208    let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209    let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211    // Create optimizer
212    let mut optimizer = Adam::with_learning_rate(0.01);
213    optimizer.add_parameter(&weights);
214    optimizer.add_parameter(&bias);
215
216    println!("Initial weights: {:?}", weights.data());
217    println!("Initial bias: {:?}", bias.data());
218
219    // Simulate training
220    for epoch in 0..5 {
221        let mut loss = weights.sum() + bias.sum();
222        loss.backward(None);
223        optimizer.step(&mut [&mut weights, &mut bias]);
224        optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226        if epoch % 2 == 0 {
227            // Save checkpoint
228            let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229            fs::create_dir_all(&checkpoint_dir)?;
230
231            weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232            bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233            optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235            println!("Saved checkpoint for epoch {}", epoch);
236        }
237    }
238
239    // Load from checkpoint
240    let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241    let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242    let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244    println!("Loaded weights: {:?}", loaded_weights.data());
245    println!("Loaded bias: {:?}", loaded_bias.data());
246    println!(
247        "Loaded optimizer learning rate: {}",
248        loaded_optimizer.learning_rate()
249    );
250
251    // Verify checkpoint integrity
252    assert_eq!(weights.shape().dims, loaded_weights.shape().dims);
253    assert_eq!(bias.shape().dims, loaded_bias.shape().dims);
254    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256    println!("Checkpointing verification: PASSED");
257
258    Ok(())
259}
examples/optimizers/adam_configurations.rs (line 97)
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85    println!("--- Default Adam Configuration ---");
86
87    // Create a simple regression problem: y = 2*x + 1
88    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91    // Create model parameters
92    let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95    // Create Adam optimizer with default configuration
96    let mut optimizer = Adam::new();
97    optimizer.add_parameter(&weight);
98    optimizer.add_parameter(&bias);
99
100    println!("Default Adam configuration:");
101    println!("  Learning rate: {}", optimizer.learning_rate());
102    println!("  Initial weight: {:.6}", weight.value());
103    println!("  Initial bias: {:.6}", bias.value());
104
105    // Training loop
106    let num_epochs = 50;
107    let mut losses = Vec::new();
108
109    for epoch in 0..num_epochs {
110        // Forward pass
111        let y_pred = x_data.matmul(&weight) + &bias;
112        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114        // Backward pass
115        loss.backward(None);
116
117        // Optimizer step
118        optimizer.step(&mut [&mut weight, &mut bias]);
119        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121        losses.push(loss.value());
122
123        if epoch % 10 == 0 || epoch == num_epochs - 1 {
124            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125        }
126    }
127
128    // Evaluate final model
129    let _final_predictions = x_data.matmul(&weight) + &bias;
130    println!("\nFinal model:");
131    println!("  Learned weight: {:.6} (target: 2.0)", weight.value());
132    println!("  Learned bias: {:.6} (target: 1.0)", bias.value());
133    println!("  Final loss: {:.6}", losses[losses.len() - 1]);
134
135    Ok(())
136}
137
138/// Demonstrate learning rate comparison
139fn demonstrate_learning_rate_comparison() -> Result<(), Box<dyn std::error::Error>> {
140    println!("\n--- Learning Rate Comparison ---");
141
142    let learning_rates = [0.001, 0.01, 0.1];
143    let mut results = Vec::new();
144
145    for &lr in &learning_rates {
146        println!("\nTesting learning rate: {}", lr);
147
148        let stats = train_with_config(TrainingConfig {
149            learning_rate: lr,
150            ..Default::default()
151        })?;
152
153        results.push((lr, stats.clone()));
154
155        println!("  Final loss: {:.6}", stats.final_loss);
156        println!("  Convergence epoch: {}", stats.convergence_epoch);
157    }
158
159    // Compare results
160    println!("\nLearning Rate Comparison Summary:");
161    for (lr, stats) in &results {
162        println!(
163            "  LR={:6}: Loss={:.6}, Converged@{}",
164            lr, stats.final_loss, stats.convergence_epoch
165        );
166    }
167
168    Ok(())
169}
170
171/// Demonstrate weight decay comparison
172fn demonstrate_weight_decay_comparison() -> Result<(), Box<dyn std::error::Error>> {
173    println!("\n--- Weight Decay Comparison ---");
174
175    let weight_decays = [0.0, 0.001, 0.01];
176    let mut results = Vec::new();
177
178    for &wd in &weight_decays {
179        println!("\nTesting weight decay: {}", wd);
180
181        let stats = train_with_config(TrainingConfig {
182            weight_decay: wd,
183            ..Default::default()
184        })?;
185
186        results.push((wd, stats.clone()));
187
188        println!("  Final loss: {:.6}", stats.final_loss);
189        println!("  Final weight norm: {:.6}", stats.weight_norm);
190    }
191
192    // Compare results
193    println!("\nWeight Decay Comparison Summary:");
194    for (wd, stats) in &results {
195        println!(
196            "  WD={:6}: Loss={:.6}, Weight Norm={:.6}",
197            wd, stats.final_loss, stats.weight_norm
198        );
199    }
200
201    Ok(())
202}
203
204/// Demonstrate beta parameter tuning
205fn demonstrate_beta_parameter_tuning() -> Result<(), Box<dyn std::error::Error>> {
206    println!("\n--- Beta Parameter Tuning ---");
207
208    let beta_configs = [
209        (0.9, 0.999),  // Default
210        (0.8, 0.999),  // More aggressive momentum
211        (0.95, 0.999), // Less aggressive momentum
212        (0.9, 0.99),   // Faster second moment decay
213    ];
214
215    let mut results = Vec::new();
216
217    for (i, (beta1, beta2)) in beta_configs.iter().enumerate() {
218        println!(
219            "\nTesting beta configuration {}: beta1={}, beta2={}",
220            i + 1,
221            beta1,
222            beta2
223        );
224
225        let config = TrainingConfig {
226            beta1: *beta1,
227            beta2: *beta2,
228            ..Default::default()
229        };
230
231        let stats = train_with_config(config)?;
232        results.push(((*beta1, *beta2), stats.clone()));
233
234        println!("  Final loss: {:.6}", stats.final_loss);
235        println!("  Convergence epoch: {}", stats.convergence_epoch);
236    }
237
238    // Compare results
239    println!("\nBeta Parameter Comparison Summary:");
240    for ((beta1, beta2), stats) in &results {
241        println!(
242            "  B1={:4}, B2={:5}: Loss={:.6}, Converged@{}",
243            beta1, beta2, stats.final_loss, stats.convergence_epoch
244        );
245    }
246
247    Ok(())
248}
249
250/// Demonstrate configuration benchmarking
251fn demonstrate_configuration_benchmarking() -> Result<(), Box<dyn std::error::Error>> {
252    println!("\n--- Configuration Benchmarking ---");
253
254    // Define configurations to benchmark
255    let configs = vec![
256        (
257            "Conservative",
258            TrainingConfig {
259                learning_rate: 0.001,
260                weight_decay: 0.001,
261                beta1: 0.95,
262                ..Default::default()
263            },
264        ),
265        (
266            "Balanced",
267            TrainingConfig {
268                learning_rate: 0.01,
269                weight_decay: 0.0,
270                beta1: 0.9,
271                ..Default::default()
272            },
273        ),
274        (
275            "Aggressive",
276            TrainingConfig {
277                learning_rate: 0.1,
278                weight_decay: 0.0,
279                beta1: 0.8,
280                ..Default::default()
281            },
282        ),
283    ];
284
285    let mut benchmark_results = Vec::new();
286
287    for (name, config) in configs {
288        println!("\nBenchmarking {} configuration:", name);
289
290        let start_time = std::time::Instant::now();
291        let stats = train_with_config(config.clone())?;
292        let elapsed = start_time.elapsed();
293
294        println!("  Training time: {:.2}ms", elapsed.as_millis());
295        println!("  Final loss: {:.6}", stats.final_loss);
296        println!("  Convergence: {} epochs", stats.convergence_epoch);
297
298        benchmark_results.push((name.to_string(), stats, elapsed));
299    }
300
301    // Summary
302    println!("\nBenchmarking Summary:");
303    for (name, stats, elapsed) in &benchmark_results {
304        println!(
305            "  {:12}: Loss={:.6}, Time={:4}ms, Converged@{}",
306            name,
307            stats.final_loss,
308            elapsed.as_millis(),
309            stats.convergence_epoch
310        );
311    }
312
313    Ok(())
314}
315
316/// Helper function to train with specific configuration
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318    // Create training data
319    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322    // Create model parameters
323    let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326    // Create optimizer with custom configuration
327    let adam_config = AdamConfig {
328        learning_rate: config.learning_rate,
329        beta1: config.beta1,
330        beta2: config.beta2,
331        eps: 1e-8,
332        weight_decay: config.weight_decay,
333        amsgrad: false,
334    };
335
336    let mut optimizer = Adam::with_config(adam_config);
337    optimizer.add_parameter(&weight);
338    optimizer.add_parameter(&bias);
339
340    // Training loop
341    let mut losses = Vec::new();
342    let mut convergence_epoch = config.epochs;
343
344    for epoch in 0..config.epochs {
345        // Forward pass
346        let y_pred = x_data.matmul(&weight) + &bias;
347        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349        // Backward pass
350        loss.backward(None);
351
352        // Optimizer step
353        optimizer.step(&mut [&mut weight, &mut bias]);
354        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356        let loss_value = loss.value();
357        losses.push(loss_value);
358
359        // Check for convergence (loss < 0.01)
360        if loss_value < 0.01 && convergence_epoch == config.epochs {
361            convergence_epoch = epoch;
362        }
363    }
364
365    Ok(TrainingStats {
366        config,
367        final_loss: losses[losses.len() - 1],
368        loss_history: losses,
369        convergence_epoch,
370        weight_norm: weight.norm().value(),
371    })
372}
examples/optimizers/learning_rate_scheduling.rs (line 333)
319fn train_with_scheduler(
320    scheduler: &mut dyn LearningRateScheduler,
321    num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323    // Create training data: y = 2*x + 1
324    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325    let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327    // Create model parameters
328    let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329    let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331    // Create optimizer with initial learning rate
332    let mut optimizer = Adam::with_learning_rate(0.05);
333    optimizer.add_parameter(&weight);
334    optimizer.add_parameter(&bias);
335
336    // Training loop
337    let mut losses = Vec::new();
338    let mut lr_history = Vec::new();
339    let mut convergence_epoch = num_epochs;
340
341    for epoch in 0..num_epochs {
342        // Forward pass
343        let y_pred = x_data.matmul(&weight) + &bias;
344        let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346        // Backward pass
347        loss.backward(None);
348
349        // Update learning rate using scheduler
350        let current_lr = optimizer.learning_rate();
351        let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353        if (new_lr - current_lr).abs() > 1e-8 {
354            optimizer.set_learning_rate(new_lr);
355        }
356
357        // Optimizer step
358        optimizer.step(&mut [&mut weight, &mut bias]);
359        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361        let loss_value = loss.value();
362        losses.push(loss_value);
363        lr_history.push(new_lr);
364
365        // Check for convergence
366        if loss_value < 0.01 && convergence_epoch == num_epochs {
367            convergence_epoch = epoch;
368        }
369    }
370
371    Ok(TrainingStats {
372        scheduler_name: scheduler.name().to_string(),
373        final_loss: losses[losses.len() - 1],
374        lr_history,
375        loss_history: losses,
376        convergence_epoch,
377    })
378}
examples/neural_networks/feedforward_network.rs (line 466)
430fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
431    println!("\n--- Training Workflow ---");
432
433    // Create a simple classification network
434    let config = FeedForwardConfig {
435        input_size: 2,
436        hidden_sizes: vec![4, 3],
437        output_size: 1,
438        use_bias: true,
439    };
440    let mut network = FeedForwardNetwork::new(config, Some(46));
441
442    println!("Training network: 2 -> [4, 3] -> 1");
443
444    // Create simple binary classification data: XOR problem
445    let x_data = Tensor::from_slice(
446        &[
447            0.0, 0.0, // -> 0
448            0.0, 1.0, // -> 1
449            1.0, 0.0, // -> 1
450            1.0, 1.0, // -> 0
451        ],
452        vec![4, 2],
453    )
454    .unwrap();
455
456    let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
457
458    println!("Training on XOR problem:");
459    println!("  Input shape: {:?}", x_data.shape().dims);
460    println!("  Target shape: {:?}", y_true.shape().dims);
461
462    // Create optimizer
463    let mut optimizer = Adam::with_learning_rate(0.1);
464    let params = network.parameters();
465    for param in &params {
466        optimizer.add_parameter(param);
467    }
468
469    // Training loop
470    let num_epochs = 50;
471    let mut losses = Vec::new();
472
473    for epoch in 0..num_epochs {
474        // Forward pass
475        let y_pred = network.forward(&x_data);
476
477        // Compute loss: MSE
478        let diff = y_pred.sub_tensor(&y_true);
479        let mut loss = diff.pow_scalar(2.0).mean();
480
481        // Backward pass
482        loss.backward(None);
483
484        // Optimizer step and zero grad
485        let mut params = network.parameters();
486        optimizer.step(&mut params);
487        optimizer.zero_grad(&mut params);
488
489        losses.push(loss.value());
490
491        // Print progress
492        if epoch % 10 == 0 || epoch == num_epochs - 1 {
493            println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
494        }
495    }
496
497    // Test final model
498    let final_predictions = network.forward_no_grad(&x_data);
499    println!("\nFinal predictions vs targets:");
500    for i in 0..4 {
501        let pred = final_predictions.data()[i];
502        let target = y_true.data()[i];
503        let input_x = x_data.data()[i * 2];
504        let input_y = x_data.data()[i * 2 + 1];
505        println!(
506            "  [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
507            input_x,
508            input_y,
509            pred,
510            target,
511            (pred - target).abs()
512        );
513    }
514
515    Ok(())
516}
517
518/// Demonstrate comprehensive training with 100+ steps
519fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
520    println!("\n--- Comprehensive Training (100+ Steps) ---");
521
522    // Create a regression network
523    let config = FeedForwardConfig {
524        input_size: 3,
525        hidden_sizes: vec![8, 6, 4],
526        output_size: 2,
527        use_bias: true,
528    };
529    let mut network = FeedForwardNetwork::new(config, Some(47));
530
531    println!("Network architecture: 3 -> [8, 6, 4] -> 2");
532    println!("Total parameters: {}", network.parameter_count());
533
534    // Create synthetic regression data
535    // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
536    let num_samples = 32;
537    let mut x_vec = Vec::new();
538    let mut y_vec = Vec::new();
539
540    for i in 0..num_samples {
541        let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
542        let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
543        let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
544
545        let y1 = x1 + 2.0 * x2 - x3;
546        let y2 = x1 * x2 + x3;
547
548        x_vec.extend_from_slice(&[x1, x2, x3]);
549        y_vec.extend_from_slice(&[y1, y2]);
550    }
551
552    let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
553    let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
554
555    println!("Training data:");
556    println!("  {} samples", num_samples);
557    println!("  Input shape: {:?}", x_data.shape().dims);
558    println!("  Target shape: {:?}", y_true.shape().dims);
559
560    // Create optimizer with learning rate scheduling
561    let mut optimizer = Adam::with_learning_rate(0.01);
562    let params = network.parameters();
563    for param in &params {
564        optimizer.add_parameter(param);
565    }
566
567    // Comprehensive training loop (150 epochs)
568    let num_epochs = 150;
569    let mut losses = Vec::new();
570    let mut best_loss = f32::INFINITY;
571    let mut patience_counter = 0;
572    let patience = 20;
573
574    println!("Starting comprehensive training...");
575
576    for epoch in 0..num_epochs {
577        // Forward pass
578        let y_pred = network.forward(&x_data);
579
580        // Compute loss: MSE
581        let diff = y_pred.sub_tensor(&y_true);
582        let mut loss = diff.pow_scalar(2.0).mean();
583
584        // Backward pass
585        loss.backward(None);
586
587        // Optimizer step and zero grad
588        let mut params = network.parameters();
589        optimizer.step(&mut params);
590        optimizer.zero_grad(&mut params);
591
592        let current_loss = loss.value();
593        losses.push(current_loss);
594
595        // Learning rate scheduling
596        if epoch > 0 && epoch % 30 == 0 {
597            let new_lr = optimizer.learning_rate() * 0.8;
598            optimizer.set_learning_rate(new_lr);
599            println!("  Reduced learning rate to {:.4}", new_lr);
600        }
601
602        // Early stopping logic
603        if current_loss < best_loss {
604            best_loss = current_loss;
605            patience_counter = 0;
606        } else {
607            patience_counter += 1;
608        }
609
610        // Print progress
611        if epoch % 25 == 0 || epoch == num_epochs - 1 {
612            println!(
613                "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
614                epoch,
615                current_loss,
616                optimizer.learning_rate(),
617                best_loss
618            );
619        }
620
621        // Early stopping
622        if patience_counter >= patience && epoch > 50 {
623            println!("Early stopping at epoch {} (patience exceeded)", epoch);
624            break;
625        }
626    }
627
628    // Final evaluation
629    let final_predictions = network.forward_no_grad(&x_data);
630
631    // Compute final metrics
632    let final_loss = losses[losses.len() - 1];
633    let initial_loss = losses[0];
634    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
635
636    println!("\nTraining completed!");
637    println!("  Initial loss: {:.6}", initial_loss);
638    println!("  Final loss: {:.6}", final_loss);
639    println!("  Best loss: {:.6}", best_loss);
640    println!("  Loss reduction: {:.1}%", loss_reduction);
641    println!("  Final learning rate: {:.4}", optimizer.learning_rate());
642
643    // Sample predictions analysis
644    println!("\nSample predictions (first 5):");
645    for i in 0..5.min(num_samples) {
646        let pred1 = final_predictions.data()[i * 2];
647        let pred2 = final_predictions.data()[i * 2 + 1];
648        let true1 = y_true.data()[i * 2];
649        let true2 = y_true.data()[i * 2 + 1];
650
651        println!(
652            "  Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
653            i + 1,
654            pred1,
655            pred2,
656            true1,
657            true2,
658            (pred1 - true1).abs(),
659            (pred2 - true2).abs()
660        );
661    }
662
663    Ok(())
664}
665
666/// Demonstrate network serialization
667fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
668    println!("\n--- Network Serialization ---");
669
670    // Create and train a network
671    let config = FeedForwardConfig {
672        input_size: 2,
673        hidden_sizes: vec![4, 2],
674        output_size: 1,
675        use_bias: true,
676    };
677    let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
678
679    // Quick training
680    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
681    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
682
683    let mut optimizer = Adam::with_learning_rate(0.01);
684    let params = original_network.parameters();
685    for param in &params {
686        optimizer.add_parameter(param);
687    }
688
689    for _ in 0..20 {
690        let y_pred = original_network.forward(&x_data);
691        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
692        loss.backward(None);
693
694        let mut params = original_network.parameters();
695        optimizer.step(&mut params);
696        optimizer.zero_grad(&mut params);
697    }
698
699    // Test original network
700    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
701    let original_output = original_network.forward_no_grad(&test_input);
702
703    println!("Original network output: {:?}", original_output.data());
704
705    // Save network
706    original_network.save_json("temp_feedforward_network")?;
707
708    // Load network
709    let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
710    let loaded_output = loaded_network.forward_no_grad(&test_input);
711
712    println!("Loaded network output: {:?}", loaded_output.data());
713
714    // Verify consistency
715    let match_check = original_output
716        .data()
717        .iter()
718        .zip(loaded_output.data().iter())
719        .all(|(a, b)| (a - b).abs() < 1e-6);
720
721    println!(
722        "Serialization verification: {}",
723        if match_check { "PASSED" } else { "FAILED" }
724    );
725
726    Ok(())
727}
examples/neural_networks/basic_linear_layer.rs (line 255)
217fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
218    println!("\n--- Training Loop ---");
219
220    // Create layer and training data
221    let mut layer = LinearLayer::new(2, 1, Some(45));
222
223    // Simple regression task: y = 2*x1 + 3*x2 + 1
224    let x_data = Tensor::from_slice(
225        &[
226            1.0, 1.0, // x1=1, x2=1 -> y=6
227            2.0, 1.0, // x1=2, x2=1 -> y=8
228            1.0, 2.0, // x1=1, x2=2 -> y=9
229            2.0, 2.0, // x1=2, x2=2 -> y=11
230        ],
231        vec![4, 2],
232    )
233    .unwrap();
234
235    let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
236
237    println!("Training data:");
238    println!("  X shape: {:?}", x_data.shape().dims);
239    println!("  Y shape: {:?}", y_true.shape().dims);
240    println!("  Target function: y = 2*x1 + 3*x2 + 1");
241
242    // Create optimizer
243    let config = AdamConfig {
244        learning_rate: 0.01,
245        beta1: 0.9,
246        beta2: 0.999,
247        eps: 1e-8,
248        weight_decay: 0.0,
249        amsgrad: false,
250    };
251
252    let mut optimizer = Adam::with_config(config);
253    let params = layer.parameters();
254    for param in &params {
255        optimizer.add_parameter(param);
256    }
257
258    println!("Optimizer setup complete. Starting training...");
259
260    // Training loop
261    let num_epochs = 100;
262    let mut losses = Vec::new();
263
264    for epoch in 0..num_epochs {
265        // Forward pass
266        let y_pred = layer.forward(&x_data);
267
268        // Compute loss: MSE
269        let diff = y_pred.sub_tensor(&y_true);
270        let mut loss = diff.pow_scalar(2.0).mean();
271
272        // Backward pass
273        loss.backward(None);
274
275        // Optimizer step
276        let mut params = layer.parameters();
277        optimizer.step(&mut params);
278        optimizer.zero_grad(&mut params);
279
280        losses.push(loss.value());
281
282        // Print progress
283        if epoch % 20 == 0 || epoch == num_epochs - 1 {
284            println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
285        }
286    }
287
288    // Evaluate final model
289    let final_predictions = layer.forward_no_grad(&x_data);
290
291    println!("\nFinal model evaluation:");
292    println!("  Learned weights: {:?}", layer.weight.data());
293    println!("  Learned bias: {:?}", layer.bias.data());
294    println!("  Target weights: [2.0, 3.0]");
295    println!("  Target bias: [1.0]");
296
297    println!("  Predictions vs True:");
298    for i in 0..4 {
299        let pred = final_predictions.data()[i];
300        let true_val = y_true.data()[i];
301        println!(
302            "    Sample {}: pred={:.3}, true={:.1}, error={:.3}",
303            i + 1,
304            pred,
305            true_val,
306            (pred - true_val).abs()
307        );
308    }
309
310    // Training analysis
311    let initial_loss = losses[0];
312    let final_loss = losses[losses.len() - 1];
313    let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
314
315    println!("\nTraining Analysis:");
316    println!("  Initial loss: {:.6}", initial_loss);
317    println!("  Final loss: {:.6}", final_loss);
318    println!("  Loss reduction: {:.1}%", loss_reduction);
319
320    Ok(())
321}
322
323/// Demonstrate single vs batch inference
324fn demonstrate_single_vs_batch_inference() {
325    println!("\n--- Single vs Batch Inference ---");
326
327    let layer = LinearLayer::new(4, 3, Some(46));
328
329    // Single inference
330    println!("Single inference:");
331    let single_input = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![1, 4]).unwrap();
332    let single_output = layer.forward_no_grad(&single_input);
333    println!("  Input shape: {:?}", single_input.shape().dims);
334    println!("  Output shape: {:?}", single_output.shape().dims);
335    println!("  Output: {:?}", single_output.data());
336
337    // Batch inference
338    println!("Batch inference:");
339    let batch_input = Tensor::from_slice(
340        &[
341            1.0, 2.0, 3.0, 4.0, // Sample 1
342            5.0, 6.0, 7.0, 8.0, // Sample 2
343            9.0, 10.0, 11.0, 12.0, // Sample 3
344        ],
345        vec![3, 4],
346    )
347    .unwrap();
348    let batch_output = layer.forward_no_grad(&batch_input);
349    println!("  Input shape: {:?}", batch_input.shape().dims);
350    println!("  Output shape: {:?}", batch_output.shape().dims);
351
352    // Verify batch consistency - first sample should match single inference
353    let _first_batch_sample = batch_output.view(vec![3, 3]); // Reshape to access first sample
354    let first_sample_data = &batch_output.data()[0..3]; // First 3 elements
355    let single_sample_data = single_output.data();
356
357    println!("Consistency check:");
358    println!("  Single output: {:?}", single_sample_data);
359    println!("  First batch sample: {:?}", first_sample_data);
360    println!(
361        "  Match: {}",
362        single_sample_data
363            .iter()
364            .zip(first_sample_data.iter())
365            .all(|(a, b)| (a - b).abs() < 1e-6)
366    );
367}
368
369/// Demonstrate serialization and loading
370fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
371    println!("\n--- Serialization ---");
372
373    // Create and train a simple layer
374    let mut original_layer = LinearLayer::new(2, 1, Some(47));
375
376    // Simple training data
377    let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
378    let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
379
380    let mut optimizer = Adam::with_learning_rate(0.01);
381    let params = original_layer.parameters();
382    for param in &params {
383        optimizer.add_parameter(param);
384    }
385
386    // Train for a few epochs
387    for _ in 0..10 {
388        let y_pred = original_layer.forward(&x_data);
389        let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
390        loss.backward(None);
391
392        let mut params = original_layer.parameters();
393        optimizer.step(&mut params);
394        optimizer.zero_grad(&mut params);
395    }
396
397    println!("Original layer trained");
398    println!("  Weight: {:?}", original_layer.weight.data());
399    println!("  Bias: {:?}", original_layer.bias.data());
400
401    // Save layer
402    original_layer.save_json("temp_linear_layer")?;
403
404    // Load layer
405    let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
406
407    println!("Loaded layer");
408    println!("  Weight: {:?}", loaded_layer.weight.data());
409    println!("  Bias: {:?}", loaded_layer.bias.data());
410
411    // Verify consistency
412    let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
413    let original_output = original_layer.forward_no_grad(&test_input);
414    let loaded_output = loaded_layer.forward_no_grad(&test_input);
415
416    println!("Consistency check:");
417    println!("  Original output: {:?}", original_output.data());
418    println!("  Loaded output: {:?}", loaded_output.data());
419    println!(
420        "  Match: {}",
421        original_output
422            .data()
423            .iter()
424            .zip(loaded_output.data().iter())
425            .all(|(a, b)| (a - b).abs() < 1e-6)
426    );
427
428    println!("Serialization verification: PASSED");
429
430    Ok(())
431}
Source

pub fn add_parameters(&mut self, parameters: &[&Tensor])

Add multiple parameters to the optimizer

Links multiple parameters to the optimizer by creating parameter states indexed by each tensor’s ID. All parameters must have requires_grad set to true.

§Arguments
  • parameters - Slice of references to tensors to link
§Panics

Panics if any parameter does not have requires_grad set to true
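As a sketch of the usage pattern (assuming the `Tensor` constructors shown in the repository examples above), linking several parameters in one call:

```rust
// Both tensors must have requires_grad set, or add_parameters panics.
let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
let bias = Tensor::zeros(vec![2]).with_requires_grad();

let mut optimizer = Adam::new();
optimizer.add_parameters(&[&weight, &bias]);
assert_eq!(optimizer.parameter_count(), 2);
```

This is equivalent to calling `add_parameter` once per tensor; the batch form simply avoids repeating the call when a model exposes its parameters as a slice.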

Source

pub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool

Remove a parameter from the optimizer

Unlinks a parameter by removing its state from the optimizer. The parameter ID is used for identification.

§Arguments
  • parameter - Reference to the tensor to unlink
§Returns

True if the parameter was linked and removed, false if it was not linked

Source

pub fn clear_states(&mut self)

Remove all parameter states from the optimizer

Clears all parameter states, effectively unlinking all parameters. This is useful for resetting the optimizer or preparing for parameter re-linking.

Source

pub fn is_parameter_linked(&self, parameter: &Tensor) -> bool

Check if a parameter is linked to the optimizer

Returns true if the parameter has an associated state in the optimizer.

§Arguments
  • parameter - Reference to the tensor to check
§Returns

True if the parameter is linked, false otherwise

Source

pub fn parameter_count(&self) -> usize

Get the number of linked parameters

Returns the count of parameters currently linked to the optimizer.

§Returns

Number of linked parameters

Examples found in repository
examples/getting_started/optimizer_basics.rs (line 78)
47fn demonstrate_basic_optimizer_setup() {
48    println!("--- Basic Optimizer Setup ---");
49
50    // Create parameters that require gradients
51    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52    let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54    println!("Created parameters:");
55    println!(
56        "  Weight: shape {:?}, requires_grad: {}",
57        weight.shape().dims,
58        weight.requires_grad()
59    );
60    println!(
61        "  Bias: shape {:?}, requires_grad: {}",
62        bias.shape().dims,
63        bias.requires_grad()
64    );
65
66    // Create Adam optimizer with default configuration
67    let mut optimizer = Adam::new();
68    println!(
69        "Created Adam optimizer with learning rate: {}",
70        optimizer.learning_rate()
71    );
72
73    // Add parameters to optimizer
74    optimizer.add_parameter(&weight);
75    optimizer.add_parameter(&bias);
76    println!(
77        "Added {} parameters to optimizer",
78        optimizer.parameter_count()
79    );
80
81    // Create optimizer with custom configuration
82    let config = AdamConfig {
83        learning_rate: 0.01,
84        beta1: 0.9,
85        beta2: 0.999,
86        eps: 1e-8,
87        weight_decay: 0.0,
88        amsgrad: false,
89    };
90
91    let mut custom_optimizer = Adam::with_config(config);
92    custom_optimizer.add_parameter(&weight);
93    custom_optimizer.add_parameter(&bias);
94
95    println!(
96        "Created custom optimizer with learning rate: {}",
97        custom_optimizer.learning_rate()
98    );
99
100    // Demonstrate parameter linking
101    println!("Parameter linking completed successfully");
102}
More examples
examples/getting_started/serialization_basics.rs (line 131)
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110    println!("\n--- Optimizer Serialization ---");
111
112    // Create an optimizer with some parameters
113    let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114    let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116    let config = AdamConfig {
117        learning_rate: 0.001,
118        beta1: 0.9,
119        beta2: 0.999,
120        eps: 1e-8,
121        weight_decay: 0.0,
122        amsgrad: false,
123    };
124
125    let mut optimizer = Adam::with_config(config);
126    optimizer.add_parameter(&weight);
127    optimizer.add_parameter(&bias);
128
129    println!(
130        "Created optimizer with {} parameters",
131        optimizer.parameter_count()
132    );
133    println!("Learning rate: {}", optimizer.learning_rate());
134
135    // Simulate some training steps
136    for _ in 0..3 {
137        let mut loss = weight.sum() + bias.sum();
138        loss.backward(None);
139        optimizer.step(&mut [&mut weight, &mut bias]);
140        optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141    }
142
143    // Save optimizer state
144    let optimizer_path = "temp_optimizer.json";
145    optimizer.save_json(optimizer_path)?;
146    println!("Saved optimizer to: {}", optimizer_path);
147
148    // Load optimizer state
149    let loaded_optimizer = Adam::load_json(optimizer_path)?;
150    println!(
151        "Loaded optimizer with {} parameters",
152        loaded_optimizer.parameter_count()
153    );
154    println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156    // Verify optimizer state
157    assert_eq!(
158        optimizer.parameter_count(),
159        loaded_optimizer.parameter_count()
160    );
161    assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162    println!("Optimizer serialization verification: PASSED");
163
164    Ok(())
165}

Source

pub fn relink_parameters(&mut self, parameters: &[&Tensor]) -> Result<(), String>

Re-link parameters to saved optimizer states in chronological order

After deserializing an optimizer, use this method to restore saved parameter states to new tensors. Parameters must be provided in the same chronological order they were originally added to the optimizer. Shape validation ensures parameter compatibility.

§Arguments
  • parameters - Slice of parameter references in chronological order
§Returns

Result indicating success or failure with detailed error message

§Panics

Panics if any parameter does not have requires_grad set to true

Source

pub fn config(&self) -> &AdamConfig

Get the current optimizer configuration

Returns a reference to the current configuration, allowing inspection of all hyperparameters without modification.

§Returns

Reference to the current Adam configuration

Trait Implementations§

Source§

impl Default for Adam

Source§

fn default() -> Self

Returns the “default value” for a type. Read more
Source§

impl FromFieldValue for Adam

Source§

fn from_field_value( value: FieldValue, field_name: &str, ) -> SerializationResult<Self>

Create Adam from FieldValue

§Arguments
  • value - FieldValue containing optimizer data
  • field_name - Name of the field being deserialized (for error messages)
§Returns

Reconstructed Adam instance or error if deserialization fails

Source§

impl Optimizer for Adam

Source§

fn step(&mut self, parameters: &mut [&mut Tensor])

Perform a single optimization step

Updates all provided parameters from their accumulated gradients using the Adam update rule, applying bias correction and, when enabled, the AMSGrad variant. All parameters must be linked to the optimizer before calling this method.

§Arguments
  • parameters - Mutable slice of parameter references to update
§Thread Safety

This method is thread-safe as it takes mutable references to parameters, ensuring exclusive access during updates.

§Performance
  • Uses SIMD (AVX2) when available, processing eight f32 lanes per instruction
  • Processes parameters in sequence for optimal cache usage
  • Maintains per-parameter state for momentum and velocity estimates
§Panics

Panics if any parameter is not linked to the optimizer

Source§

fn zero_grad(&mut self, parameters: &mut [&mut Tensor])

Zero out all parameter gradients

Clears accumulated gradients for all provided parameters. This should be called before each backward pass to prevent gradient accumulation across multiple forward/backward passes. Also clears the global autograd gradient map.

§Arguments
  • parameters - Mutable slice of parameter references to clear gradients for
§Performance
  • Efficiently clears gradients using optimized tensor operations
  • Clears both per-tensor gradients and global autograd state
  • Thread-safe as it takes mutable references to parameters
Source§

fn learning_rate(&self) -> f32

Get the current learning rate

Returns the current learning rate used for parameter updates.

§Returns

Current learning rate as f32

Source§

fn set_learning_rate(&mut self, lr: f32)

Set the learning rate for all parameters

Updates the learning rate for all parameters in the optimizer. This allows dynamic learning rate scheduling during training.

§Arguments
  • lr - New learning rate value
Source§

impl Serializable for Adam

Source§

fn to_json(&self) -> SerializationResult<String>

Serialize the Adam optimizer to JSON format

This method converts the Adam optimizer into a human-readable JSON string representation that includes all optimizer state, configuration, parameter states, and step counts. The JSON format is suitable for debugging, configuration files, and cross-language interoperability.

§Returns

JSON string representation of the optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let json = optimizer.to_json().unwrap();
assert!(!json.is_empty());
Source§

fn from_json(json: &str) -> SerializationResult<Self>

Deserialize an Adam optimizer from JSON format

This method parses a JSON string and reconstructs an Adam optimizer with all saved state. Parameters must be re-linked after deserialization using add_parameter or relink_parameters.

§Arguments
  • json - JSON string containing serialized optimizer
§Returns

The deserialized optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§

fn to_binary(&self) -> SerializationResult<Vec<u8>>

Serialize the Adam optimizer to binary format

This method converts the optimizer into a compact binary representation optimized for storage and transmission. The binary format provides maximum performance and minimal file sizes compared to JSON.

§Returns

Binary representation of the optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let binary = optimizer.to_binary().unwrap();
assert!(!binary.is_empty());
Source§

fn from_binary(data: &[u8]) -> SerializationResult<Self>

Deserialize an Adam optimizer from binary format

This method parses binary data and reconstructs an Adam optimizer with all saved state. Parameters must be re-linked after deserialization using add_parameter or relink_parameters.

§Arguments
  • data - Binary data containing serialized optimizer
§Returns

The deserialized optimizer on success, or SerializationError on failure

§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

let binary = optimizer.to_binary().unwrap();
let loaded_optimizer = Adam::from_binary(&binary).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§

fn save<P: AsRef<Path>>( &self, path: P, format: Format, ) -> SerializationResult<()>

Save the object to a file in the specified format Read more
Source§

fn save_to_writer<W: Write>( &self, writer: &mut W, format: Format, ) -> SerializationResult<()>

Save the object to a writer in the specified format Read more
Source§

fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>

Load an object from a file in the specified format Read more
Source§

fn load_from_reader<R: Read>( reader: &mut R, format: Format, ) -> SerializationResult<Self>

Load an object from a reader in the specified format Read more
Source§

impl StructSerializable for Adam

Source§

fn to_serializer(&self) -> StructSerializer

Convert Adam to StructSerializer for serialization

Serializes all optimizer state including configuration, parameter states, and global step count. Parameter linking is not serialized and must be done after deserialization.

§Returns

StructSerializer containing all serializable optimizer state

Source§

fn from_deserializer( deserializer: &mut StructDeserializer, ) -> SerializationResult<Self>

Create Adam from StructDeserializer

Reconstructs Adam optimizer from serialized state. Parameters must be linked separately using add_parameter or add_parameters.

§Arguments
  • deserializer - StructDeserializer containing optimizer data
§Returns

Reconstructed Adam instance without parameter links, or error if deserialization fails

Source§

fn save_json<P: AsRef<Path>>(&self, path: P) -> SerializationResult<()>

Saves the struct to a JSON file Read more
Source§

fn save_binary<P: AsRef<Path>>(&self, path: P) -> SerializationResult<()>

Saves the struct to a binary file Read more
Source§

fn load_json<P: AsRef<Path>>(path: P) -> SerializationResult<Self>

Loads the struct from a JSON file Read more
Source§

fn load_binary<P: AsRef<Path>>(path: P) -> SerializationResult<Self>

Loads the struct from a binary file Read more
Source§

fn to_json(&self) -> SerializationResult<String>

Converts the struct to a JSON string Read more
Source§

fn to_binary(&self) -> SerializationResult<Vec<u8>>

Converts the struct to binary data Read more
Source§

fn from_json(json: &str) -> SerializationResult<Self>

Creates the struct from a JSON string Read more
Source§

fn from_binary(data: &[u8]) -> SerializationResult<Self>

Creates the struct from binary data Read more

Auto Trait Implementations§

impl Freeze for Adam

impl RefUnwindSafe for Adam

impl Send for Adam

impl Sync for Adam

impl Unpin for Adam

impl UnwindSafe for Adam

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToFieldValue for T
where T: Serializable,

Source§

fn to_field_value(&self) -> FieldValue

Converts the value to a FieldValue for serialization Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.