pub struct Adam { /* private fields */ }
Adam optimizer for neural network parameter optimization
Implements the Adam optimization algorithm with a PyTorch-compatible interface. The optimizer maintains per-parameter state for momentum and velocity estimates, giving each parameter an adaptive learning rate that improves convergence across diverse architectures.
§Usage Pattern
The optimizer uses ID-based parameter linking for maximum flexibility and thread safety (a minimal sketch follows this list):
- Parameters are linked to the optimizer via add_parameter or add_parameters
- The step method takes mutable references to parameters for thread-safe updates
- Parameter states are maintained by tensor ID, allowing for dynamic parameter management
- Supports serialization and deserialization with parameter re-linking
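A minimal sketch of this pattern, composed only of calls that appear in the examples later on this page:
use train_station::{Tensor, optimizers::Adam};
// Parameters must have requires_grad enabled before linking
let mut weight = Tensor::randn(vec![4, 2], Some(0)).with_requires_grad();
let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
// Link by tensor ID, then step with mutable references
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
optimizer.add_parameter(&bias);
let mut loss = weight.sum() + bias.sum();
loss.backward(None);
optimizer.step(&mut [&mut weight, &mut bias]);
optimizer.zero_grad(&mut [&mut weight, &mut bias]);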
§Dynamic Parameter Management
Parameters can be added, removed, or re-linked at runtime (a hedged sketch follows this list):
- add_parameter: Link a single parameter
- add_parameters: Link multiple parameters at once
- unlink_parameter: Remove parameter state by ID
- clear_states: Remove all parameter states
- is_parameter_linked: Check if a parameter is linked
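A hedged sketch of runtime parameter management; the methods named above exist, but the exact argument types shown here (a tensor reference for is_parameter_linked, a tensor ID for unlink_parameter) are assumptions:
use train_station::{Tensor, optimizers::Adam};
let weight = Tensor::ones(vec![8, 4]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
// Assumed signature: is_parameter_linked takes a tensor reference
assert!(optimizer.is_parameter_linked(&weight));
// Assumed call shape: unlink_parameter removes state "by ID" per the list above
// optimizer.unlink_parameter(/* tensor id of `weight` */);
// Drop all accumulated momentum/velocity state
optimizer.clear_states();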
§Serialization Support
The optimizer supports full serialization and deserialization with state preservation (see the sketch after this list):
- Parameter states are saved with their shapes and insertion order for validation
- After deserialization, use relink_parameters to restore saved states to new tensors
- Parameters must be re-linked in the same chronological order they were originally added
- Shape validation ensures consistency between saved and current parameters
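A sketch of the save/restore flow; to_json, from_json, and saved_parameter_count appear in the example further down, while the exact call shape of relink_parameters is an assumption (the firm requirement is re-linking in the original insertion order):
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![10, 5]).with_requires_grad();
let bias = Tensor::zeros(vec![5]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight); // linked first
optimizer.add_parameter(&bias);   // linked second
// Round-trip the optimizer state through JSON
let json = optimizer.to_json().unwrap();
let restored = Adam::from_json(&json).unwrap();
assert_eq!(restored.saved_parameter_count(), 2);
// Assumed call shape: re-link tensors in the same order they were originally added
// restored.relink_parameters(&[&weight, &bias]);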
§Features
- ID-Based Parameter Linking: Dynamic parameter management via tensor IDs
- Thread-Safe Step Method: Takes mutable references for safe concurrent access
- Per-Parameter State: Each parameter maintains its own momentum and velocity buffers
- Bias Correction: Automatically corrects initialization bias in moment estimates (see the update sketch after this list)
- Weight Decay: Optional L2 regularization with efficient implementation
- AMSGrad Support: Optional AMSGrad variant for improved convergence stability
- SIMD Optimization: AVX2-optimized updates for maximum performance
- Full Serialization: Complete state persistence and restoration
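For reference, the per-element update behind these features is the standard bias-corrected Adam rule (Kingma & Ba, 2015). The scalar sketch below is illustrative only and is not this crate's fused SIMD kernel; with weight_decay > 0 the gradient is first augmented with weight_decay * theta (the L2-style regularization noted above), and with amsgrad enabled the denominator uses the running maximum of the corrected second moment instead of the current value.
/// One scalar Adam step with bias correction (illustrative, not the crate's internal kernel).
fn adam_scalar_step(
    theta: &mut f32, m: &mut f32, v: &mut f32, grad: f32,
    lr: f32, beta1: f32, beta2: f32, eps: f32, t: u32,
) {
    *m = beta1 * *m + (1.0 - beta1) * grad;          // first moment (momentum)
    *v = beta2 * *v + (1.0 - beta2) * grad * grad;   // second moment (velocity)
    let m_hat = *m / (1.0 - beta1.powi(t as i32));   // bias correction, t starts at 1
    let v_hat = *v / (1.0 - beta2.powi(t as i32));
    *theta -= lr * m_hat / (v_hat.sqrt() + eps);
}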
§Thread Safety
This type is thread-safe and can be shared between threads. The step method takes mutable references to parameters, ensuring exclusive access during updates.
§Implementations
impl Adam
pub fn saved_parameter_count(&self) -> usize
Get the number of saved parameter states for checkpoint validation
This method returns the count of parameter states currently stored in the optimizer, which is essential for validating checkpoint integrity and ensuring proper parameter re-linking after deserialization. The count includes all parameters that have been linked to the optimizer and have accumulated optimization state.
§Returns
Number of parameter states currently stored in the optimizer
§Usage Patterns
§Checkpoint Validation
After deserializing an optimizer, this method helps verify that the expected number of parameter states was saved and can guide the re-linking process.
§Training Resumption
When resuming training, compare this count with the number of parameters in your model to ensure checkpoint compatibility.
§State Management
Use this method to monitor optimizer state growth and memory usage during training with dynamic parameter addition.
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![10, 5]).with_requires_grad();
let bias = Tensor::zeros(vec![5]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
optimizer.add_parameter(&bias);
// Check parameter count before serialization
assert_eq!(optimizer.saved_parameter_count(), 2);
// Serialize and deserialize
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
// Verify parameter count is preserved
assert_eq!(loaded_optimizer.saved_parameter_count(), 2);
§Performance
- Time Complexity: O(1) - Direct access to internal state count
- Memory Usage: No additional memory allocation
- Thread Safety: Safe to call from multiple threads concurrently
impl Adam
pub fn new() -> Self
Create a new Adam optimizer with default configuration
Initializes an Adam optimizer with PyTorch-compatible default hyperparameters.
Parameters must be linked separately using add_parameter or add_parameters.
§Returns
A new Adam optimizer instance with default hyperparameters
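A minimal construction sketch using only calls that appear in the repository examples below:
use train_station::{Tensor, optimizers::Adam};
let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
// Defaults are PyTorch-compatible; inspect them at runtime rather than hard-coding values
println!("default learning rate: {}", optimizer.learning_rate());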
Examples found in repository:
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims(),
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims(),
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
More examples
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85 println!("--- Default Adam Configuration ---");
86
87 // Create a simple regression problem: y = 2*x + 1
88 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91 // Create model parameters
92 let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95 // Create Adam optimizer with default configuration
96 let mut optimizer = Adam::new();
97 optimizer.add_parameter(&weight);
98 optimizer.add_parameter(&bias);
99
100 println!("Default Adam configuration:");
101 println!(" Learning rate: {}", optimizer.learning_rate());
102 println!(" Initial weight: {:.6}", weight.value());
103 println!(" Initial bias: {:.6}", bias.value());
104
105 // Training loop
106 let num_epochs = 50;
107 let mut losses = Vec::new();
108
109 for epoch in 0..num_epochs {
110 // Forward pass
111 let y_pred = x_data.matmul(&weight) + &bias;
112 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114 // Backward pass
115 loss.backward(None);
116
117 // Optimizer step
118 optimizer.step(&mut [&mut weight, &mut bias]);
119 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121 losses.push(loss.value());
122
123 if epoch % 10 == 0 || epoch == num_epochs - 1 {
124 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125 }
126 }
127
128 // Evaluate final model
129 let _final_predictions = x_data.matmul(&weight) + &bias;
130 println!("\nFinal model:");
131 println!(" Learned weight: {:.6} (target: 2.0)", weight.value());
132 println!(" Learned bias: {:.6} (target: 1.0)", bias.value());
133 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
134
135 Ok(())
136}
pub fn with_config(config: AdamConfig) -> Self
Create a new Adam optimizer with custom configuration
Allows full control over all Adam hyperparameters for specialized training
scenarios such as fine-tuning, transfer learning, or research applications.
Parameters must be linked separately using add_parameter or add_parameters.
§Arguments
config - Adam configuration with custom hyperparameters
§Returns
A new Adam optimizer instance with the specified configuration
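A short construction sketch; the AdamConfig field names match those used in the repository examples below, while the import path placing AdamConfig alongside Adam is an assumption:
use train_station::optimizers::{Adam, AdamConfig};
let config = AdamConfig {
    learning_rate: 3e-4, // illustrative values throughout
    beta1: 0.9,
    beta2: 0.999,
    eps: 1e-8,
    weight_decay: 0.01,
    amsgrad: true,
};
let optimizer = Adam::with_config(config);
println!("lr = {}", optimizer.learning_rate());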
Examples found in repository:
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims(),
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims(),
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318 // Create training data
319 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322 // Create model parameters
323 let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326 // Create optimizer with custom configuration
327 let adam_config = AdamConfig {
328 learning_rate: config.learning_rate,
329 beta1: config.beta1,
330 beta2: config.beta2,
331 eps: 1e-8,
332 weight_decay: config.weight_decay,
333 amsgrad: false,
334 };
335
336 let mut optimizer = Adam::with_config(adam_config);
337 optimizer.add_parameter(&weight);
338 optimizer.add_parameter(&bias);
339
340 // Training loop
341 let mut losses = Vec::new();
342 let mut convergence_epoch = config.epochs;
343
344 for epoch in 0..config.epochs {
345 // Forward pass
346 let y_pred = x_data.matmul(&weight) + &bias;
347 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349 // Backward pass
350 loss.backward(None);
351
352 // Optimizer step
353 optimizer.step(&mut [&mut weight, &mut bias]);
354 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356 let loss_value = loss.value();
357 losses.push(loss_value);
358
359 // Check for convergence (loss < 0.01)
360 if loss_value < 0.01 && convergence_epoch == config.epochs {
361 convergence_epoch = epoch;
362 }
363 }
364
365 Ok(TrainingStats {
366 config,
367 final_loss: losses[losses.len() - 1],
368 loss_history: losses,
369 convergence_epoch,
370 weight_norm: weight.norm().value(),
371 })
372}
218fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
219 println!("\n--- Training Loop ---");
220
221 // Create layer and training data
222 let mut layer = LinearLayer::new(2, 1, Some(45));
223
224 // Simple regression task: y = 2*x1 + 3*x2 + 1
225 let x_data = Tensor::from_slice(
226 &[
227 1.0, 1.0, // x1=1, x2=1 -> y=6
228 2.0, 1.0, // x1=2, x2=1 -> y=8
229 1.0, 2.0, // x1=1, x2=2 -> y=9
230 2.0, 2.0, // x1=2, x2=2 -> y=11
231 ],
232 vec![4, 2],
233 )
234 .unwrap();
235
236 let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
237
238 println!("Training data:");
239 println!(" X shape: {:?}", x_data.shape().dims());
240 println!(" Y shape: {:?}", y_true.shape().dims());
241 println!(" Target function: y = 2*x1 + 3*x2 + 1");
242
243 // Create optimizer
244 let config = AdamConfig {
245 learning_rate: 0.01,
246 beta1: 0.9,
247 beta2: 0.999,
248 eps: 1e-8,
249 weight_decay: 0.0,
250 amsgrad: false,
251 };
252
253 let mut optimizer = Adam::with_config(config);
254 let params = layer.parameters();
255 for param in &params {
256 optimizer.add_parameter(param);
257 }
258
259 println!("Optimizer setup complete. Starting training...");
260
261 // Training loop
262 let num_epochs = 100;
263 let mut losses = Vec::new();
264
265 for epoch in 0..num_epochs {
266 // Forward pass
267 let y_pred = layer.forward(&x_data);
268
269 // Compute loss: MSE
270 let diff = y_pred.sub_tensor(&y_true);
271 let mut loss = diff.pow_scalar(2.0).mean();
272
273 // Backward pass
274 loss.backward(None);
275
276 // Optimizer step
277 let mut params = layer.parameters();
278 optimizer.step(&mut params);
279 optimizer.zero_grad(&mut params);
280
281 losses.push(loss.value());
282
283 // Print progress
284 if epoch % 20 == 0 || epoch == num_epochs - 1 {
285 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
286 }
287 }
288
289 // Evaluate final model
290 let final_predictions = layer.forward_no_grad(&x_data);
291
292 println!("\nFinal model evaluation:");
293 println!(" Learned weights: {:?}", layer.weight.data());
294 println!(" Learned bias: {:?}", layer.bias.data());
295 println!(" Target weights: [2.0, 3.0]");
296 println!(" Target bias: [1.0]");
297
298 println!(" Predictions vs True:");
299 for i in 0..4 {
300 let pred = final_predictions.data()[i];
301 let true_val = y_true.data()[i];
302 println!(
303 " Sample {}: pred={:.3}, true={:.1}, error={:.3}",
304 i + 1,
305 pred,
306 true_val,
307 (pred - true_val).abs()
308 );
309 }
310
311 // Training analysis
312 let initial_loss = losses[0];
313 let final_loss = losses[losses.len() - 1];
314 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
315
316 println!("\nTraining Analysis:");
317 println!(" Initial loss: {:.6}", initial_loss);
318 println!(" Final loss: {:.6}", final_loss);
319 println!(" Loss reduction: {:.1}%", loss_reduction);
320
321 Ok(())
322}
pub fn with_learning_rate(learning_rate: f32) -> Self
Create a new Adam optimizer with custom learning rate
A convenience constructor that allows setting only the learning rate while
using default values for all other hyperparameters. Parameters must be
linked separately using add_parameter or add_parameters.
§Arguments
learning_rate - Learning rate for optimization
§Returns
A new Adam optimizer instance with the specified learning rate and default values for all other hyperparameters
Examples found in repository:
319fn train_with_scheduler(
320 scheduler: &mut dyn LearningRateScheduler,
321 num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323 // Create training data: y = 2*x + 1
324 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327 // Create model parameters
328 let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331 // Create optimizer with initial learning rate
332 let mut optimizer = Adam::with_learning_rate(0.05);
333 optimizer.add_parameter(&weight);
334 optimizer.add_parameter(&bias);
335
336 // Training loop
337 let mut losses = Vec::new();
338 let mut lr_history = Vec::new();
339 let mut convergence_epoch = num_epochs;
340
341 for epoch in 0..num_epochs {
342 // Forward pass
343 let y_pred = x_data.matmul(&weight) + &bias;
344 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346 // Backward pass
347 loss.backward(None);
348
349 // Update learning rate using scheduler
350 let current_lr = optimizer.learning_rate();
351 let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353 if (new_lr - current_lr).abs() > 1e-8 {
354 optimizer.set_learning_rate(new_lr);
355 }
356
357 // Optimizer step
358 optimizer.step(&mut [&mut weight, &mut bias]);
359 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361 let loss_value = loss.value();
362 losses.push(loss_value);
363 lr_history.push(new_lr);
364
365 // Check for convergence
366 if loss_value < 0.01 && convergence_epoch == num_epochs {
367 convergence_epoch = epoch;
368 }
369 }
370
371 Ok(TrainingStats {
372 scheduler_name: scheduler.name().to_string(),
373 final_loss: losses[losses.len() - 1],
374 lr_history,
375 loss_history: losses,
376 convergence_epoch,
377 })
378}
More examples
371fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
372 println!("\n--- Serialization ---");
373
374 // Create and train a simple layer
375 let mut original_layer = LinearLayer::new(2, 1, Some(47));
376
377 // Simple training data
378 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
379 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
380
381 let mut optimizer = Adam::with_learning_rate(0.01);
382 let params = original_layer.parameters();
383 for param in &params {
384 optimizer.add_parameter(param);
385 }
386
387 // Train for a few epochs
388 for _ in 0..10 {
389 let y_pred = original_layer.forward(&x_data);
390 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
391 loss.backward(None);
392
393 let mut params = original_layer.parameters();
394 optimizer.step(&mut params);
395 optimizer.zero_grad(&mut params);
396 }
397
398 println!("Original layer trained");
399 println!(" Weight: {:?}", original_layer.weight.data());
400 println!(" Bias: {:?}", original_layer.bias.data());
401
402 // Save layer
403 original_layer.save_json("temp_linear_layer")?;
404
405 // Load layer
406 let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
407
408 println!("Loaded layer");
409 println!(" Weight: {:?}", loaded_layer.weight.data());
410 println!(" Bias: {:?}", loaded_layer.bias.data());
411
412 // Verify consistency
413 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
414 let original_output = original_layer.forward_no_grad(&test_input);
415 let loaded_output = loaded_layer.forward_no_grad(&test_input);
416
417 println!("Consistency check:");
418 println!(" Original output: {:?}", original_output.data());
419 println!(" Loaded output: {:?}", loaded_output.data());
420 println!(
421 " Match: {}",
422 original_output
423 .data()
424 .iter()
425 .zip(loaded_output.data().iter())
426 .all(|(a, b)| (a - b).abs() < 1e-6)
427 );
428
429 println!("Serialization verification: PASSED");
430
431 Ok(())
432}
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205 println!("\n--- Model Checkpointing ---");
206
207 // Create a simple model (weights and bias)
208 let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209 let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211 // Create optimizer
212 let mut optimizer = Adam::with_learning_rate(0.01);
213 optimizer.add_parameter(&weights);
214 optimizer.add_parameter(&bias);
215
216 println!("Initial weights: {:?}", weights.data());
217 println!("Initial bias: {:?}", bias.data());
218
219 // Simulate training
220 for epoch in 0..5 {
221 let mut loss = weights.sum() + bias.sum();
222 loss.backward(None);
223 optimizer.step(&mut [&mut weights, &mut bias]);
224 optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226 if epoch % 2 == 0 {
227 // Save checkpoint
228 let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229 fs::create_dir_all(&checkpoint_dir)?;
230
231 weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232 bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233 optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235 println!("Saved checkpoint for epoch {}", epoch);
236 }
237 }
238
239 // Load from checkpoint
240 let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241 let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242 let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244 println!("Loaded weights: {:?}", loaded_weights.data());
245 println!("Loaded bias: {:?}", loaded_bias.data());
246 println!(
247 "Loaded optimizer learning rate: {}",
248 loaded_optimizer.learning_rate()
249 );
250
251 // Verify checkpoint integrity
252 assert_eq!(weights.shape().dims(), loaded_weights.shape().dims());
253 assert_eq!(bias.shape().dims(), loaded_bias.shape().dims());
254 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256 println!("Checkpointing verification: PASSED");
257
258 Ok(())
259}
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106 println!("\n--- Linear Regression Training ---");
107
108 // Create model parameters
109 let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112 // Create optimizer
113 let mut optimizer = Adam::with_learning_rate(0.01);
114 optimizer.add_parameter(&weight);
115 optimizer.add_parameter(&bias);
116
117 // Create simple training data: y = 2*x + 1
118 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121 println!("Training data:");
122 println!(" X: {:?}", x_data.data());
123 println!(" Y: {:?}", y_true.data());
124 println!(" Target: y = 2*x + 1");
125
126 // Training loop
127 let num_epochs = 100;
128 let mut losses = Vec::new();
129
130 for epoch in 0..num_epochs {
131 // Forward pass: y_pred = x * weight + bias
132 let y_pred = x_data.matmul(&weight) + &bias;
133
134 // Compute loss: MSE
135 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137 // Backward pass
138 loss.backward(None);
139
140 // Optimizer step
141 optimizer.step(&mut [&mut weight, &mut bias]);
142 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144 losses.push(loss.value());
145
146 // Print progress every 20 epochs
147 if epoch % 20 == 0 || epoch == num_epochs - 1 {
148 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149 }
150 }
151
152 // Evaluate final model
153 let final_predictions = x_data.matmul(&weight) + &bias;
154 println!("\nFinal model evaluation:");
155 println!(" Learned weight: {:.6}", weight.value());
156 println!(" Learned bias: {:.6}", bias.value());
157 println!(" Predictions vs True:");
158
159 for i in 0..5 {
160 let x1 = x_data.data()[i];
161 let pred = final_predictions.data()[i];
162 let true_val = y_true.data()[i];
163 println!(
164 " x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165 x1,
166 pred,
167 true_val,
168 (pred - true_val).abs()
169 );
170 }
171
172 Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177 println!("\n--- Advanced Training Patterns ---");
178
179 // Create a more complex model
180 let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181 let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183 // Create optimizer with different learning rate
184 let mut optimizer = Adam::with_learning_rate(0.005);
185 optimizer.add_parameter(&weight);
186 optimizer.add_parameter(&bias);
187
188 // Create training data: y = 2*x + [1, 3]
189 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190 let y_true = Tensor::from_slice(
191 &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192 vec![5, 2],
193 )
194 .unwrap();
195
196 println!("Advanced training with monitoring:");
197 println!(" Initial learning rate: {}", optimizer.learning_rate());
198
199 // Training loop with monitoring
200 let num_epochs = 50;
201 let mut losses = Vec::new();
202 let mut weight_norms = Vec::new();
203 let mut gradient_norms = Vec::new();
204
205 for epoch in 0..num_epochs {
206 // Forward pass
207 let y_pred = x_data.matmul(&weight) + &bias;
208 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210 // Backward pass
211 loss.backward(None);
212
213 // Compute gradient norm before optimizer step
214 let gradient_norm = weight.grad_owned().unwrap().norm();
215
216 // Optimizer step
217 optimizer.step(&mut [&mut weight, &mut bias]);
218 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220 // Learning rate scheduling: reduce every 10 epochs
221 if epoch > 0 && epoch % 10 == 0 {
222 let current_lr = optimizer.learning_rate();
223 let new_lr = current_lr * 0.5;
224 optimizer.set_learning_rate(new_lr);
225 println!(
226 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227 epoch, current_lr, new_lr
228 );
229 }
230
231 // Record metrics
232 losses.push(loss.value());
233 weight_norms.push(weight.norm().value());
234 gradient_norms.push(gradient_norm.value());
235
236 // Print detailed progress
237 if epoch % 10 == 0 || epoch == num_epochs - 1 {
238 println!(
239 "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240 epoch,
241 loss.value(),
242 weight.norm().value(),
243 gradient_norm.value()
244 );
245 }
246 }
247
248 println!("Final learning rate: {}", optimizer.learning_rate());
249
250 // Analyze training progression
251 let initial_loss = losses[0];
252 let final_loss = losses[losses.len() - 1];
253 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255 println!("\nTraining Analysis:");
256 println!(" Initial loss: {:.6}", initial_loss);
257 println!(" Final loss: {:.6}", final_loss);
258 println!(" Loss reduction: {:.1}%", loss_reduction);
259 println!(" Final weight norm: {:.6}", weight.norm().value());
260 println!(" Final bias: {:?}", bias.data());
261
262 Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267 println!("\n--- Learning Rate Scheduling ---");
268
269 // Create simple model
270 let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273 // Create optimizer with high initial learning rate
274 let mut optimizer = Adam::with_learning_rate(0.1);
275 optimizer.add_parameter(&weight);
276 optimizer.add_parameter(&bias);
277
278 // Simple data
279 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280 let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282 println!("Initial learning rate: {}", optimizer.learning_rate());
283
284 // Training loop with learning rate scheduling
285 let num_epochs = 50;
286 let mut losses = Vec::new();
287
288 for epoch in 0..num_epochs {
289 // Forward pass
290 let y_pred = x_data.matmul(&weight) + &bias;
291 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293 // Backward pass
294 loss.backward(None);
295
296 // Optimizer step
297 optimizer.step(&mut [&mut weight, &mut bias]);
298 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300 // Learning rate scheduling: reduce every 10 epochs
301 if epoch > 0 && epoch % 10 == 0 {
302 let current_lr = optimizer.learning_rate();
303 let new_lr = current_lr * 0.5;
304 optimizer.set_learning_rate(new_lr);
305 println!(
306 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307 epoch, current_lr, new_lr
308 );
309 }
310
311 losses.push(loss.value());
312
313 // Print progress
314 if epoch % 10 == 0 || epoch == num_epochs - 1 {
315 println!(
316 "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317 epoch,
318 loss.value(),
319 optimizer.learning_rate()
320 );
321 }
322 }
323
324 println!("Final learning rate: {}", optimizer.learning_rate());
325
326 Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331 println!("\n--- Training Monitoring ---");
332
333 // Create model
334 let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337 // Create optimizer
338 let mut optimizer = Adam::with_learning_rate(0.01);
339 optimizer.add_parameter(&weight);
340 optimizer.add_parameter(&bias);
341
342 // Training data
343 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346 // Training loop with comprehensive monitoring
347 let num_epochs = 30;
348 let mut losses = Vec::new();
349 let mut weight_history = Vec::new();
350 let mut bias_history = Vec::new();
351
352 for epoch in 0..num_epochs {
353 // Forward pass
354 let y_pred = x_data.matmul(&weight) + &bias;
355 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357 // Backward pass
358 loss.backward(None);
359
360 // Optimizer step
361 optimizer.step(&mut [&mut weight, &mut bias]);
362 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364 // Record history
365 losses.push(loss.value());
366 weight_history.push(weight.value());
367 bias_history.push(bias.value());
368
369 // Print detailed monitoring
370 if epoch % 5 == 0 || epoch == num_epochs - 1 {
371 println!(
372 "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373 epoch,
374 loss.value(),
375 weight.value(),
376 bias.value()
377 );
378 }
379 }
380
381 // Analyze training progression
382 println!("\nTraining Analysis:");
383 println!(" Initial loss: {:.6}", losses[0]);
384 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
385 println!(
386 " Loss reduction: {:.1}%",
387 (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388 );
389
390 // Compute statistics
391 let loss_mean = compute_mean(&losses);
392 let loss_std = compute_std(&losses);
393 let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394 let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396 println!(" Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397 println!(" Weight change: {:.6}", weight_change);
398 println!(" Bias change: {:.6}", bias_change);
399 println!(" Final weight norm: {:.6}", weight.norm().value());
400 println!(" Final bias: {:.6}", bias.value());
401
402 Ok(())
403}
431fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
432 println!("\n--- Training Workflow ---");
433
434 // Create a simple classification network
435 let config = FeedForwardConfig {
436 input_size: 2,
437 hidden_sizes: vec![4, 3],
438 output_size: 1,
439 use_bias: true,
440 };
441 let mut network = FeedForwardNetwork::new(config, Some(46));
442
443 println!("Training network: 2 -> [4, 3] -> 1");
444
445 // Create simple binary classification data: XOR problem
446 let x_data = Tensor::from_slice(
447 &[
448 0.0, 0.0, // -> 0
449 0.0, 1.0, // -> 1
450 1.0, 0.0, // -> 1
451 1.0, 1.0, // -> 0
452 ],
453 vec![4, 2],
454 )
455 .unwrap();
456
457 let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
458
459 println!("Training on XOR problem:");
460 println!(" Input shape: {:?}", x_data.shape().dims());
461 println!(" Target shape: {:?}", y_true.shape().dims());
462
463 // Create optimizer
464 let mut optimizer = Adam::with_learning_rate(0.1);
465 let params = network.parameters();
466 for param in &params {
467 optimizer.add_parameter(param);
468 }
469
470 // Training loop
471 let num_epochs = 50;
472 let mut losses = Vec::new();
473
474 for epoch in 0..num_epochs {
475 // Forward pass
476 let y_pred = network.forward(&x_data);
477
478 // Compute loss: MSE
479 let diff = y_pred.sub_tensor(&y_true);
480 let mut loss = diff.pow_scalar(2.0).mean();
481
482 // Backward pass
483 loss.backward(None);
484
485 // Optimizer step and zero grad
486 let mut params = network.parameters();
487 optimizer.step(&mut params);
488 optimizer.zero_grad(&mut params);
489
490 losses.push(loss.value());
491
492 // Print progress
493 if epoch % 10 == 0 || epoch == num_epochs - 1 {
494 println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
495 }
496 }
497
498 // Test final model
499 let final_predictions = network.forward_no_grad(&x_data);
500 println!("\nFinal predictions vs targets:");
501 for i in 0..4 {
502 let pred = final_predictions.data()[i];
503 let target = y_true.data()[i];
504 let input_x = x_data.data()[i * 2];
505 let input_y = x_data.data()[i * 2 + 1];
506 println!(
507 " [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
508 input_x,
509 input_y,
510 pred,
511 target,
512 (pred - target).abs()
513 );
514 }
515
516 Ok(())
517}
518
519/// Demonstrate comprehensive training with 100+ steps
520fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
521 println!("\n--- Comprehensive Training (100+ Steps) ---");
522
523 // Create a regression network
524 let config = FeedForwardConfig {
525 input_size: 3,
526 hidden_sizes: vec![8, 6, 4],
527 output_size: 2,
528 use_bias: true,
529 };
530 let mut network = FeedForwardNetwork::new(config, Some(47));
531
532 println!("Network architecture: 3 -> [8, 6, 4] -> 2");
533 println!("Total parameters: {}", network.parameter_count());
534
535 // Create synthetic regression data
536 // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
537 let num_samples = 32;
538 let mut x_vec = Vec::new();
539 let mut y_vec = Vec::new();
540
541 for i in 0..num_samples {
542 let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
543 let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
544 let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
545
546 let y1 = x1 + 2.0 * x2 - x3;
547 let y2 = x1 * x2 + x3;
548
549 x_vec.extend_from_slice(&[x1, x2, x3]);
550 y_vec.extend_from_slice(&[y1, y2]);
551 }
552
553 let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
554 let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
555
556 println!("Training data:");
557 println!(" {} samples", num_samples);
558 println!(" Input shape: {:?}", x_data.shape().dims());
559 println!(" Target shape: {:?}", y_true.shape().dims());
560
561 // Create optimizer with learning rate scheduling
562 let mut optimizer = Adam::with_learning_rate(0.01);
563 let params = network.parameters();
564 for param in &params {
565 optimizer.add_parameter(param);
566 }
567
568 // Comprehensive training loop (150 epochs)
569 let num_epochs = 150;
570 let mut losses = Vec::new();
571 let mut best_loss = f32::INFINITY;
572 let mut patience_counter = 0;
573 let patience = 20;
574
575 println!("Starting comprehensive training...");
576
577 for epoch in 0..num_epochs {
578 // Forward pass
579 let y_pred = network.forward(&x_data);
580
581 // Compute loss: MSE
582 let diff = y_pred.sub_tensor(&y_true);
583 let mut loss = diff.pow_scalar(2.0).mean();
584
585 // Backward pass
586 loss.backward(None);
587
588 // Optimizer step and zero grad
589 let mut params = network.parameters();
590 optimizer.step(&mut params);
591 optimizer.zero_grad(&mut params);
592
593 let current_loss = loss.value();
594 losses.push(current_loss);
595
596 // Learning rate scheduling
597 if epoch > 0 && epoch % 30 == 0 {
598 let new_lr = optimizer.learning_rate() * 0.8;
599 optimizer.set_learning_rate(new_lr);
600 println!(" Reduced learning rate to {:.4}", new_lr);
601 }
602
603 // Early stopping logic
604 if current_loss < best_loss {
605 best_loss = current_loss;
606 patience_counter = 0;
607 } else {
608 patience_counter += 1;
609 }
610
611 // Print progress
612 if epoch % 25 == 0 || epoch == num_epochs - 1 {
613 println!(
614 "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
615 epoch,
616 current_loss,
617 optimizer.learning_rate(),
618 best_loss
619 );
620 }
621
622 // Early stopping
623 if patience_counter >= patience && epoch > 50 {
624 println!("Early stopping at epoch {} (patience exceeded)", epoch);
625 break;
626 }
627 }
628
629 // Final evaluation
630 let final_predictions = network.forward_no_grad(&x_data);
631
632 // Compute final metrics
633 let final_loss = losses[losses.len() - 1];
634 let initial_loss = losses[0];
635 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
636
637 println!("\nTraining completed!");
638 println!(" Initial loss: {:.6}", initial_loss);
639 println!(" Final loss: {:.6}", final_loss);
640 println!(" Best loss: {:.6}", best_loss);
641 println!(" Loss reduction: {:.1}%", loss_reduction);
642 println!(" Final learning rate: {:.4}", optimizer.learning_rate());
643
644 // Sample predictions analysis
645 println!("\nSample predictions (first 5):");
646 for i in 0..5.min(num_samples) {
647 let pred1 = final_predictions.data()[i * 2];
648 let pred2 = final_predictions.data()[i * 2 + 1];
649 let true1 = y_true.data()[i * 2];
650 let true2 = y_true.data()[i * 2 + 1];
651
652 println!(
653 " Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
654 i + 1,
655 pred1,
656 pred2,
657 true1,
658 true2,
659 (pred1 - true1).abs(),
660 (pred2 - true2).abs()
661 );
662 }
663
664 Ok(())
665}
666
667/// Demonstrate network serialization
668fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
669 println!("\n--- Network Serialization ---");
670
671 // Create and train a network
672 let config = FeedForwardConfig {
673 input_size: 2,
674 hidden_sizes: vec![4, 2],
675 output_size: 1,
676 use_bias: true,
677 };
678 let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
679
680 // Quick training
681 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
682 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
683
684 let mut optimizer = Adam::with_learning_rate(0.01);
685 let params = original_network.parameters();
686 for param in &params {
687 optimizer.add_parameter(param);
688 }
689
690 for _ in 0..20 {
691 let y_pred = original_network.forward(&x_data);
692 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
693 loss.backward(None);
694
695 let mut params = original_network.parameters();
696 optimizer.step(&mut params);
697 optimizer.zero_grad(&mut params);
698 }
699
700 // Test original network
701 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
702 let original_output = original_network.forward_no_grad(&test_input);
703
704 println!("Original network output: {:?}", original_output.data());
705
706 // Save network
707 original_network.save_json("temp_feedforward_network")?;
708
709 // Load network
710 let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
711 let loaded_output = loaded_network.forward_no_grad(&test_input);
712
713 println!("Loaded network output: {:?}", loaded_output.data());
714
715 // Verify consistency
716 let match_check = original_output
717 .data()
718 .iter()
719 .zip(loaded_output.data().iter())
720 .all(|(a, b)| (a - b).abs() < 1e-6);
721
722 println!(
723 "Serialization verification: {}",
724 if match_check { "PASSED" } else { "FAILED" }
725 );
726
727 Ok(())
728}
pub fn add_parameter(&mut self, parameter: &Tensor)
Add a single parameter to the optimizer
Links a parameter to the optimizer by creating a new parameter state
indexed by the tensor’s ID. The parameter must have requires_grad set to true.
§Arguments
parameter - Reference to the tensor to link
§Panics
Panics if the parameter does not have requires_grad set to true
Examples found in repository:
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims(),
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims(),
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
103
104/// Demonstrate simple linear regression training
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106 println!("\n--- Linear Regression Training ---");
107
108 // Create model parameters
109 let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112 // Create optimizer
113 let mut optimizer = Adam::with_learning_rate(0.01);
114 optimizer.add_parameter(&weight);
115 optimizer.add_parameter(&bias);
116
117 // Create simple training data: y = 2*x + 1
118 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121 println!("Training data:");
122 println!(" X: {:?}", x_data.data());
123 println!(" Y: {:?}", y_true.data());
124 println!(" Target: y = 2*x + 1");
125
126 // Training loop
127 let num_epochs = 100;
128 let mut losses = Vec::new();
129
130 for epoch in 0..num_epochs {
131 // Forward pass: y_pred = x * weight + bias
132 let y_pred = x_data.matmul(&weight) + &bias;
133
134 // Compute loss: MSE
135 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137 // Backward pass
138 loss.backward(None);
139
140 // Optimizer step
141 optimizer.step(&mut [&mut weight, &mut bias]);
142 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144 losses.push(loss.value());
145
146 // Print progress every 20 epochs
147 if epoch % 20 == 0 || epoch == num_epochs - 1 {
148 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149 }
150 }
151
152 // Evaluate final model
153 let final_predictions = x_data.matmul(&weight) + &bias;
154 println!("\nFinal model evaluation:");
155 println!(" Learned weight: {:.6}", weight.value());
156 println!(" Learned bias: {:.6}", bias.value());
157 println!(" Predictions vs True:");
158
159 for i in 0..5 {
160 let x1 = x_data.data()[i];
161 let pred = final_predictions.data()[i];
162 let true_val = y_true.data()[i];
163 println!(
164 " x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165 x1,
166 pred,
167 true_val,
168 (pred - true_val).abs()
169 );
170 }
171
172 Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177 println!("\n--- Advanced Training Patterns ---");
178
179 // Create a more complex model
180 let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181 let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183 // Create optimizer with different learning rate
184 let mut optimizer = Adam::with_learning_rate(0.005);
185 optimizer.add_parameter(&weight);
186 optimizer.add_parameter(&bias);
187
188 // Create training data: y = 2*x + [1, 3]
189 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190 let y_true = Tensor::from_slice(
191 &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192 vec![5, 2],
193 )
194 .unwrap();
195
196 println!("Advanced training with monitoring:");
197 println!(" Initial learning rate: {}", optimizer.learning_rate());
198
199 // Training loop with monitoring
200 let num_epochs = 50;
201 let mut losses = Vec::new();
202 let mut weight_norms = Vec::new();
203 let mut gradient_norms = Vec::new();
204
205 for epoch in 0..num_epochs {
206 // Forward pass
207 let y_pred = x_data.matmul(&weight) + &bias;
208 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210 // Backward pass
211 loss.backward(None);
212
213 // Compute gradient norm before optimizer step
214 let gradient_norm = weight.grad_owned().unwrap().norm();
215
216 // Optimizer step
217 optimizer.step(&mut [&mut weight, &mut bias]);
218 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220 // Learning rate scheduling: reduce every 10 epochs
221 if epoch > 0 && epoch % 10 == 0 {
222 let current_lr = optimizer.learning_rate();
223 let new_lr = current_lr * 0.5;
224 optimizer.set_learning_rate(new_lr);
225 println!(
226 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227 epoch, current_lr, new_lr
228 );
229 }
230
231 // Record metrics
232 losses.push(loss.value());
233 weight_norms.push(weight.norm().value());
234 gradient_norms.push(gradient_norm.value());
235
236 // Print detailed progress
237 if epoch % 10 == 0 || epoch == num_epochs - 1 {
238 println!(
239 "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240 epoch,
241 loss.value(),
242 weight.norm().value(),
243 gradient_norm.value()
244 );
245 }
246 }
247
248 println!("Final learning rate: {}", optimizer.learning_rate());
249
250 // Analyze training progression
251 let initial_loss = losses[0];
252 let final_loss = losses[losses.len() - 1];
253 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255 println!("\nTraining Analysis:");
256 println!(" Initial loss: {:.6}", initial_loss);
257 println!(" Final loss: {:.6}", final_loss);
258 println!(" Loss reduction: {:.1}%", loss_reduction);
259 println!(" Final weight norm: {:.6}", weight.norm().value());
260 println!(" Final bias: {:?}", bias.data());
261
262 Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267 println!("\n--- Learning Rate Scheduling ---");
268
269 // Create simple model
270 let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273 // Create optimizer with high initial learning rate
274 let mut optimizer = Adam::with_learning_rate(0.1);
275 optimizer.add_parameter(&weight);
276 optimizer.add_parameter(&bias);
277
278 // Simple data
279 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280 let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282 println!("Initial learning rate: {}", optimizer.learning_rate());
283
284 // Training loop with learning rate scheduling
285 let num_epochs = 50;
286 let mut losses = Vec::new();
287
288 for epoch in 0..num_epochs {
289 // Forward pass
290 let y_pred = x_data.matmul(&weight) + &bias;
291 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293 // Backward pass
294 loss.backward(None);
295
296 // Optimizer step
297 optimizer.step(&mut [&mut weight, &mut bias]);
298 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300 // Learning rate scheduling: reduce every 10 epochs
301 if epoch > 0 && epoch % 10 == 0 {
302 let current_lr = optimizer.learning_rate();
303 let new_lr = current_lr * 0.5;
304 optimizer.set_learning_rate(new_lr);
305 println!(
306 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307 epoch, current_lr, new_lr
308 );
309 }
310
311 losses.push(loss.value());
312
313 // Print progress
314 if epoch % 10 == 0 || epoch == num_epochs - 1 {
315 println!(
316 "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317 epoch,
318 loss.value(),
319 optimizer.learning_rate()
320 );
321 }
322 }
323
324 println!("Final learning rate: {}", optimizer.learning_rate());
325
326 Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331 println!("\n--- Training Monitoring ---");
332
333 // Create model
334 let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337 // Create optimizer
338 let mut optimizer = Adam::with_learning_rate(0.01);
339 optimizer.add_parameter(&weight);
340 optimizer.add_parameter(&bias);
341
342 // Training data
343 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346 // Training loop with comprehensive monitoring
347 let num_epochs = 30;
348 let mut losses = Vec::new();
349 let mut weight_history = Vec::new();
350 let mut bias_history = Vec::new();
351
352 for epoch in 0..num_epochs {
353 // Forward pass
354 let y_pred = x_data.matmul(&weight) + &bias;
355 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357 // Backward pass
358 loss.backward(None);
359
360 // Optimizer step
361 optimizer.step(&mut [&mut weight, &mut bias]);
362 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364 // Record history
365 losses.push(loss.value());
366 weight_history.push(weight.value());
367 bias_history.push(bias.value());
368
369 // Print detailed monitoring
370 if epoch % 5 == 0 || epoch == num_epochs - 1 {
371 println!(
372 "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373 epoch,
374 loss.value(),
375 weight.value(),
376 bias.value()
377 );
378 }
379 }
380
381 // Analyze training progression
382 println!("\nTraining Analysis:");
383 println!(" Initial loss: {:.6}", losses[0]);
384 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
385 println!(
386 " Loss reduction: {:.1}%",
387 (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388 );
389
390 // Compute statistics
391 let loss_mean = compute_mean(&losses);
392 let loss_std = compute_std(&losses);
393 let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394 let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396 println!(" Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397 println!(" Weight change: {:.6}", weight_change);
398 println!(" Bias change: {:.6}", bias_change);
399 println!(" Final weight norm: {:.6}", weight.norm().value());
400 println!(" Final bias: {:.6}", bias.value());
401
402 Ok(())
403}
More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}
166
167/// Demonstrate format comparison and performance characteristics
168fn demonstrate_format_comparison() -> Result<(), Box<dyn std::error::Error>> {
169 println!("\n--- Format Comparison ---");
170
171 // Create a larger tensor for comparison
172 let tensor = Tensor::randn(vec![10, 10], Some(44));
173
174 // Save in both formats
175 tensor.save_json("temp_comparison.json")?;
176 tensor.save_binary("temp_comparison.bin")?;
177
178 // Compare file sizes
179 let json_size = fs::metadata("temp_comparison.json")?.len();
180 let binary_size = fs::metadata("temp_comparison.bin")?.len();
181
182 println!("JSON file size: {} bytes", json_size);
183 println!("Binary file size: {} bytes", binary_size);
184 println!(
185 "Compression ratio: {:.2}x",
186 json_size as f64 / binary_size as f64
187 );
188
189 // Load and verify both formats
190 let json_tensor = Tensor::load_json("temp_comparison.json")?;
191 let binary_tensor = Tensor::load_binary("temp_comparison.bin")?;
192
193 assert_eq!(tensor.shape().dims(), json_tensor.shape().dims());
194 assert_eq!(tensor.shape().dims(), binary_tensor.shape().dims());
195 assert_eq!(tensor.data(), json_tensor.data());
196 assert_eq!(tensor.data(), binary_tensor.data());
197
198 println!("Format comparison verification: PASSED");
199
200 Ok(())
201}
202
203/// Demonstrate a basic model checkpointing workflow
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205 println!("\n--- Model Checkpointing ---");
206
207 // Create a simple model (weights and bias)
208 let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209 let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211 // Create optimizer
212 let mut optimizer = Adam::with_learning_rate(0.01);
213 optimizer.add_parameter(&weights);
214 optimizer.add_parameter(&bias);
215
216 println!("Initial weights: {:?}", weights.data());
217 println!("Initial bias: {:?}", bias.data());
218
219 // Simulate training
220 for epoch in 0..5 {
221 let mut loss = weights.sum() + bias.sum();
222 loss.backward(None);
223 optimizer.step(&mut [&mut weights, &mut bias]);
224 optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226 if epoch % 2 == 0 {
227 // Save checkpoint
228 let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229 fs::create_dir_all(&checkpoint_dir)?;
230
231 weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232 bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233 optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235 println!("Saved checkpoint for epoch {}", epoch);
236 }
237 }
238
239 // Load from checkpoint
240 let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241 let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242 let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244 println!("Loaded weights: {:?}", loaded_weights.data());
245 println!("Loaded bias: {:?}", loaded_bias.data());
246 println!(
247 "Loaded optimizer learning rate: {}",
248 loaded_optimizer.learning_rate()
249 );
250
251 // Verify checkpoint integrity
252 assert_eq!(weights.shape().dims(), loaded_weights.shape().dims());
253 assert_eq!(bias.shape().dims(), loaded_bias.shape().dims());
254 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256 println!("Checkpointing verification: PASSED");
257
258 Ok(())
259}
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85 println!("--- Default Adam Configuration ---");
86
87 // Create a simple regression problem: y = 2*x + 1
88 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91 // Create model parameters
92 let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95 // Create Adam optimizer with default configuration
96 let mut optimizer = Adam::new();
97 optimizer.add_parameter(&weight);
98 optimizer.add_parameter(&bias);
99
100 println!("Default Adam configuration:");
101 println!(" Learning rate: {}", optimizer.learning_rate());
102 println!(" Initial weight: {:.6}", weight.value());
103 println!(" Initial bias: {:.6}", bias.value());
104
105 // Training loop
106 let num_epochs = 50;
107 let mut losses = Vec::new();
108
109 for epoch in 0..num_epochs {
110 // Forward pass
111 let y_pred = x_data.matmul(&weight) + &bias;
112 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114 // Backward pass
115 loss.backward(None);
116
117 // Optimizer step
118 optimizer.step(&mut [&mut weight, &mut bias]);
119 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121 losses.push(loss.value());
122
123 if epoch % 10 == 0 || epoch == num_epochs - 1 {
124 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125 }
126 }
127
128 // Evaluate final model
129 let _final_predictions = x_data.matmul(&weight) + &bias;
130 println!("\nFinal model:");
131 println!(" Learned weight: {:.6} (target: 2.0)", weight.value());
132 println!(" Learned bias: {:.6} (target: 1.0)", bias.value());
133 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
134
135 Ok(())
136}
137
138/// Demonstrate learning rate comparison
139fn demonstrate_learning_rate_comparison() -> Result<(), Box<dyn std::error::Error>> {
140 println!("\n--- Learning Rate Comparison ---");
141
142 let learning_rates = [0.001, 0.01, 0.1];
143 let mut results = Vec::new();
144
145 for &lr in &learning_rates {
146 println!("\nTesting learning rate: {}", lr);
147
148 let stats = train_with_config(TrainingConfig {
149 learning_rate: lr,
150 ..Default::default()
151 })?;
152
153 results.push((lr, stats.clone()));
154
155 println!(" Final loss: {:.6}", stats.final_loss);
156 println!(" Convergence epoch: {}", stats.convergence_epoch);
157 }
158
159 // Compare results
160 println!("\nLearning Rate Comparison Summary:");
161 for (lr, stats) in &results {
162 println!(
163 " LR={:6}: Loss={:.6}, Converged@{}",
164 lr, stats.final_loss, stats.convergence_epoch
165 );
166 }
167
168 Ok(())
169}
170
171/// Demonstrate weight decay comparison
172fn demonstrate_weight_decay_comparison() -> Result<(), Box<dyn std::error::Error>> {
173 println!("\n--- Weight Decay Comparison ---");
174
175 let weight_decays = [0.0, 0.001, 0.01];
176 let mut results = Vec::new();
177
178 for &wd in &weight_decays {
179 println!("\nTesting weight decay: {}", wd);
180
181 let stats = train_with_config(TrainingConfig {
182 weight_decay: wd,
183 ..Default::default()
184 })?;
185
186 results.push((wd, stats.clone()));
187
188 println!(" Final loss: {:.6}", stats.final_loss);
189 println!(" Final weight norm: {:.6}", stats.weight_norm);
190 }
191
192 // Compare results
193 println!("\nWeight Decay Comparison Summary:");
194 for (wd, stats) in &results {
195 println!(
196 " WD={:6}: Loss={:.6}, Weight Norm={:.6}",
197 wd, stats.final_loss, stats.weight_norm
198 );
199 }
200
201 Ok(())
202}
203
204/// Demonstrate beta parameter tuning
205fn demonstrate_beta_parameter_tuning() -> Result<(), Box<dyn std::error::Error>> {
206 println!("\n--- Beta Parameter Tuning ---");
207
208 let beta_configs = [
209 (0.9, 0.999), // Default
210 (0.8, 0.999), // More aggressive momentum
211 (0.95, 0.999), // Less aggressive momentum
212 (0.9, 0.99), // Faster second moment decay
213 ];
214
215 let mut results = Vec::new();
216
217 for (i, (beta1, beta2)) in beta_configs.iter().enumerate() {
218 println!(
219 "\nTesting beta configuration {}: beta1={}, beta2={}",
220 i + 1,
221 beta1,
222 beta2
223 );
224
225 let config = TrainingConfig {
226 beta1: *beta1,
227 beta2: *beta2,
228 ..Default::default()
229 };
230
231 let stats = train_with_config(config)?;
232 results.push(((*beta1, *beta2), stats.clone()));
233
234 println!(" Final loss: {:.6}", stats.final_loss);
235 println!(" Convergence epoch: {}", stats.convergence_epoch);
236 }
237
238 // Compare results
239 println!("\nBeta Parameter Comparison Summary:");
240 for ((beta1, beta2), stats) in &results {
241 println!(
242 " B1={:4}, B2={:5}: Loss={:.6}, Converged@{}",
243 beta1, beta2, stats.final_loss, stats.convergence_epoch
244 );
245 }
246
247 Ok(())
248}
249
250/// Demonstrate configuration benchmarking
251fn demonstrate_configuration_benchmarking() -> Result<(), Box<dyn std::error::Error>> {
252 println!("\n--- Configuration Benchmarking ---");
253
254 // Define configurations to benchmark
255 let configs = vec![
256 (
257 "Conservative",
258 TrainingConfig {
259 learning_rate: 0.001,
260 weight_decay: 0.001,
261 beta1: 0.95,
262 ..Default::default()
263 },
264 ),
265 (
266 "Balanced",
267 TrainingConfig {
268 learning_rate: 0.01,
269 weight_decay: 0.0,
270 beta1: 0.9,
271 ..Default::default()
272 },
273 ),
274 (
275 "Aggressive",
276 TrainingConfig {
277 learning_rate: 0.1,
278 weight_decay: 0.0,
279 beta1: 0.8,
280 ..Default::default()
281 },
282 ),
283 ];
284
285 let mut benchmark_results = Vec::new();
286
287 for (name, config) in configs {
288 println!("\nBenchmarking {} configuration:", name);
289
290 let start_time = std::time::Instant::now();
291 let stats = train_with_config(config.clone())?;
292 let elapsed = start_time.elapsed();
293
294 println!(" Training time: {:.2}ms", elapsed.as_millis());
295 println!(" Final loss: {:.6}", stats.final_loss);
296 println!(" Convergence: {} epochs", stats.convergence_epoch);
297
298 benchmark_results.push((name.to_string(), stats, elapsed));
299 }
300
301 // Summary
302 println!("\nBenchmarking Summary:");
303 for (name, stats, elapsed) in &benchmark_results {
304 println!(
305 " {:12}: Loss={:.6}, Time={:4}ms, Converged@{}",
306 name,
307 stats.final_loss,
308 elapsed.as_millis(),
309 stats.convergence_epoch
310 );
311 }
312
313 Ok(())
314}
315
316/// Helper function to train with specific configuration
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318 // Create training data
319 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322 // Create model parameters
323 let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326 // Create optimizer with custom configuration
327 let adam_config = AdamConfig {
328 learning_rate: config.learning_rate,
329 beta1: config.beta1,
330 beta2: config.beta2,
331 eps: 1e-8,
332 weight_decay: config.weight_decay,
333 amsgrad: false,
334 };
335
336 let mut optimizer = Adam::with_config(adam_config);
337 optimizer.add_parameter(&weight);
338 optimizer.add_parameter(&bias);
339
340 // Training loop
341 let mut losses = Vec::new();
342 let mut convergence_epoch = config.epochs;
343
344 for epoch in 0..config.epochs {
345 // Forward pass
346 let y_pred = x_data.matmul(&weight) + &bias;
347 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349 // Backward pass
350 loss.backward(None);
351
352 // Optimizer step
353 optimizer.step(&mut [&mut weight, &mut bias]);
354 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356 let loss_value = loss.value();
357 losses.push(loss_value);
358
359 // Check for convergence (loss < 0.01)
360 if loss_value < 0.01 && convergence_epoch == config.epochs {
361 convergence_epoch = epoch;
362 }
363 }
364
365 Ok(TrainingStats {
366 config,
367 final_loss: losses[losses.len() - 1],
368 loss_history: losses,
369 convergence_epoch,
370 weight_norm: weight.norm().value(),
371 })
372}
319fn train_with_scheduler(
320 scheduler: &mut dyn LearningRateScheduler,
321 num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323 // Create training data: y = 2*x + 1
324 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327 // Create model parameters
328 let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331 // Create optimizer with initial learning rate
332 let mut optimizer = Adam::with_learning_rate(0.05);
333 optimizer.add_parameter(&weight);
334 optimizer.add_parameter(&bias);
335
336 // Training loop
337 let mut losses = Vec::new();
338 let mut lr_history = Vec::new();
339 let mut convergence_epoch = num_epochs;
340
341 for epoch in 0..num_epochs {
342 // Forward pass
343 let y_pred = x_data.matmul(&weight) + &bias;
344 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346 // Backward pass
347 loss.backward(None);
348
349 // Update learning rate using scheduler
350 let current_lr = optimizer.learning_rate();
351 let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353 if (new_lr - current_lr).abs() > 1e-8 {
354 optimizer.set_learning_rate(new_lr);
355 }
356
357 // Optimizer step
358 optimizer.step(&mut [&mut weight, &mut bias]);
359 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361 let loss_value = loss.value();
362 losses.push(loss_value);
363 lr_history.push(new_lr);
364
365 // Check for convergence
366 if loss_value < 0.01 && convergence_epoch == num_epochs {
367 convergence_epoch = epoch;
368 }
369 }
370
371 Ok(TrainingStats {
372 scheduler_name: scheduler.name().to_string(),
373 final_loss: losses[losses.len() - 1],
374 lr_history,
375 loss_history: losses,
376 convergence_epoch,
377 })
378}
431fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
432 println!("\n--- Training Workflow ---");
433
434 // Create a simple classification network
435 let config = FeedForwardConfig {
436 input_size: 2,
437 hidden_sizes: vec![4, 3],
438 output_size: 1,
439 use_bias: true,
440 };
441 let mut network = FeedForwardNetwork::new(config, Some(46));
442
443 println!("Training network: 2 -> [4, 3] -> 1");
444
445 // Create simple binary classification data: XOR problem
446 let x_data = Tensor::from_slice(
447 &[
448 0.0, 0.0, // -> 0
449 0.0, 1.0, // -> 1
450 1.0, 0.0, // -> 1
451 1.0, 1.0, // -> 0
452 ],
453 vec![4, 2],
454 )
455 .unwrap();
456
457 let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
458
459 println!("Training on XOR problem:");
460 println!(" Input shape: {:?}", x_data.shape().dims());
461 println!(" Target shape: {:?}", y_true.shape().dims());
462
463 // Create optimizer
464 let mut optimizer = Adam::with_learning_rate(0.1);
465 let params = network.parameters();
466 for param in &params {
467 optimizer.add_parameter(param);
468 }
469
470 // Training loop
471 let num_epochs = 50;
472 let mut losses = Vec::new();
473
474 for epoch in 0..num_epochs {
475 // Forward pass
476 let y_pred = network.forward(&x_data);
477
478 // Compute loss: MSE
479 let diff = y_pred.sub_tensor(&y_true);
480 let mut loss = diff.pow_scalar(2.0).mean();
481
482 // Backward pass
483 loss.backward(None);
484
485 // Optimizer step and zero grad
486 let mut params = network.parameters();
487 optimizer.step(&mut params);
488 optimizer.zero_grad(&mut params);
489
490 losses.push(loss.value());
491
492 // Print progress
493 if epoch % 10 == 0 || epoch == num_epochs - 1 {
494 println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
495 }
496 }
497
498 // Test final model
499 let final_predictions = network.forward_no_grad(&x_data);
500 println!("\nFinal predictions vs targets:");
501 for i in 0..4 {
502 let pred = final_predictions.data()[i];
503 let target = y_true.data()[i];
504 let input_x = x_data.data()[i * 2];
505 let input_y = x_data.data()[i * 2 + 1];
506 println!(
507 " [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
508 input_x,
509 input_y,
510 pred,
511 target,
512 (pred - target).abs()
513 );
514 }
515
516 Ok(())
517}
518
519/// Demonstrate comprehensive training with 100+ steps
520fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
521 println!("\n--- Comprehensive Training (100+ Steps) ---");
522
523 // Create a regression network
524 let config = FeedForwardConfig {
525 input_size: 3,
526 hidden_sizes: vec![8, 6, 4],
527 output_size: 2,
528 use_bias: true,
529 };
530 let mut network = FeedForwardNetwork::new(config, Some(47));
531
532 println!("Network architecture: 3 -> [8, 6, 4] -> 2");
533 println!("Total parameters: {}", network.parameter_count());
534
535 // Create synthetic regression data
536 // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
537 let num_samples = 32;
538 let mut x_vec = Vec::new();
539 let mut y_vec = Vec::new();
540
541 for i in 0..num_samples {
542 let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
543 let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
544 let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
545
546 let y1 = x1 + 2.0 * x2 - x3;
547 let y2 = x1 * x2 + x3;
548
549 x_vec.extend_from_slice(&[x1, x2, x3]);
550 y_vec.extend_from_slice(&[y1, y2]);
551 }
552
553 let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
554 let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
555
556 println!("Training data:");
557 println!(" {} samples", num_samples);
558 println!(" Input shape: {:?}", x_data.shape().dims());
559 println!(" Target shape: {:?}", y_true.shape().dims());
560
561 // Create optimizer with learning rate scheduling
562 let mut optimizer = Adam::with_learning_rate(0.01);
563 let params = network.parameters();
564 for param in &params {
565 optimizer.add_parameter(param);
566 }
567
568 // Comprehensive training loop (150 epochs)
569 let num_epochs = 150;
570 let mut losses = Vec::new();
571 let mut best_loss = f32::INFINITY;
572 let mut patience_counter = 0;
573 let patience = 20;
574
575 println!("Starting comprehensive training...");
576
577 for epoch in 0..num_epochs {
578 // Forward pass
579 let y_pred = network.forward(&x_data);
580
581 // Compute loss: MSE
582 let diff = y_pred.sub_tensor(&y_true);
583 let mut loss = diff.pow_scalar(2.0).mean();
584
585 // Backward pass
586 loss.backward(None);
587
588 // Optimizer step and zero grad
589 let mut params = network.parameters();
590 optimizer.step(&mut params);
591 optimizer.zero_grad(&mut params);
592
593 let current_loss = loss.value();
594 losses.push(current_loss);
595
596 // Learning rate scheduling
597 if epoch > 0 && epoch % 30 == 0 {
598 let new_lr = optimizer.learning_rate() * 0.8;
599 optimizer.set_learning_rate(new_lr);
600 println!(" Reduced learning rate to {:.4}", new_lr);
601 }
602
603 // Early stopping logic
604 if current_loss < best_loss {
605 best_loss = current_loss;
606 patience_counter = 0;
607 } else {
608 patience_counter += 1;
609 }
610
611 // Print progress
612 if epoch % 25 == 0 || epoch == num_epochs - 1 {
613 println!(
614 "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
615 epoch,
616 current_loss,
617 optimizer.learning_rate(),
618 best_loss
619 );
620 }
621
622 // Early stopping
623 if patience_counter >= patience && epoch > 50 {
624 println!("Early stopping at epoch {} (patience exceeded)", epoch);
625 break;
626 }
627 }
628
629 // Final evaluation
630 let final_predictions = network.forward_no_grad(&x_data);
631
632 // Compute final metrics
633 let final_loss = losses[losses.len() - 1];
634 let initial_loss = losses[0];
635 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
636
637 println!("\nTraining completed!");
638 println!(" Initial loss: {:.6}", initial_loss);
639 println!(" Final loss: {:.6}", final_loss);
640 println!(" Best loss: {:.6}", best_loss);
641 println!(" Loss reduction: {:.1}%", loss_reduction);
642 println!(" Final learning rate: {:.4}", optimizer.learning_rate());
643
644 // Sample predictions analysis
645 println!("\nSample predictions (first 5):");
646 for i in 0..5.min(num_samples) {
647 let pred1 = final_predictions.data()[i * 2];
648 let pred2 = final_predictions.data()[i * 2 + 1];
649 let true1 = y_true.data()[i * 2];
650 let true2 = y_true.data()[i * 2 + 1];
651
652 println!(
653 " Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
654 i + 1,
655 pred1,
656 pred2,
657 true1,
658 true2,
659 (pred1 - true1).abs(),
660 (pred2 - true2).abs()
661 );
662 }
663
664 Ok(())
665}
666
667/// Demonstrate network serialization
668fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
669 println!("\n--- Network Serialization ---");
670
671 // Create and train a network
672 let config = FeedForwardConfig {
673 input_size: 2,
674 hidden_sizes: vec![4, 2],
675 output_size: 1,
676 use_bias: true,
677 };
678 let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
679
680 // Quick training
681 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
682 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
683
684 let mut optimizer = Adam::with_learning_rate(0.01);
685 let params = original_network.parameters();
686 for param in &params {
687 optimizer.add_parameter(param);
688 }
689
690 for _ in 0..20 {
691 let y_pred = original_network.forward(&x_data);
692 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
693 loss.backward(None);
694
695 let mut params = original_network.parameters();
696 optimizer.step(&mut params);
697 optimizer.zero_grad(&mut params);
698 }
699
700 // Test original network
701 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
702 let original_output = original_network.forward_no_grad(&test_input);
703
704 println!("Original network output: {:?}", original_output.data());
705
706 // Save network
707 original_network.save_json("temp_feedforward_network")?;
708
709 // Load network
710 let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
711 let loaded_output = loaded_network.forward_no_grad(&test_input);
712
713 println!("Loaded network output: {:?}", loaded_output.data());
714
715 // Verify consistency
716 let match_check = original_output
717 .data()
718 .iter()
719 .zip(loaded_output.data().iter())
720 .all(|(a, b)| (a - b).abs() < 1e-6);
721
722 println!(
723 "Serialization verification: {}",
724 if match_check { "PASSED" } else { "FAILED" }
725 );
726
727 Ok(())
728}
218fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
219 println!("\n--- Training Loop ---");
220
221 // Create layer and training data
222 let mut layer = LinearLayer::new(2, 1, Some(45));
223
224 // Simple regression task: y = 2*x1 + 3*x2 + 1
225 let x_data = Tensor::from_slice(
226 &[
227 1.0, 1.0, // x1=1, x2=1 -> y=6
228 2.0, 1.0, // x1=2, x2=1 -> y=8
229 1.0, 2.0, // x1=1, x2=2 -> y=9
230 2.0, 2.0, // x1=2, x2=2 -> y=11
231 ],
232 vec![4, 2],
233 )
234 .unwrap();
235
236 let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
237
238 println!("Training data:");
239 println!(" X shape: {:?}", x_data.shape().dims());
240 println!(" Y shape: {:?}", y_true.shape().dims());
241 println!(" Target function: y = 2*x1 + 3*x2 + 1");
242
243 // Create optimizer
244 let config = AdamConfig {
245 learning_rate: 0.01,
246 beta1: 0.9,
247 beta2: 0.999,
248 eps: 1e-8,
249 weight_decay: 0.0,
250 amsgrad: false,
251 };
252
253 let mut optimizer = Adam::with_config(config);
254 let params = layer.parameters();
255 for param in &params {
256 optimizer.add_parameter(param);
257 }
258
259 println!("Optimizer setup complete. Starting training...");
260
261 // Training loop
262 let num_epochs = 100;
263 let mut losses = Vec::new();
264
265 for epoch in 0..num_epochs {
266 // Forward pass
267 let y_pred = layer.forward(&x_data);
268
269 // Compute loss: MSE
270 let diff = y_pred.sub_tensor(&y_true);
271 let mut loss = diff.pow_scalar(2.0).mean();
272
273 // Backward pass
274 loss.backward(None);
275
276 // Optimizer step
277 let mut params = layer.parameters();
278 optimizer.step(&mut params);
279 optimizer.zero_grad(&mut params);
280
281 losses.push(loss.value());
282
283 // Print progress
284 if epoch % 20 == 0 || epoch == num_epochs - 1 {
285 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
286 }
287 }
288
289 // Evaluate final model
290 let final_predictions = layer.forward_no_grad(&x_data);
291
292 println!("\nFinal model evaluation:");
293 println!(" Learned weights: {:?}", layer.weight.data());
294 println!(" Learned bias: {:?}", layer.bias.data());
295 println!(" Target weights: [2.0, 3.0]");
296 println!(" Target bias: [1.0]");
297
298 println!(" Predictions vs True:");
299 for i in 0..4 {
300 let pred = final_predictions.data()[i];
301 let true_val = y_true.data()[i];
302 println!(
303 " Sample {}: pred={:.3}, true={:.1}, error={:.3}",
304 i + 1,
305 pred,
306 true_val,
307 (pred - true_val).abs()
308 );
309 }
310
311 // Training analysis
312 let initial_loss = losses[0];
313 let final_loss = losses[losses.len() - 1];
314 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
315
316 println!("\nTraining Analysis:");
317 println!(" Initial loss: {:.6}", initial_loss);
318 println!(" Final loss: {:.6}", final_loss);
319 println!(" Loss reduction: {:.1}%", loss_reduction);
320
321 Ok(())
322}
323
324/// Demonstrate single vs batch inference
325fn demonstrate_single_vs_batch_inference() {
326 println!("\n--- Single vs Batch Inference ---");
327
328 let layer = LinearLayer::new(4, 3, Some(46));
329
330 // Single inference
331 println!("Single inference:");
332 let single_input = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![1, 4]).unwrap();
333 let single_output = layer.forward_no_grad(&single_input);
334 println!(" Input shape: {:?}", single_input.shape().dims());
335 println!(" Output shape: {:?}", single_output.shape().dims());
336 println!(" Output: {:?}", single_output.data());
337
338 // Batch inference
339 println!("Batch inference:");
340 let batch_input = Tensor::from_slice(
341 &[
342 1.0, 2.0, 3.0, 4.0, // Sample 1
343 5.0, 6.0, 7.0, 8.0, // Sample 2
344 9.0, 10.0, 11.0, 12.0, // Sample 3
345 ],
346 vec![3, 4],
347 )
348 .unwrap();
349 let batch_output = layer.forward_no_grad(&batch_input);
350 println!(" Input shape: {:?}", batch_input.shape().dims());
351 println!(" Output shape: {:?}", batch_output.shape().dims());
352
353 // Verify batch consistency - first sample should match single inference
354 let _first_batch_sample = batch_output.view(vec![3, 3]); // Reshape to access first sample
355 let first_sample_data = &batch_output.data()[0..3]; // First 3 elements
356 let single_sample_data = single_output.data();
357
358 println!("Consistency check:");
359 println!(" Single output: {:?}", single_sample_data);
360 println!(" First batch sample: {:?}", first_sample_data);
361 println!(
362 " Match: {}",
363 single_sample_data
364 .iter()
365 .zip(first_sample_data.iter())
366 .all(|(a, b)| (a - b).abs() < 1e-6)
367 );
368}
369
370/// Demonstrate serialization and loading
371fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
372 println!("\n--- Serialization ---");
373
374 // Create and train a simple layer
375 let mut original_layer = LinearLayer::new(2, 1, Some(47));
376
377 // Simple training data
378 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
379 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
380
381 let mut optimizer = Adam::with_learning_rate(0.01);
382 let params = original_layer.parameters();
383 for param in &params {
384 optimizer.add_parameter(param);
385 }
386
387 // Train for a few epochs
388 for _ in 0..10 {
389 let y_pred = original_layer.forward(&x_data);
390 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
391 loss.backward(None);
392
393 let mut params = original_layer.parameters();
394 optimizer.step(&mut params);
395 optimizer.zero_grad(&mut params);
396 }
397
398 println!("Original layer trained");
399 println!(" Weight: {:?}", original_layer.weight.data());
400 println!(" Bias: {:?}", original_layer.bias.data());
401
402 // Save layer
403 original_layer.save_json("temp_linear_layer")?;
404
405 // Load layer
406 let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
407
408 println!("Loaded layer");
409 println!(" Weight: {:?}", loaded_layer.weight.data());
410 println!(" Bias: {:?}", loaded_layer.bias.data());
411
412 // Verify consistency
413 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
414 let original_output = original_layer.forward_no_grad(&test_input);
415 let loaded_output = loaded_layer.forward_no_grad(&test_input);
416
417 println!("Consistency check:");
418 println!(" Original output: {:?}", original_output.data());
419 println!(" Loaded output: {:?}", loaded_output.data());
420 println!(
421 " Match: {}",
422 original_output
423 .data()
424 .iter()
425 .zip(loaded_output.data().iter())
426 .all(|(a, b)| (a - b).abs() < 1e-6)
427 );
428
429 println!("Serialization verification: PASSED");
430
431 Ok(())
432}
Sourcepub fn add_parameters(&mut self, parameters: &[&Tensor])
pub fn add_parameters(&mut self, parameters: &[&Tensor])
Add multiple parameters to the optimizer
Links multiple parameters to the optimizer by creating parameter states
indexed by each tensor’s ID. All parameters must have requires_grad set to true.
§Arguments
parameters- Slice of references to tensors to link
§Panics
Panics if any parameter does not have requires_grad set to true
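A minimal sketch of linking several parameters in one call instead of repeated add_parameter calls; the shapes and seed below are arbitrary:
use train_station::{Tensor, optimizers::Adam};

// Both tensors must have requires_grad enabled before linking.
let weight = Tensor::randn(vec![4, 2], Some(7)).with_requires_grad();
let bias = Tensor::zeros(vec![2]).with_requires_grad();

let mut optimizer = Adam::new();
// Link both parameters at once; each gets its own momentum/velocity state.
optimizer.add_parameters(&[&weight, &bias]);
assert_eq!(optimizer.parameter_count(), 2);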
Sourcepub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool
pub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool
Sourcepub fn clear_states(&mut self)
pub fn clear_states(&mut self)
Remove all parameter states from the optimizer
Clears all parameter states, effectively unlinking all parameters. This is useful for resetting the optimizer or preparing for parameter re-linking.
Sourcepub fn is_parameter_linked(&self, parameter: &Tensor) -> bool
pub fn is_parameter_linked(&self, parameter: &Tensor) -> bool
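A short sketch combining unlink_parameter, is_parameter_linked, and clear_states for dynamic state management; the shapes and seed are arbitrary:
use train_station::{Tensor, optimizers::Adam};

let weight = Tensor::randn(vec![3, 3], Some(7)).with_requires_grad();
let bias = Tensor::zeros(vec![3]).with_requires_grad();

let mut optimizer = Adam::new();
optimizer.add_parameters(&[&weight, &bias]);
assert!(optimizer.is_parameter_linked(&weight));

// Drop the state for one parameter by its tensor ID.
let _removed = optimizer.unlink_parameter(&bias);
assert!(!optimizer.is_parameter_linked(&bias));

// Remove every remaining parameter state.
optimizer.clear_states();
assert_eq!(optimizer.parameter_count(), 0);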
Sourcepub fn parameter_count(&self) -> usize
pub fn parameter_count(&self) -> usize
Get the number of linked parameters
Returns the count of parameters currently linked to the optimizer.
§Returns
Number of linked parameters
Examples found in repository
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims(),
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims(),
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}
Sourcepub fn relink_parameters(
&mut self,
parameters: &[&Tensor],
) -> Result<(), String>
pub fn relink_parameters( &mut self, parameters: &[&Tensor], ) -> Result<(), String>
Re-link parameters to saved optimizer states in chronological order
After deserializing an optimizer, use this method to restore saved parameter states to new tensors. Parameters must be provided in the same chronological order they were originally added to the optimizer. Shape validation ensures parameter compatibility.
§Arguments
parameters- Slice of parameter references in chronological order
§Returns
Result indicating success or failure with detailed error message
§Panics
Panics if any parameter does not have requires_grad set to true
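A sketch of a JSON round-trip followed by re-linking, assuming the parameters are still available in their original creation order; the shapes and seed are arbitrary:
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;

let weight = Tensor::randn(vec![2, 2], Some(7)).with_requires_grad();
let bias = Tensor::zeros(vec![2]).with_requires_grad();

let mut optimizer = Adam::new();
optimizer.add_parameters(&[&weight, &bias]);

// Round-trip through JSON; states are saved but no longer linked to live tensors.
let json = optimizer.to_json().unwrap();
let mut restored = Adam::from_json(&json).unwrap();
assert_eq!(restored.saved_parameter_count(), 2);

// Re-link in the same chronological order the parameters were first added;
// shapes are validated against the saved states.
restored.relink_parameters(&[&weight, &bias]).unwrap();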
Sourcepub fn config(&self) -> &AdamConfig
pub fn config(&self) -> &AdamConfig
Get the current optimizer configuration
Returns a reference to the current configuration, allowing inspection of all hyperparameters without modification.
§Returns
Reference to the current Adam configuration
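A small sketch of inspecting the active hyperparameters through the returned reference (AdamConfig fields are shown as public, matching the struct literals used in the examples above):
use train_station::optimizers::Adam;

let optimizer = Adam::new();
let config = optimizer.config();
// Read hyperparameters without modifying the optimizer.
println!(
    "lr = {}, beta1 = {}, beta2 = {}, weight_decay = {}",
    config.learning_rate, config.beta1, config.beta2, config.weight_decay
);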
Trait Implementations§
Source§impl FromFieldValue for Adam
impl FromFieldValue for Adam
Source§fn from_field_value(
value: FieldValue,
field_name: &str,
) -> SerializationResult<Self>
fn from_field_value( value: FieldValue, field_name: &str, ) -> SerializationResult<Self>
Source§impl Optimizer for Adam
impl Optimizer for Adam
Source§fn step(&mut self, parameters: &mut [&mut Tensor])
fn step(&mut self, parameters: &mut [&mut Tensor])
Perform a single optimization step
Updates all provided parameters based on their accumulated gradients using the Adam algorithm. Each parameter is updated according to the Adam update rule with bias correction and optional AMSGrad variant if enabled. All parameters must be linked to the optimizer before calling this method.
§Arguments
parameters- Mutable slice of parameter references to update
§Thread Safety
This method is thread-safe as it takes mutable references to parameters, ensuring exclusive access during updates.
§Performance
- Uses SIMD optimization (AVX2) when available for 8x vectorization
- Processes parameters in sequence for optimal cache usage
- Maintains per-parameter state for momentum and velocity estimates
§Panics
Panics if any parameter is not linked to the optimizer
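A minimal single-step sketch; the import path for the Optimizer trait is assumed to be train_station::optimizers::Optimizer, and the shape and seed are arbitrary:
use train_station::{Tensor, optimizers::Adam};
// Assumed path for the Optimizer trait, which provides step and zero_grad.
use train_station::optimizers::Optimizer;

let mut weight = Tensor::randn(vec![2, 2], Some(7)).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);

// One forward/backward/update cycle; zero_grad clears gradients for the next pass.
let mut loss = weight.sum();
loss.backward(None);
optimizer.step(&mut [&mut weight]);
optimizer.zero_grad(&mut [&mut weight]);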
Source§fn zero_grad(&mut self, parameters: &mut [&mut Tensor])
fn zero_grad(&mut self, parameters: &mut [&mut Tensor])
Zero out all parameter gradients
Clears accumulated gradients for all provided parameters. This should be called before each backward pass to prevent gradient accumulation across multiple forward/backward passes. Also clears the global autograd gradient map.
§Arguments
parameters- Mutable slice of parameter references to clear gradients for
§Performance
- Efficiently clears gradients using optimized tensor operations
- Clears both per-tensor gradients and global autograd state
- Thread-safe as it takes mutable references to parameters
Source§fn learning_rate(&self) -> f32
fn learning_rate(&self) -> f32
Get the current learning rate
Returns the current learning rate used for parameter updates.
§Returns
Current learning rate as f32
Source§fn set_learning_rate(&mut self, lr: f32)
fn set_learning_rate(&mut self, lr: f32)
Set the learning rate for all parameters
Updates the learning rate for all parameters in the optimizer. This allows dynamic learning rate scheduling during training.
§Arguments
lr- New learning rate value
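A sketch of a simple step-decay schedule; the Optimizer trait import path is assumed to be train_station::optimizers::Optimizer, and the decay factor and interval are arbitrary:
use train_station::optimizers::Adam;
// Assumed path for the Optimizer trait, which provides learning_rate/set_learning_rate.
use train_station::optimizers::Optimizer;

let mut optimizer = Adam::with_learning_rate(0.01);
for epoch in 1..=90 {
    // ... forward, backward, step, zero_grad ...
    if epoch % 30 == 0 {
        // Decay the learning rate by 20% every 30 epochs.
        let new_lr = optimizer.learning_rate() * 0.8;
        optimizer.set_learning_rate(new_lr);
    }
}
assert!(optimizer.learning_rate() < 0.01);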
Source§impl Serializable for Adam
impl Serializable for Adam
Source§fn to_json(&self) -> SerializationResult<String>
fn to_json(&self) -> SerializationResult<String>
Serialize the Adam optimizer to JSON format
This method converts the Adam optimizer into a human-readable JSON string representation that includes all optimizer state, configuration, parameter states, and step counts. The JSON format is suitable for debugging, configuration files, and cross-language interoperability.
§Returns
JSON string representation of the optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let json = optimizer.to_json().unwrap();
assert!(!json.is_empty());
Source§fn from_json(json: &str) -> SerializationResult<Self>
fn from_json(json: &str) -> SerializationResult<Self>
Deserialize an Adam optimizer from JSON format
This method parses a JSON string and reconstructs an Adam optimizer with all
saved state. Parameters must be re-linked after deserialization using
add_parameter or relink_parameters.
§Arguments
json- JSON string containing serialized optimizer
§Returns
The deserialized optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§fn to_binary(&self) -> SerializationResult<Vec<u8>>
fn to_binary(&self) -> SerializationResult<Vec<u8>>
Serialize the Adam optimizer to binary format
This method converts the optimizer into a compact binary representation optimized for storage and transmission. The binary format is faster to read and write and produces smaller files than JSON, at the cost of human readability.
§Returns
Binary representation of the optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let binary = optimizer.to_binary().unwrap();
assert!(!binary.is_empty());
Source§fn from_binary(data: &[u8]) -> SerializationResult<Self>
fn from_binary(data: &[u8]) -> SerializationResult<Self>
Deserialize an Adam optimizer from binary format
This method parses binary data and reconstructs an Adam optimizer with all
saved state. Parameters must be re-linked after deserialization using
add_parameter or relink_parameters.
§Arguments
data- Binary data containing serialized optimizer
§Returns
The deserialized optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let binary = optimizer.to_binary().unwrap();
let loaded_optimizer = Adam::from_binary(&binary).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§fn save<P: AsRef<Path>>(
&self,
path: P,
format: Format,
) -> SerializationResult<()>
fn save<P: AsRef<Path>>( &self, path: P, format: Format, ) -> SerializationResult<()>
Source§fn save_to_writer<W: Write>(
&self,
writer: &mut W,
format: Format,
) -> SerializationResult<()>
fn save_to_writer<W: Write>( &self, writer: &mut W, format: Format, ) -> SerializationResult<()>
Source§fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>
fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>
Source§fn load_from_reader<R: Read>(
reader: &mut R,
format: Format,
) -> SerializationResult<Self>
fn load_from_reader<R: Read>( reader: &mut R, format: Format, ) -> SerializationResult<Self>
Source§impl StructSerializable for Adam
impl StructSerializable for Adam
Source§fn to_serializer(&self) -> StructSerializer
fn to_serializer(&self) -> StructSerializer
Convert Adam to StructSerializer for serialization
Serializes all optimizer state including configuration, parameter states, and global step count. Parameter linking is not serialized and must be done after deserialization.
§Returns
StructSerializer containing all serializable optimizer state
Source§fn from_deserializer(
deserializer: &mut StructDeserializer,
) -> SerializationResult<Self>
fn from_deserializer( deserializer: &mut StructDeserializer, ) -> SerializationResult<Self>
Create Adam from StructDeserializer
Reconstructs Adam optimizer from serialized state. Parameters must be
linked separately using add_parameter or add_parameters.
§Arguments
deserializer- StructDeserializer containing optimizer data
§Returns
Reconstructed Adam instance without parameter links, or error if deserialization fails