pub struct Adam { /* private fields */ }
Adam optimizer for neural network parameter optimization
Implements the Adam optimization algorithm with a PyTorch-compatible interface. The optimizer maintains per-parameter state for momentum and velocity estimates, providing adaptive per-parameter learning rates that improve convergence across diverse architectures.
§Usage Pattern
The optimizer uses ID-based parameter linking for maximum flexibility and thread safety:
- Parameters are linked to the optimizer via add_parameter or add_parameters
- The step method takes mutable references to parameters for thread-safe updates
- Parameter states are maintained by tensor ID, allowing for dynamic parameter management
- Supports serialization and deserialization with parameter re-linking
§Dynamic Parameter Management
Parameters can be added, removed, or re-linked at runtime:
- add_parameter: Link a single parameter
- add_parameters: Link multiple parameters at once
- unlink_parameter: Remove parameter state by ID
- clear_states: Remove all parameter states
- is_parameter_linked: Check if a parameter is linked
§Serialization Support
The optimizer supports full serialization and deserialization with state preservation:
- Parameter states are saved with their shapes and insertion order for validation
- After deserialization, use relink_parameters to restore saved states to new tensors
- Parameters must be re-linked in the same chronological order they were originally added
- Shape validation ensures consistency between saved and current parameters
§Features
- ID-Based Parameter Linking: Dynamic parameter management via tensor IDs
- Thread-Safe Step Method: Takes mutable references for safe concurrent access
- Per-Parameter State: Each parameter maintains its own momentum and velocity buffers
- Bias Correction: Automatically corrects initialization bias in moment estimates
- Weight Decay: Optional L2 regularization with efficient implementation
- AMSGrad Support: Optional AMSGrad variant for improved convergence stability
- SIMD Optimization: AVX2-optimized updates for maximum performance
- Full Serialization: Complete state persistence and restoration
§Thread Safety
This type is thread-safe and can be shared between threads. The step method takes mutable references to parameters, ensuring exclusive access during updates.
Implementations§
impl Adam
pub fn saved_parameter_count(&self) -> usize
Get the number of saved parameter states for checkpoint validation
This method returns the count of parameter states currently stored in the optimizer, which is essential for validating checkpoint integrity and ensuring proper parameter re-linking after deserialization. The count includes all parameters that have been linked to the optimizer and have accumulated optimization state.
§Returns
Number of parameter states currently stored in the optimizer
§Usage Patterns
§Checkpoint Validation
After deserializing an optimizer, this method helps verify that the expected number of parameters were saved and can guide the re-linking process.
§Training Resumption
When resuming training, compare this count with the number of parameters in your model to ensure checkpoint compatibility.
§State Management
Use this method to monitor optimizer state growth and memory usage during training with dynamic parameter addition.
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![10, 5]).with_requires_grad();
let bias = Tensor::zeros(vec![5]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
optimizer.add_parameter(&bias);
// Check parameter count before serialization
assert_eq!(optimizer.saved_parameter_count(), 2);
// Serialize and deserialize
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
// Verify parameter count is preserved
assert_eq!(loaded_optimizer.saved_parameter_count(), 2);

§Performance
- Time Complexity: O(1) - Direct access to internal state count
- Memory Usage: No additional memory allocation
- Thread Safety: Safe to call from multiple threads concurrently
impl Adam
pub fn new() -> Self
Create a new Adam optimizer with default configuration
Initializes an Adam optimizer with PyTorch-compatible default hyperparameters.
Parameters must be linked separately using add_parameter or add_parameters.
§Returns
A new Adam optimizer instance with default hyperparameters
Examples found in repository
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims,
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims,
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}

More examples
84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85 println!("--- Default Adam Configuration ---");
86
87 // Create a simple regression problem: y = 2*x + 1
88 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91 // Create model parameters
92 let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95 // Create Adam optimizer with default configuration
96 let mut optimizer = Adam::new();
97 optimizer.add_parameter(&weight);
98 optimizer.add_parameter(&bias);
99
100 println!("Default Adam configuration:");
101 println!(" Learning rate: {}", optimizer.learning_rate());
102 println!(" Initial weight: {:.6}", weight.value());
103 println!(" Initial bias: {:.6}", bias.value());
104
105 // Training loop
106 let num_epochs = 50;
107 let mut losses = Vec::new();
108
109 for epoch in 0..num_epochs {
110 // Forward pass
111 let y_pred = x_data.matmul(&weight) + &bias;
112 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114 // Backward pass
115 loss.backward(None);
116
117 // Optimizer step
118 optimizer.step(&mut [&mut weight, &mut bias]);
119 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121 losses.push(loss.value());
122
123 if epoch % 10 == 0 || epoch == num_epochs - 1 {
124 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125 }
126 }
127
128 // Evaluate final model
129 let _final_predictions = x_data.matmul(&weight) + &bias;
130 println!("\nFinal model:");
131 println!(" Learned weight: {:.6} (target: 2.0)", weight.value());
132 println!(" Learned bias: {:.6} (target: 1.0)", bias.value());
133 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
134
135 Ok(())
136}

pub fn with_config(config: AdamConfig) -> Self
Create a new Adam optimizer with custom configuration
Allows full control over all Adam hyperparameters for specialized training
scenarios such as fine-tuning, transfer learning, or research applications.
Parameters must be linked separately using add_parameter or add_parameters.
§Arguments
config - Adam configuration with custom hyperparameters
§Returns
A new Adam optimizer instance with the specified configuration
Examples found in repository
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims,
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims,
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}

More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}

317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318 // Create training data
319 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322 // Create model parameters
323 let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326 // Create optimizer with custom configuration
327 let adam_config = AdamConfig {
328 learning_rate: config.learning_rate,
329 beta1: config.beta1,
330 beta2: config.beta2,
331 eps: 1e-8,
332 weight_decay: config.weight_decay,
333 amsgrad: false,
334 };
335
336 let mut optimizer = Adam::with_config(adam_config);
337 optimizer.add_parameter(&weight);
338 optimizer.add_parameter(&bias);
339
340 // Training loop
341 let mut losses = Vec::new();
342 let mut convergence_epoch = config.epochs;
343
344 for epoch in 0..config.epochs {
345 // Forward pass
346 let y_pred = x_data.matmul(&weight) + &bias;
347 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349 // Backward pass
350 loss.backward(None);
351
352 // Optimizer step
353 optimizer.step(&mut [&mut weight, &mut bias]);
354 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356 let loss_value = loss.value();
357 losses.push(loss_value);
358
359 // Check for convergence (loss < 0.01)
360 if loss_value < 0.01 && convergence_epoch == config.epochs {
361 convergence_epoch = epoch;
362 }
363 }
364
365 Ok(TrainingStats {
366 config,
367 final_loss: losses[losses.len() - 1],
368 loss_history: losses,
369 convergence_epoch,
370 weight_norm: weight.norm().value(),
371 })
372}

217fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
218 println!("\n--- Training Loop ---");
219
220 // Create layer and training data
221 let mut layer = LinearLayer::new(2, 1, Some(45));
222
223 // Simple regression task: y = 2*x1 + 3*x2 + 1
224 let x_data = Tensor::from_slice(
225 &[
226 1.0, 1.0, // x1=1, x2=1 -> y=6
227 2.0, 1.0, // x1=2, x2=1 -> y=8
228 1.0, 2.0, // x1=1, x2=2 -> y=9
229 2.0, 2.0, // x1=2, x2=2 -> y=11
230 ],
231 vec![4, 2],
232 )
233 .unwrap();
234
235 let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
236
237 println!("Training data:");
238 println!(" X shape: {:?}", x_data.shape().dims);
239 println!(" Y shape: {:?}", y_true.shape().dims);
240 println!(" Target function: y = 2*x1 + 3*x2 + 1");
241
242 // Create optimizer
243 let config = AdamConfig {
244 learning_rate: 0.01,
245 beta1: 0.9,
246 beta2: 0.999,
247 eps: 1e-8,
248 weight_decay: 0.0,
249 amsgrad: false,
250 };
251
252 let mut optimizer = Adam::with_config(config);
253 let params = layer.parameters();
254        for param in &params {
255 optimizer.add_parameter(param);
256 }
257
258 println!("Optimizer setup complete. Starting training...");
259
260 // Training loop
261 let num_epochs = 100;
262 let mut losses = Vec::new();
263
264 for epoch in 0..num_epochs {
265 // Forward pass
266 let y_pred = layer.forward(&x_data);
267
268 // Compute loss: MSE
269 let diff = y_pred.sub_tensor(&y_true);
270 let mut loss = diff.pow_scalar(2.0).mean();
271
272 // Backward pass
273 loss.backward(None);
274
275 // Optimizer step
276 let mut params = layer.parameters();
277 optimizer.step(&mut params);
278 optimizer.zero_grad(&mut params);
279
280 losses.push(loss.value());
281
282 // Print progress
283 if epoch % 20 == 0 || epoch == num_epochs - 1 {
284 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
285 }
286 }
287
288 // Evaluate final model
289 let final_predictions = layer.forward_no_grad(&x_data);
290
291 println!("\nFinal model evaluation:");
292 println!(" Learned weights: {:?}", layer.weight.data());
293 println!(" Learned bias: {:?}", layer.bias.data());
294 println!(" Target weights: [2.0, 3.0]");
295 println!(" Target bias: [1.0]");
296
297 println!(" Predictions vs True:");
298 for i in 0..4 {
299 let pred = final_predictions.data()[i];
300 let true_val = y_true.data()[i];
301 println!(
302 " Sample {}: pred={:.3}, true={:.1}, error={:.3}",
303 i + 1,
304 pred,
305 true_val,
306 (pred - true_val).abs()
307 );
308 }
309
310 // Training analysis
311 let initial_loss = losses[0];
312 let final_loss = losses[losses.len() - 1];
313 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
314
315 println!("\nTraining Analysis:");
316 println!(" Initial loss: {:.6}", initial_loss);
317 println!(" Final loss: {:.6}", final_loss);
318 println!(" Loss reduction: {:.1}%", loss_reduction);
319
320 Ok(())
321}

pub fn with_learning_rate(learning_rate: f32) -> Self
Create a new Adam optimizer with custom learning rate
A convenience constructor that allows setting only the learning rate while
using default values for all other hyperparameters. Parameters must be
linked separately using add_parameter or add_parameters.
§Arguments
learning_rate - Learning rate for optimization
§Returns
A new Adam optimizer instance with the specified learning rate and default values for all other hyperparameters
Examples found in repository
319fn train_with_scheduler(
320 scheduler: &mut dyn LearningRateScheduler,
321 num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323 // Create training data: y = 2*x + 1
324 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327 // Create model parameters
328 let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331 // Create optimizer with initial learning rate
332 let mut optimizer = Adam::with_learning_rate(0.05);
333 optimizer.add_parameter(&weight);
334 optimizer.add_parameter(&bias);
335
336 // Training loop
337 let mut losses = Vec::new();
338 let mut lr_history = Vec::new();
339 let mut convergence_epoch = num_epochs;
340
341 for epoch in 0..num_epochs {
342 // Forward pass
343 let y_pred = x_data.matmul(&weight) + &bias;
344 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346 // Backward pass
347 loss.backward(None);
348
349 // Update learning rate using scheduler
350 let current_lr = optimizer.learning_rate();
351 let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353 if (new_lr - current_lr).abs() > 1e-8 {
354 optimizer.set_learning_rate(new_lr);
355 }
356
357 // Optimizer step
358 optimizer.step(&mut [&mut weight, &mut bias]);
359 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361 let loss_value = loss.value();
362 losses.push(loss_value);
363 lr_history.push(new_lr);
364
365 // Check for convergence
366 if loss_value < 0.01 && convergence_epoch == num_epochs {
367 convergence_epoch = epoch;
368 }
369 }
370
371 Ok(TrainingStats {
372 scheduler_name: scheduler.name().to_string(),
373 final_loss: losses[losses.len() - 1],
374 lr_history,
375 loss_history: losses,
376 convergence_epoch,
377 })
378}

More examples
370fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
371 println!("\n--- Serialization ---");
372
373 // Create and train a simple layer
374 let mut original_layer = LinearLayer::new(2, 1, Some(47));
375
376 // Simple training data
377 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
378 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
379
380 let mut optimizer = Adam::with_learning_rate(0.01);
381 let params = original_layer.parameters();
382    for param in &params {
383 optimizer.add_parameter(param);
384 }
385
386 // Train for a few epochs
387 for _ in 0..10 {
388 let y_pred = original_layer.forward(&x_data);
389 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
390 loss.backward(None);
391
392 let mut params = original_layer.parameters();
393 optimizer.step(&mut params);
394 optimizer.zero_grad(&mut params);
395 }
396
397 println!("Original layer trained");
398 println!(" Weight: {:?}", original_layer.weight.data());
399 println!(" Bias: {:?}", original_layer.bias.data());
400
401 // Save layer
402 original_layer.save_json("temp_linear_layer")?;
403
404 // Load layer
405 let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
406
407 println!("Loaded layer");
408 println!(" Weight: {:?}", loaded_layer.weight.data());
409 println!(" Bias: {:?}", loaded_layer.bias.data());
410
411 // Verify consistency
412 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
413 let original_output = original_layer.forward_no_grad(&test_input);
414 let loaded_output = loaded_layer.forward_no_grad(&test_input);
415
416 println!("Consistency check:");
417 println!(" Original output: {:?}", original_output.data());
418 println!(" Loaded output: {:?}", loaded_output.data());
419 println!(
420 " Match: {}",
421 original_output
422 .data()
423 .iter()
424 .zip(loaded_output.data().iter())
425 .all(|(a, b)| (a - b).abs() < 1e-6)
426 );
427
428 println!("Serialization verification: PASSED");
429
430 Ok(())
431}

204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205 println!("\n--- Model Checkpointing ---");
206
207 // Create a simple model (weights and bias)
208 let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209 let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211 // Create optimizer
212 let mut optimizer = Adam::with_learning_rate(0.01);
213 optimizer.add_parameter(&weights);
214 optimizer.add_parameter(&bias);
215
216 println!("Initial weights: {:?}", weights.data());
217 println!("Initial bias: {:?}", bias.data());
218
219 // Simulate training
220 for epoch in 0..5 {
221 let mut loss = weights.sum() + bias.sum();
222 loss.backward(None);
223 optimizer.step(&mut [&mut weights, &mut bias]);
224 optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226 if epoch % 2 == 0 {
227 // Save checkpoint
228 let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229 fs::create_dir_all(&checkpoint_dir)?;
230
231 weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232 bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233 optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235 println!("Saved checkpoint for epoch {}", epoch);
236 }
237 }
238
239 // Load from checkpoint
240 let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241 let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242 let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244 println!("Loaded weights: {:?}", loaded_weights.data());
245 println!("Loaded bias: {:?}", loaded_bias.data());
246 println!(
247 "Loaded optimizer learning rate: {}",
248 loaded_optimizer.learning_rate()
249 );
250
251 // Verify checkpoint integrity
252 assert_eq!(weights.shape().dims, loaded_weights.shape().dims);
253 assert_eq!(bias.shape().dims, loaded_bias.shape().dims);
254 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256 println!("Checkpointing verification: PASSED");
257
258 Ok(())
259}

105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106 println!("\n--- Linear Regression Training ---");
107
108 // Create model parameters
109 let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112 // Create optimizer
113 let mut optimizer = Adam::with_learning_rate(0.01);
114 optimizer.add_parameter(&weight);
115 optimizer.add_parameter(&bias);
116
117 // Create simple training data: y = 2*x + 1
118 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121 println!("Training data:");
122 println!(" X: {:?}", x_data.data());
123 println!(" Y: {:?}", y_true.data());
124 println!(" Target: y = 2*x + 1");
125
126 // Training loop
127 let num_epochs = 100;
128 let mut losses = Vec::new();
129
130 for epoch in 0..num_epochs {
131 // Forward pass: y_pred = x * weight + bias
132 let y_pred = x_data.matmul(&weight) + &bias;
133
134 // Compute loss: MSE
135 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137 // Backward pass
138 loss.backward(None);
139
140 // Optimizer step
141 optimizer.step(&mut [&mut weight, &mut bias]);
142 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144 losses.push(loss.value());
145
146 // Print progress every 20 epochs
147 if epoch % 20 == 0 || epoch == num_epochs - 1 {
148 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149 }
150 }
151
152 // Evaluate final model
153 let final_predictions = x_data.matmul(&weight) + &bias;
154 println!("\nFinal model evaluation:");
155 println!(" Learned weight: {:.6}", weight.value());
156 println!(" Learned bias: {:.6}", bias.value());
157 println!(" Predictions vs True:");
158
159 for i in 0..5 {
160 let x1 = x_data.data()[i];
161 let pred = final_predictions.data()[i];
162 let true_val = y_true.data()[i];
163 println!(
164 " x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165 x1,
166 pred,
167 true_val,
168 (pred - true_val).abs()
169 );
170 }
171
172 Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177 println!("\n--- Advanced Training Patterns ---");
178
179 // Create a more complex model
180 let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181 let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183 // Create optimizer with different learning rate
184 let mut optimizer = Adam::with_learning_rate(0.005);
185 optimizer.add_parameter(&weight);
186 optimizer.add_parameter(&bias);
187
188 // Create training data: y = 2*x + [1, 3]
189 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190 let y_true = Tensor::from_slice(
191 &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192 vec![5, 2],
193 )
194 .unwrap();
195
196 println!("Advanced training with monitoring:");
197 println!(" Initial learning rate: {}", optimizer.learning_rate());
198
199 // Training loop with monitoring
200 let num_epochs = 50;
201 let mut losses = Vec::new();
202 let mut weight_norms = Vec::new();
203 let mut gradient_norms = Vec::new();
204
205 for epoch in 0..num_epochs {
206 // Forward pass
207 let y_pred = x_data.matmul(&weight) + &bias;
208 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210 // Backward pass
211 loss.backward(None);
212
213 // Compute gradient norm before optimizer step
214 let gradient_norm = weight.grad_by_value().unwrap().norm();
215
216 // Optimizer step
217 optimizer.step(&mut [&mut weight, &mut bias]);
218 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220 // Learning rate scheduling: reduce every 10 epochs
221 if epoch > 0 && epoch % 10 == 0 {
222 let current_lr = optimizer.learning_rate();
223 let new_lr = current_lr * 0.5;
224 optimizer.set_learning_rate(new_lr);
225 println!(
226 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227 epoch, current_lr, new_lr
228 );
229 }
230
231 // Record metrics
232 losses.push(loss.value());
233 weight_norms.push(weight.norm().value());
234 gradient_norms.push(gradient_norm.value());
235
236 // Print detailed progress
237 if epoch % 10 == 0 || epoch == num_epochs - 1 {
238 println!(
239 "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240 epoch,
241 loss.value(),
242 weight.norm().value(),
243 gradient_norm.value()
244 );
245 }
246 }
247
248 println!("Final learning rate: {}", optimizer.learning_rate());
249
250 // Analyze training progression
251 let initial_loss = losses[0];
252 let final_loss = losses[losses.len() - 1];
253 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255 println!("\nTraining Analysis:");
256 println!(" Initial loss: {:.6}", initial_loss);
257 println!(" Final loss: {:.6}", final_loss);
258 println!(" Loss reduction: {:.1}%", loss_reduction);
259 println!(" Final weight norm: {:.6}", weight.norm().value());
260 println!(" Final bias: {:?}", bias.data());
261
262 Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267 println!("\n--- Learning Rate Scheduling ---");
268
269 // Create simple model
270 let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273 // Create optimizer with high initial learning rate
274 let mut optimizer = Adam::with_learning_rate(0.1);
275 optimizer.add_parameter(&weight);
276 optimizer.add_parameter(&bias);
277
278 // Simple data
279 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280 let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282 println!("Initial learning rate: {}", optimizer.learning_rate());
283
284 // Training loop with learning rate scheduling
285 let num_epochs = 50;
286 let mut losses = Vec::new();
287
288 for epoch in 0..num_epochs {
289 // Forward pass
290 let y_pred = x_data.matmul(&weight) + &bias;
291 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293 // Backward pass
294 loss.backward(None);
295
296 // Optimizer step
297 optimizer.step(&mut [&mut weight, &mut bias]);
298 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300 // Learning rate scheduling: reduce every 10 epochs
301 if epoch > 0 && epoch % 10 == 0 {
302 let current_lr = optimizer.learning_rate();
303 let new_lr = current_lr * 0.5;
304 optimizer.set_learning_rate(new_lr);
305 println!(
306 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307 epoch, current_lr, new_lr
308 );
309 }
310
311 losses.push(loss.value());
312
313 // Print progress
314 if epoch % 10 == 0 || epoch == num_epochs - 1 {
315 println!(
316 "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317 epoch,
318 loss.value(),
319 optimizer.learning_rate()
320 );
321 }
322 }
323
324 println!("Final learning rate: {}", optimizer.learning_rate());
325
326 Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331 println!("\n--- Training Monitoring ---");
332
333 // Create model
334 let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337 // Create optimizer
338 let mut optimizer = Adam::with_learning_rate(0.01);
339 optimizer.add_parameter(&weight);
340 optimizer.add_parameter(&bias);
341
342 // Training data
343 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346 // Training loop with comprehensive monitoring
347 let num_epochs = 30;
348 let mut losses = Vec::new();
349 let mut weight_history = Vec::new();
350 let mut bias_history = Vec::new();
351
352 for epoch in 0..num_epochs {
353 // Forward pass
354 let y_pred = x_data.matmul(&weight) + &bias;
355 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357 // Backward pass
358 loss.backward(None);
359
360 // Optimizer step
361 optimizer.step(&mut [&mut weight, &mut bias]);
362 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364 // Record history
365 losses.push(loss.value());
366 weight_history.push(weight.value());
367 bias_history.push(bias.value());
368
369 // Print detailed monitoring
370 if epoch % 5 == 0 || epoch == num_epochs - 1 {
371 println!(
372 "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373 epoch,
374 loss.value(),
375 weight.value(),
376 bias.value()
377 );
378 }
379 }
380
381 // Analyze training progression
382 println!("\nTraining Analysis:");
383 println!(" Initial loss: {:.6}", losses[0]);
384 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
385 println!(
386 " Loss reduction: {:.1}%",
387 (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388 );
389
390 // Compute statistics
391 let loss_mean = compute_mean(&losses);
392 let loss_std = compute_std(&losses);
393 let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394 let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396 println!(" Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397 println!(" Weight change: {:.6}", weight_change);
398 println!(" Bias change: {:.6}", bias_change);
399 println!(" Final weight norm: {:.6}", weight.norm().value());
400 println!(" Final bias: {:.6}", bias.value());
401
402 Ok(())
403}

430fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
431 println!("\n--- Training Workflow ---");
432
433 // Create a simple classification network
434 let config = FeedForwardConfig {
435 input_size: 2,
436 hidden_sizes: vec![4, 3],
437 output_size: 1,
438 use_bias: true,
439 };
440 let mut network = FeedForwardNetwork::new(config, Some(46));
441
442 println!("Training network: 2 -> [4, 3] -> 1");
443
444 // Create simple binary classification data: XOR problem
445 let x_data = Tensor::from_slice(
446 &[
447 0.0, 0.0, // -> 0
448 0.0, 1.0, // -> 1
449 1.0, 0.0, // -> 1
450 1.0, 1.0, // -> 0
451 ],
452 vec![4, 2],
453 )
454 .unwrap();
455
456 let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
457
458 println!("Training on XOR problem:");
459 println!(" Input shape: {:?}", x_data.shape().dims);
460 println!(" Target shape: {:?}", y_true.shape().dims);
461
462 // Create optimizer
463 let mut optimizer = Adam::with_learning_rate(0.1);
464 let params = network.parameters();
465     for param in &params {
466 optimizer.add_parameter(param);
467 }
468
469 // Training loop
470 let num_epochs = 50;
471 let mut losses = Vec::new();
472
473 for epoch in 0..num_epochs {
474 // Forward pass
475 let y_pred = network.forward(&x_data);
476
477 // Compute loss: MSE
478 let diff = y_pred.sub_tensor(&y_true);
479 let mut loss = diff.pow_scalar(2.0).mean();
480
481 // Backward pass
482 loss.backward(None);
483
484 // Optimizer step and zero grad
485 let mut params = network.parameters();
486 optimizer.step(&mut params);
487 optimizer.zero_grad(&mut params);
488
489 losses.push(loss.value());
490
491 // Print progress
492 if epoch % 10 == 0 || epoch == num_epochs - 1 {
493 println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
494 }
495 }
496
497 // Test final model
498 let final_predictions = network.forward_no_grad(&x_data);
499 println!("\nFinal predictions vs targets:");
500 for i in 0..4 {
501 let pred = final_predictions.data()[i];
502 let target = y_true.data()[i];
503 let input_x = x_data.data()[i * 2];
504 let input_y = x_data.data()[i * 2 + 1];
505 println!(
506 " [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
507 input_x,
508 input_y,
509 pred,
510 target,
511 (pred - target).abs()
512 );
513 }
514
515 Ok(())
516}
517
518/// Demonstrate comprehensive training with 100+ steps
519fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
520 println!("\n--- Comprehensive Training (100+ Steps) ---");
521
522 // Create a regression network
523 let config = FeedForwardConfig {
524 input_size: 3,
525 hidden_sizes: vec![8, 6, 4],
526 output_size: 2,
527 use_bias: true,
528 };
529 let mut network = FeedForwardNetwork::new(config, Some(47));
530
531 println!("Network architecture: 3 -> [8, 6, 4] -> 2");
532 println!("Total parameters: {}", network.parameter_count());
533
534 // Create synthetic regression data
535 // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
536 let num_samples = 32;
537 let mut x_vec = Vec::new();
538 let mut y_vec = Vec::new();
539
540 for i in 0..num_samples {
541 let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
542 let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
543 let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
544
545 let y1 = x1 + 2.0 * x2 - x3;
546 let y2 = x1 * x2 + x3;
547
548 x_vec.extend_from_slice(&[x1, x2, x3]);
549 y_vec.extend_from_slice(&[y1, y2]);
550 }
551
552 let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
553 let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
554
555 println!("Training data:");
556 println!(" {} samples", num_samples);
557 println!(" Input shape: {:?}", x_data.shape().dims);
558 println!(" Target shape: {:?}", y_true.shape().dims);
559
560 // Create optimizer with learning rate scheduling
561 let mut optimizer = Adam::with_learning_rate(0.01);
562 let params = network.parameters();
563     for param in &params {
564 optimizer.add_parameter(param);
565 }
566
567 // Comprehensive training loop (150 epochs)
568 let num_epochs = 150;
569 let mut losses = Vec::new();
570 let mut best_loss = f32::INFINITY;
571 let mut patience_counter = 0;
572 let patience = 20;
573
574 println!("Starting comprehensive training...");
575
576 for epoch in 0..num_epochs {
577 // Forward pass
578 let y_pred = network.forward(&x_data);
579
580 // Compute loss: MSE
581 let diff = y_pred.sub_tensor(&y_true);
582 let mut loss = diff.pow_scalar(2.0).mean();
583
584 // Backward pass
585 loss.backward(None);
586
587 // Optimizer step and zero grad
588 let mut params = network.parameters();
589 optimizer.step(&mut params);
590 optimizer.zero_grad(&mut params);
591
592 let current_loss = loss.value();
593 losses.push(current_loss);
594
595 // Learning rate scheduling
596 if epoch > 0 && epoch % 30 == 0 {
597 let new_lr = optimizer.learning_rate() * 0.8;
598 optimizer.set_learning_rate(new_lr);
599 println!(" Reduced learning rate to {:.4}", new_lr);
600 }
601
602 // Early stopping logic
603 if current_loss < best_loss {
604 best_loss = current_loss;
605 patience_counter = 0;
606 } else {
607 patience_counter += 1;
608 }
609
610 // Print progress
611 if epoch % 25 == 0 || epoch == num_epochs - 1 {
612 println!(
613 "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
614 epoch,
615 current_loss,
616 optimizer.learning_rate(),
617 best_loss
618 );
619 }
620
621 // Early stopping
622 if patience_counter >= patience && epoch > 50 {
623 println!("Early stopping at epoch {} (patience exceeded)", epoch);
624 break;
625 }
626 }
627
628 // Final evaluation
629 let final_predictions = network.forward_no_grad(&x_data);
630
631 // Compute final metrics
632 let final_loss = losses[losses.len() - 1];
633 let initial_loss = losses[0];
634 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
635
636 println!("\nTraining completed!");
637 println!(" Initial loss: {:.6}", initial_loss);
638 println!(" Final loss: {:.6}", final_loss);
639 println!(" Best loss: {:.6}", best_loss);
640 println!(" Loss reduction: {:.1}%", loss_reduction);
641 println!(" Final learning rate: {:.4}", optimizer.learning_rate());
642
643 // Sample predictions analysis
644 println!("\nSample predictions (first 5):");
645 for i in 0..5.min(num_samples) {
646 let pred1 = final_predictions.data()[i * 2];
647 let pred2 = final_predictions.data()[i * 2 + 1];
648 let true1 = y_true.data()[i * 2];
649 let true2 = y_true.data()[i * 2 + 1];
650
651 println!(
652 " Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
653 i + 1,
654 pred1,
655 pred2,
656 true1,
657 true2,
658 (pred1 - true1).abs(),
659 (pred2 - true2).abs()
660 );
661 }
662
663 Ok(())
664}
665
666/// Demonstrate network serialization
667fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
668 println!("\n--- Network Serialization ---");
669
670 // Create and train a network
671 let config = FeedForwardConfig {
672 input_size: 2,
673 hidden_sizes: vec![4, 2],
674 output_size: 1,
675 use_bias: true,
676 };
677 let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
678
679 // Quick training
680 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
681 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
682
683 let mut optimizer = Adam::with_learning_rate(0.01);
684 let params = original_network.parameters();
685     for param in &params {
686 optimizer.add_parameter(param);
687 }
688
689 for _ in 0..20 {
690 let y_pred = original_network.forward(&x_data);
691 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
692 loss.backward(None);
693
694 let mut params = original_network.parameters();
695 optimizer.step(&mut params);
696 optimizer.zero_grad(&mut params);
697 }
698
699 // Test original network
700 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
701 let original_output = original_network.forward_no_grad(&test_input);
702
703 println!("Original network output: {:?}", original_output.data());
704
705 // Save network
706 original_network.save_json("temp_feedforward_network")?;
707
708 // Load network
709 let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
710 let loaded_output = loaded_network.forward_no_grad(&test_input);
711
712 println!("Loaded network output: {:?}", loaded_output.data());
713
714 // Verify consistency
715 let match_check = original_output
716 .data()
717 .iter()
718 .zip(loaded_output.data().iter())
719 .all(|(a, b)| (a - b).abs() < 1e-6);
720
721 println!(
722 "Serialization verification: {}",
723 if match_check { "PASSED" } else { "FAILED" }
724 );
725
726 Ok(())
727}
pub fn add_parameter(&mut self, parameter: &Tensor)
Add a single parameter to the optimizer
Links a parameter to the optimizer by creating a new parameter state
indexed by the tensor’s ID. The parameter must have requires_grad set to true.
§Arguments
parameter - Reference to the tensor to link
§Panics
Panics if the parameter does not have requires_grad set to true
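A minimal sketch of the linking pattern, using only the constructors and methods that appear in the examples on this page (`Tensor::randn`, `Tensor::zeros`, `with_requires_grad`, `Adam::new`, `parameter_count`); this is illustrative rather than a verified snippet:

```rust
// Sketch only: assumes the Tensor/Adam API used in the examples below.
let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
let bias = Tensor::zeros(vec![2]).with_requires_grad();

let mut optimizer = Adam::new();
optimizer.add_parameter(&weight); // creates state keyed by the tensor's ID
optimizer.add_parameter(&bias);
assert_eq!(optimizer.parameter_count(), 2);

// A tensor without requires_grad would trigger the panic documented above:
// let plain = Tensor::zeros(vec![2]);
// optimizer.add_parameter(&plain); // panics
```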
Examples found in repository
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims,
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims,
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
103
104/// Demonstrate simple linear regression training
105fn demonstrate_linear_regression() -> Result<(), Box<dyn std::error::Error>> {
106 println!("\n--- Linear Regression Training ---");
107
108 // Create model parameters
109 let mut weight = Tensor::randn(vec![1, 1], Some(43)).with_requires_grad();
110 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
111
112 // Create optimizer
113 let mut optimizer = Adam::with_learning_rate(0.01);
114 optimizer.add_parameter(&weight);
115 optimizer.add_parameter(&bias);
116
117 // Create simple training data: y = 2*x + 1
118 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
119 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
120
121 println!("Training data:");
122 println!(" X: {:?}", x_data.data());
123 println!(" Y: {:?}", y_true.data());
124 println!(" Target: y = 2*x + 1");
125
126 // Training loop
127 let num_epochs = 100;
128 let mut losses = Vec::new();
129
130 for epoch in 0..num_epochs {
131 // Forward pass: y_pred = x * weight + bias
132 let y_pred = x_data.matmul(&weight) + &bias;
133
134 // Compute loss: MSE
135 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
136
137 // Backward pass
138 loss.backward(None);
139
140 // Optimizer step
141 optimizer.step(&mut [&mut weight, &mut bias]);
142 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
143
144 losses.push(loss.value());
145
146 // Print progress every 20 epochs
147 if epoch % 20 == 0 || epoch == num_epochs - 1 {
148 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
149 }
150 }
151
152 // Evaluate final model
153 let final_predictions = x_data.matmul(&weight) + &bias;
154 println!("\nFinal model evaluation:");
155 println!(" Learned weight: {:.6}", weight.value());
156 println!(" Learned bias: {:.6}", bias.value());
157 println!(" Predictions vs True:");
158
159 for i in 0..5 {
160 let x1 = x_data.data()[i];
161 let pred = final_predictions.data()[i];
162 let true_val = y_true.data()[i];
163 println!(
164 " x={:.1}: pred={:.3}, true={:.1}, error={:.3}",
165 x1,
166 pred,
167 true_val,
168 (pred - true_val).abs()
169 );
170 }
171
172 Ok(())
173}
174
175/// Demonstrate advanced training patterns
176fn demonstrate_advanced_training() -> Result<(), Box<dyn std::error::Error>> {
177 println!("\n--- Advanced Training Patterns ---");
178
179 // Create a more complex model
180 let mut weight = Tensor::randn(vec![1, 2], Some(44)).with_requires_grad();
181 let mut bias = Tensor::zeros(vec![2]).with_requires_grad();
182
183 // Create optimizer with different learning rate
184 let mut optimizer = Adam::with_learning_rate(0.005);
185 optimizer.add_parameter(&weight);
186 optimizer.add_parameter(&bias);
187
188 // Create training data: y = 2*x + [1, 3]
189 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
190 let y_true = Tensor::from_slice(
191 &[3.0, 5.0, 7.0, 9.0, 11.0, 6.0, 8.0, 10.0, 12.0, 14.0],
192 vec![5, 2],
193 )
194 .unwrap();
195
196 println!("Advanced training with monitoring:");
197 println!(" Initial learning rate: {}", optimizer.learning_rate());
198
199 // Training loop with monitoring
200 let num_epochs = 50;
201 let mut losses = Vec::new();
202 let mut weight_norms = Vec::new();
203 let mut gradient_norms = Vec::new();
204
205 for epoch in 0..num_epochs {
206 // Forward pass
207 let y_pred = x_data.matmul(&weight) + &bias;
208 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
209
210 // Backward pass
211 loss.backward(None);
212
213 // Compute gradient norm before optimizer step
214 let gradient_norm = weight.grad_by_value().unwrap().norm();
215
216 // Optimizer step
217 optimizer.step(&mut [&mut weight, &mut bias]);
218 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
219
220 // Learning rate scheduling: reduce every 10 epochs
221 if epoch > 0 && epoch % 10 == 0 {
222 let current_lr = optimizer.learning_rate();
223 let new_lr = current_lr * 0.5;
224 optimizer.set_learning_rate(new_lr);
225 println!(
226 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
227 epoch, current_lr, new_lr
228 );
229 }
230
231 // Record metrics
232 losses.push(loss.value());
233 weight_norms.push(weight.norm().value());
234 gradient_norms.push(gradient_norm.value());
235
236 // Print detailed progress
237 if epoch % 10 == 0 || epoch == num_epochs - 1 {
238 println!(
239 "Epoch {:2}: Loss = {:.6}, Weight Norm = {:.6}, Gradient Norm = {:.6}",
240 epoch,
241 loss.value(),
242 weight.norm().value(),
243 gradient_norm.value()
244 );
245 }
246 }
247
248 println!("Final learning rate: {}", optimizer.learning_rate());
249
250 // Analyze training progression
251 let initial_loss = losses[0];
252 let final_loss = losses[losses.len() - 1];
253 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
254
255 println!("\nTraining Analysis:");
256 println!(" Initial loss: {:.6}", initial_loss);
257 println!(" Final loss: {:.6}", final_loss);
258 println!(" Loss reduction: {:.1}%", loss_reduction);
259 println!(" Final weight norm: {:.6}", weight.norm().value());
260 println!(" Final bias: {:?}", bias.data());
261
262 Ok(())
263}
264
265/// Demonstrate learning rate scheduling
266fn demonstrate_learning_rate_scheduling() -> Result<(), Box<dyn std::error::Error>> {
267 println!("\n--- Learning Rate Scheduling ---");
268
269 // Create simple model
270 let mut weight = Tensor::randn(vec![1, 1], Some(45)).with_requires_grad();
271 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
272
273 // Create optimizer with high initial learning rate
274 let mut optimizer = Adam::with_learning_rate(0.1);
275 optimizer.add_parameter(&weight);
276 optimizer.add_parameter(&bias);
277
278 // Simple data
279 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0], vec![3, 1]).unwrap();
280 let y_true = Tensor::from_slice(&[2.0, 4.0, 6.0], vec![3, 1]).unwrap();
281
282 println!("Initial learning rate: {}", optimizer.learning_rate());
283
284 // Training loop with learning rate scheduling
285 let num_epochs = 50;
286 let mut losses = Vec::new();
287
288 for epoch in 0..num_epochs {
289 // Forward pass
290 let y_pred = x_data.matmul(&weight) + &bias;
291 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
292
293 // Backward pass
294 loss.backward(None);
295
296 // Optimizer step
297 optimizer.step(&mut [&mut weight, &mut bias]);
298 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
299
300 // Learning rate scheduling: reduce every 10 epochs
301 if epoch > 0 && epoch % 10 == 0 {
302 let current_lr = optimizer.learning_rate();
303 let new_lr = current_lr * 0.5;
304 optimizer.set_learning_rate(new_lr);
305 println!(
306 "Epoch {:2}: Reduced learning rate from {:.3} to {:.3}",
307 epoch, current_lr, new_lr
308 );
309 }
310
311 losses.push(loss.value());
312
313 // Print progress
314 if epoch % 10 == 0 || epoch == num_epochs - 1 {
315 println!(
316 "Epoch {:2}: Loss = {:.6}, LR = {:.3}",
317 epoch,
318 loss.value(),
319 optimizer.learning_rate()
320 );
321 }
322 }
323
324 println!("Final learning rate: {}", optimizer.learning_rate());
325
326 Ok(())
327}
328
329/// Demonstrate training monitoring and analysis
330fn demonstrate_training_monitoring() -> Result<(), Box<dyn std::error::Error>> {
331 println!("\n--- Training Monitoring ---");
332
333 // Create model
334 let mut weight = Tensor::randn(vec![1, 1], Some(46)).with_requires_grad();
335 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
336
337 // Create optimizer
338 let mut optimizer = Adam::with_learning_rate(0.01);
339 optimizer.add_parameter(&weight);
340 optimizer.add_parameter(&bias);
341
342 // Training data
343 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![4, 1]).unwrap();
344 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0], vec![4, 1]).unwrap();
345
346 // Training loop with comprehensive monitoring
347 let num_epochs = 30;
348 let mut losses = Vec::new();
349 let mut weight_history = Vec::new();
350 let mut bias_history = Vec::new();
351
352 for epoch in 0..num_epochs {
353 // Forward pass
354 let y_pred = x_data.matmul(&weight) + &bias;
355 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
356
357 // Backward pass
358 loss.backward(None);
359
360 // Optimizer step
361 optimizer.step(&mut [&mut weight, &mut bias]);
362 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
363
364 // Record history
365 losses.push(loss.value());
366 weight_history.push(weight.value());
367 bias_history.push(bias.value());
368
369 // Print detailed monitoring
370 if epoch % 5 == 0 || epoch == num_epochs - 1 {
371 println!(
372 "Epoch {:2}: Loss = {:.6}, Weight = {:.6}, Bias = {:.6}",
373 epoch,
374 loss.value(),
375 weight.value(),
376 bias.value()
377 );
378 }
379 }
380
381 // Analyze training progression
382 println!("\nTraining Analysis:");
383 println!(" Initial loss: {:.6}", losses[0]);
384 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
385 println!(
386 " Loss reduction: {:.1}%",
387 (losses[0] - losses[losses.len() - 1]) / losses[0] * 100.0
388 );
389
390 // Compute statistics
391 let loss_mean = compute_mean(&losses);
392 let loss_std = compute_std(&losses);
393 let weight_change = (weight_history[weight_history.len() - 1] - weight_history[0]).abs();
394 let bias_change = (bias_history[bias_history.len() - 1] - bias_history[0]).abs();
395
396 println!(" Average loss: {:.6} ± {:.6}", loss_mean, loss_std);
397 println!(" Weight change: {:.6}", weight_change);
398 println!(" Bias change: {:.6}", bias_change);
399 println!(" Final weight norm: {:.6}", weight.norm().value());
400 println!(" Final bias: {:.6}", bias.value());
401
402 Ok(())
403}

More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}
166
167/// Demonstrate format comparison and performance characteristics
168fn demonstrate_format_comparison() -> Result<(), Box<dyn std::error::Error>> {
169 println!("\n--- Format Comparison ---");
170
171 // Create a larger tensor for comparison
172 let tensor = Tensor::randn(vec![10, 10], Some(44));
173
174 // Save in both formats
175 tensor.save_json("temp_comparison.json")?;
176 tensor.save_binary("temp_comparison.bin")?;
177
178 // Compare file sizes
179 let json_size = fs::metadata("temp_comparison.json")?.len();
180 let binary_size = fs::metadata("temp_comparison.bin")?.len();
181
182 println!("JSON file size: {} bytes", json_size);
183 println!("Binary file size: {} bytes", binary_size);
184 println!(
185 "Compression ratio: {:.2}x",
186 json_size as f64 / binary_size as f64
187 );
188
189 // Load and verify both formats
190 let json_tensor = Tensor::load_json("temp_comparison.json")?;
191 let binary_tensor = Tensor::load_binary("temp_comparison.bin")?;
192
193 assert_eq!(tensor.shape().dims, json_tensor.shape().dims);
194 assert_eq!(tensor.shape().dims, binary_tensor.shape().dims);
195 assert_eq!(tensor.data(), json_tensor.data());
196 assert_eq!(tensor.data(), binary_tensor.data());
197
198 println!("Format comparison verification: PASSED");
199
200 Ok(())
201}
202
203/// Demonstrate a basic model checkpointing workflow
204fn demonstrate_model_checkpointing() -> Result<(), Box<dyn std::error::Error>> {
205 println!("\n--- Model Checkpointing ---");
206
207 // Create a simple model (weights and bias)
208 let mut weights = Tensor::randn(vec![2, 1], Some(45)).with_requires_grad();
209 let mut bias = Tensor::randn(vec![1], Some(46)).with_requires_grad();
210
211 // Create optimizer
212 let mut optimizer = Adam::with_learning_rate(0.01);
213 optimizer.add_parameter(&weights);
214 optimizer.add_parameter(&bias);
215
216 println!("Initial weights: {:?}", weights.data());
217 println!("Initial bias: {:?}", bias.data());
218
219 // Simulate training
220 for epoch in 0..5 {
221 let mut loss = weights.sum() + bias.sum();
222 loss.backward(None);
223 optimizer.step(&mut [&mut weights, &mut bias]);
224 optimizer.zero_grad(&mut [&mut weights, &mut bias]);
225
226 if epoch % 2 == 0 {
227 // Save checkpoint
228 let checkpoint_dir = format!("checkpoint_epoch_{}", epoch);
229 fs::create_dir_all(&checkpoint_dir)?;
230
231 weights.save_json(format!("{}/weights.json", checkpoint_dir))?;
232 bias.save_json(format!("{}/bias.json", checkpoint_dir))?;
233 optimizer.save_json(format!("{}/optimizer.json", checkpoint_dir))?;
234
235 println!("Saved checkpoint for epoch {}", epoch);
236 }
237 }
238
239 // Load from checkpoint
240 let loaded_weights = Tensor::load_json("checkpoint_epoch_4/weights.json")?;
241 let loaded_bias = Tensor::load_json("checkpoint_epoch_4/bias.json")?;
242 let loaded_optimizer = Adam::load_json("checkpoint_epoch_4/optimizer.json")?;
243
244 println!("Loaded weights: {:?}", loaded_weights.data());
245 println!("Loaded bias: {:?}", loaded_bias.data());
246 println!(
247 "Loaded optimizer learning rate: {}",
248 loaded_optimizer.learning_rate()
249 );
250
251 // Verify checkpoint integrity
252 assert_eq!(weights.shape().dims, loaded_weights.shape().dims);
253 assert_eq!(bias.shape().dims, loaded_bias.shape().dims);
254 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
255
256 println!("Checkpointing verification: PASSED");
257
258 Ok(())
259}

84fn demonstrate_default_adam() -> Result<(), Box<dyn std::error::Error>> {
85 println!("--- Default Adam Configuration ---");
86
87 // Create a simple regression problem: y = 2*x + 1
88 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
89 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
90
91 // Create model parameters
92 let mut weight = Tensor::randn(vec![1, 1], Some(42)).with_requires_grad();
93 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
94
95 // Create Adam optimizer with default configuration
96 let mut optimizer = Adam::new();
97 optimizer.add_parameter(&weight);
98 optimizer.add_parameter(&bias);
99
100 println!("Default Adam configuration:");
101 println!(" Learning rate: {}", optimizer.learning_rate());
102 println!(" Initial weight: {:.6}", weight.value());
103 println!(" Initial bias: {:.6}", bias.value());
104
105 // Training loop
106 let num_epochs = 50;
107 let mut losses = Vec::new();
108
109 for epoch in 0..num_epochs {
110 // Forward pass
111 let y_pred = x_data.matmul(&weight) + &bias;
112 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
113
114 // Backward pass
115 loss.backward(None);
116
117 // Optimizer step
118 optimizer.step(&mut [&mut weight, &mut bias]);
119 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
120
121 losses.push(loss.value());
122
123 if epoch % 10 == 0 || epoch == num_epochs - 1 {
124 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
125 }
126 }
127
128 // Evaluate final model
129 let _final_predictions = x_data.matmul(&weight) + &bias;
130 println!("\nFinal model:");
131 println!(" Learned weight: {:.6} (target: 2.0)", weight.value());
132 println!(" Learned bias: {:.6} (target: 1.0)", bias.value());
133 println!(" Final loss: {:.6}", losses[losses.len() - 1]);
134
135 Ok(())
136}
137
138/// Demonstrate learning rate comparison
139fn demonstrate_learning_rate_comparison() -> Result<(), Box<dyn std::error::Error>> {
140 println!("\n--- Learning Rate Comparison ---");
141
142 let learning_rates = [0.001, 0.01, 0.1];
143 let mut results = Vec::new();
144
145 for &lr in &learning_rates {
146 println!("\nTesting learning rate: {}", lr);
147
148 let stats = train_with_config(TrainingConfig {
149 learning_rate: lr,
150 ..Default::default()
151 })?;
152
153 results.push((lr, stats.clone()));
154
155 println!(" Final loss: {:.6}", stats.final_loss);
156 println!(" Convergence epoch: {}", stats.convergence_epoch);
157 }
158
159 // Compare results
160 println!("\nLearning Rate Comparison Summary:");
161 for (lr, stats) in &results {
162 println!(
163 " LR={:6}: Loss={:.6}, Converged@{}",
164 lr, stats.final_loss, stats.convergence_epoch
165 );
166 }
167
168 Ok(())
169}
170
171/// Demonstrate weight decay comparison
172fn demonstrate_weight_decay_comparison() -> Result<(), Box<dyn std::error::Error>> {
173 println!("\n--- Weight Decay Comparison ---");
174
175 let weight_decays = [0.0, 0.001, 0.01];
176 let mut results = Vec::new();
177
178 for &wd in &weight_decays {
179 println!("\nTesting weight decay: {}", wd);
180
181 let stats = train_with_config(TrainingConfig {
182 weight_decay: wd,
183 ..Default::default()
184 })?;
185
186 results.push((wd, stats.clone()));
187
188 println!(" Final loss: {:.6}", stats.final_loss);
189 println!(" Final weight norm: {:.6}", stats.weight_norm);
190 }
191
192 // Compare results
193 println!("\nWeight Decay Comparison Summary:");
194 for (wd, stats) in &results {
195 println!(
196 " WD={:6}: Loss={:.6}, Weight Norm={:.6}",
197 wd, stats.final_loss, stats.weight_norm
198 );
199 }
200
201 Ok(())
202}
203
204/// Demonstrate beta parameter tuning
205fn demonstrate_beta_parameter_tuning() -> Result<(), Box<dyn std::error::Error>> {
206 println!("\n--- Beta Parameter Tuning ---");
207
208 let beta_configs = [
209 (0.9, 0.999), // Default
210 (0.8, 0.999), // More aggressive momentum
211 (0.95, 0.999), // Less aggressive momentum
212 (0.9, 0.99), // Faster second moment decay
213 ];
214
215 let mut results = Vec::new();
216
217 for (i, (beta1, beta2)) in beta_configs.iter().enumerate() {
218 println!(
219 "\nTesting beta configuration {}: beta1={}, beta2={}",
220 i + 1,
221 beta1,
222 beta2
223 );
224
225 let config = TrainingConfig {
226 beta1: *beta1,
227 beta2: *beta2,
228 ..Default::default()
229 };
230
231 let stats = train_with_config(config)?;
232 results.push(((*beta1, *beta2), stats.clone()));
233
234 println!(" Final loss: {:.6}", stats.final_loss);
235 println!(" Convergence epoch: {}", stats.convergence_epoch);
236 }
237
238 // Compare results
239 println!("\nBeta Parameter Comparison Summary:");
240 for ((beta1, beta2), stats) in &results {
241 println!(
242 " B1={:4}, B2={:5}: Loss={:.6}, Converged@{}",
243 beta1, beta2, stats.final_loss, stats.convergence_epoch
244 );
245 }
246
247 Ok(())
248}
249
250/// Demonstrate configuration benchmarking
251fn demonstrate_configuration_benchmarking() -> Result<(), Box<dyn std::error::Error>> {
252 println!("\n--- Configuration Benchmarking ---");
253
254 // Define configurations to benchmark
255 let configs = vec![
256 (
257 "Conservative",
258 TrainingConfig {
259 learning_rate: 0.001,
260 weight_decay: 0.001,
261 beta1: 0.95,
262 ..Default::default()
263 },
264 ),
265 (
266 "Balanced",
267 TrainingConfig {
268 learning_rate: 0.01,
269 weight_decay: 0.0,
270 beta1: 0.9,
271 ..Default::default()
272 },
273 ),
274 (
275 "Aggressive",
276 TrainingConfig {
277 learning_rate: 0.1,
278 weight_decay: 0.0,
279 beta1: 0.8,
280 ..Default::default()
281 },
282 ),
283 ];
284
285 let mut benchmark_results = Vec::new();
286
287 for (name, config) in configs {
288 println!("\nBenchmarking {} configuration:", name);
289
290 let start_time = std::time::Instant::now();
291 let stats = train_with_config(config.clone())?;
292 let elapsed = start_time.elapsed();
293
294        println!("  Training time: {:.2}ms", elapsed.as_secs_f64() * 1000.0);
295 println!(" Final loss: {:.6}", stats.final_loss);
296 println!(" Convergence: {} epochs", stats.convergence_epoch);
297
298 benchmark_results.push((name.to_string(), stats, elapsed));
299 }
300
301 // Summary
302 println!("\nBenchmarking Summary:");
303 for (name, stats, elapsed) in &benchmark_results {
304 println!(
305 " {:12}: Loss={:.6}, Time={:4}ms, Converged@{}",
306 name,
307 stats.final_loss,
308 elapsed.as_millis(),
309 stats.convergence_epoch
310 );
311 }
312
313 Ok(())
314}
315
316/// Helper function to train with specific configuration
317fn train_with_config(config: TrainingConfig) -> Result<TrainingStats, Box<dyn std::error::Error>> {
318 // Create training data
319 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
320 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
321
322 // Create model parameters
323 let mut weight = Tensor::randn(vec![1, 1], Some(123)).with_requires_grad();
324 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
325
326 // Create optimizer with custom configuration
327 let adam_config = AdamConfig {
328 learning_rate: config.learning_rate,
329 beta1: config.beta1,
330 beta2: config.beta2,
331 eps: 1e-8,
332 weight_decay: config.weight_decay,
333 amsgrad: false,
334 };
335
336 let mut optimizer = Adam::with_config(adam_config);
337 optimizer.add_parameter(&weight);
338 optimizer.add_parameter(&bias);
339
340 // Training loop
341 let mut losses = Vec::new();
342 let mut convergence_epoch = config.epochs;
343
344 for epoch in 0..config.epochs {
345 // Forward pass
346 let y_pred = x_data.matmul(&weight) + &bias;
347 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
348
349 // Backward pass
350 loss.backward(None);
351
352 // Optimizer step
353 optimizer.step(&mut [&mut weight, &mut bias]);
354 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
355
356 let loss_value = loss.value();
357 losses.push(loss_value);
358
359 // Check for convergence (loss < 0.01)
360 if loss_value < 0.01 && convergence_epoch == config.epochs {
361 convergence_epoch = epoch;
362 }
363 }
364
365 Ok(TrainingStats {
366 config,
367 final_loss: losses[losses.len() - 1],
368 loss_history: losses,
369 convergence_epoch,
370 weight_norm: weight.norm().value(),
371 })
372}
319fn train_with_scheduler(
320 scheduler: &mut dyn LearningRateScheduler,
321 num_epochs: usize,
322) -> Result<TrainingStats, Box<dyn std::error::Error>> {
323 // Create training data: y = 2*x + 1
324 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0], vec![5, 1]).unwrap();
325 let y_true = Tensor::from_slice(&[3.0, 5.0, 7.0, 9.0, 11.0], vec![5, 1]).unwrap();
326
327 // Create model parameters
328 let mut weight = Tensor::randn(vec![1, 1], Some(456)).with_requires_grad();
329 let mut bias = Tensor::zeros(vec![1]).with_requires_grad();
330
331 // Create optimizer with initial learning rate
332 let mut optimizer = Adam::with_learning_rate(0.05);
333 optimizer.add_parameter(&weight);
334 optimizer.add_parameter(&bias);
335
336 // Training loop
337 let mut losses = Vec::new();
338 let mut lr_history = Vec::new();
339 let mut convergence_epoch = num_epochs;
340
341 for epoch in 0..num_epochs {
342 // Forward pass
343 let y_pred = x_data.matmul(&weight) + &bias;
344 let mut loss = (&y_pred - &y_true).pow_scalar(2.0).mean();
345
346 // Backward pass
347 loss.backward(None);
348
349 // Update learning rate using scheduler
350 let current_lr = optimizer.learning_rate();
351 let new_lr = scheduler.step(current_lr, epoch, loss.value());
352
353 if (new_lr - current_lr).abs() > 1e-8 {
354 optimizer.set_learning_rate(new_lr);
355 }
356
357 // Optimizer step
358 optimizer.step(&mut [&mut weight, &mut bias]);
359 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
360
361 let loss_value = loss.value();
362 losses.push(loss_value);
363 lr_history.push(new_lr);
364
365 // Check for convergence
366 if loss_value < 0.01 && convergence_epoch == num_epochs {
367 convergence_epoch = epoch;
368 }
369 }
370
371 Ok(TrainingStats {
372 scheduler_name: scheduler.name().to_string(),
373 final_loss: losses[losses.len() - 1],
374 lr_history,
375 loss_history: losses,
376 convergence_epoch,
377 })
378}
430fn demonstrate_training_workflow() -> Result<(), Box<dyn std::error::Error>> {
431 println!("\n--- Training Workflow ---");
432
433 // Create a simple classification network
434 let config = FeedForwardConfig {
435 input_size: 2,
436 hidden_sizes: vec![4, 3],
437 output_size: 1,
438 use_bias: true,
439 };
440 let mut network = FeedForwardNetwork::new(config, Some(46));
441
442 println!("Training network: 2 -> [4, 3] -> 1");
443
444 // Create simple binary classification data: XOR problem
445 let x_data = Tensor::from_slice(
446 &[
447 0.0, 0.0, // -> 0
448 0.0, 1.0, // -> 1
449 1.0, 0.0, // -> 1
450 1.0, 1.0, // -> 0
451 ],
452 vec![4, 2],
453 )
454 .unwrap();
455
456 let y_true = Tensor::from_slice(&[0.0, 1.0, 1.0, 0.0], vec![4, 1]).unwrap();
457
458 println!("Training on XOR problem:");
459 println!(" Input shape: {:?}", x_data.shape().dims);
460 println!(" Target shape: {:?}", y_true.shape().dims);
461
462 // Create optimizer
463 let mut optimizer = Adam::with_learning_rate(0.1);
464 let params = network.parameters();
465    for param in &params {
466 optimizer.add_parameter(param);
467 }
468
469 // Training loop
470 let num_epochs = 50;
471 let mut losses = Vec::new();
472
473 for epoch in 0..num_epochs {
474 // Forward pass
475 let y_pred = network.forward(&x_data);
476
477 // Compute loss: MSE
478 let diff = y_pred.sub_tensor(&y_true);
479 let mut loss = diff.pow_scalar(2.0).mean();
480
481 // Backward pass
482 loss.backward(None);
483
484 // Optimizer step and zero grad
485 let mut params = network.parameters();
486 optimizer.step(&mut params);
487 optimizer.zero_grad(&mut params);
488
489 losses.push(loss.value());
490
491 // Print progress
492 if epoch % 10 == 0 || epoch == num_epochs - 1 {
493 println!("Epoch {:2}: Loss = {:.6}", epoch, loss.value());
494 }
495 }
496
497 // Test final model
498 let final_predictions = network.forward_no_grad(&x_data);
499 println!("\nFinal predictions vs targets:");
500 for i in 0..4 {
501 let pred = final_predictions.data()[i];
502 let target = y_true.data()[i];
503 let input_x = x_data.data()[i * 2];
504 let input_y = x_data.data()[i * 2 + 1];
505 println!(
506 " [{:.0}, {:.0}] -> pred: {:.3}, target: {:.0}, error: {:.3}",
507 input_x,
508 input_y,
509 pred,
510 target,
511 (pred - target).abs()
512 );
513 }
514
515 Ok(())
516}
517
518/// Demonstrate comprehensive training with 100+ steps
519fn demonstrate_comprehensive_training() -> Result<(), Box<dyn std::error::Error>> {
520 println!("\n--- Comprehensive Training (100+ Steps) ---");
521
522 // Create a regression network
523 let config = FeedForwardConfig {
524 input_size: 3,
525 hidden_sizes: vec![8, 6, 4],
526 output_size: 2,
527 use_bias: true,
528 };
529 let mut network = FeedForwardNetwork::new(config, Some(47));
530
531 println!("Network architecture: 3 -> [8, 6, 4] -> 2");
532 println!("Total parameters: {}", network.parameter_count());
533
534 // Create synthetic regression data
535 // Target function: [y1, y2] = [x1 + 2*x2 - x3, x1*x2 + x3]
536 let num_samples = 32;
537 let mut x_vec = Vec::new();
538 let mut y_vec = Vec::new();
539
540 for i in 0..num_samples {
541 let x1 = (i as f32 / num_samples as f32) * 2.0 - 1.0; // [-1, 1]
542 let x2 = ((i * 2) as f32 / num_samples as f32) * 2.0 - 1.0;
543 let x3 = ((i * 3) as f32 / num_samples as f32) * 2.0 - 1.0;
544
545 let y1 = x1 + 2.0 * x2 - x3;
546 let y2 = x1 * x2 + x3;
547
548 x_vec.extend_from_slice(&[x1, x2, x3]);
549 y_vec.extend_from_slice(&[y1, y2]);
550 }
551
552 let x_data = Tensor::from_slice(&x_vec, vec![num_samples, 3]).unwrap();
553 let y_true = Tensor::from_slice(&y_vec, vec![num_samples, 2]).unwrap();
554
555 println!("Training data:");
556 println!(" {} samples", num_samples);
557 println!(" Input shape: {:?}", x_data.shape().dims);
558 println!(" Target shape: {:?}", y_true.shape().dims);
559
560 // Create optimizer with learning rate scheduling
561 let mut optimizer = Adam::with_learning_rate(0.01);
562 let params = network.parameters();
563    for param in &params {
564 optimizer.add_parameter(param);
565 }
566
567 // Comprehensive training loop (150 epochs)
568 let num_epochs = 150;
569 let mut losses = Vec::new();
570 let mut best_loss = f32::INFINITY;
571 let mut patience_counter = 0;
572 let patience = 20;
573
574 println!("Starting comprehensive training...");
575
576 for epoch in 0..num_epochs {
577 // Forward pass
578 let y_pred = network.forward(&x_data);
579
580 // Compute loss: MSE
581 let diff = y_pred.sub_tensor(&y_true);
582 let mut loss = diff.pow_scalar(2.0).mean();
583
584 // Backward pass
585 loss.backward(None);
586
587 // Optimizer step and zero grad
588 let mut params = network.parameters();
589 optimizer.step(&mut params);
590 optimizer.zero_grad(&mut params);
591
592 let current_loss = loss.value();
593 losses.push(current_loss);
594
595 // Learning rate scheduling
596 if epoch > 0 && epoch % 30 == 0 {
597 let new_lr = optimizer.learning_rate() * 0.8;
598 optimizer.set_learning_rate(new_lr);
599 println!(" Reduced learning rate to {:.4}", new_lr);
600 }
601
602 // Early stopping logic
603 if current_loss < best_loss {
604 best_loss = current_loss;
605 patience_counter = 0;
606 } else {
607 patience_counter += 1;
608 }
609
610 // Print progress
611 if epoch % 25 == 0 || epoch == num_epochs - 1 {
612 println!(
613 "Epoch {:3}: Loss = {:.6}, LR = {:.4}, Best = {:.6}",
614 epoch,
615 current_loss,
616 optimizer.learning_rate(),
617 best_loss
618 );
619 }
620
621 // Early stopping
622 if patience_counter >= patience && epoch > 50 {
623 println!("Early stopping at epoch {} (patience exceeded)", epoch);
624 break;
625 }
626 }
627
628 // Final evaluation
629 let final_predictions = network.forward_no_grad(&x_data);
630
631 // Compute final metrics
632 let final_loss = losses[losses.len() - 1];
633 let initial_loss = losses[0];
634 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
635
636 println!("\nTraining completed!");
637 println!(" Initial loss: {:.6}", initial_loss);
638 println!(" Final loss: {:.6}", final_loss);
639 println!(" Best loss: {:.6}", best_loss);
640 println!(" Loss reduction: {:.1}%", loss_reduction);
641 println!(" Final learning rate: {:.4}", optimizer.learning_rate());
642
643 // Sample predictions analysis
644 println!("\nSample predictions (first 5):");
645 for i in 0..5.min(num_samples) {
646 let pred1 = final_predictions.data()[i * 2];
647 let pred2 = final_predictions.data()[i * 2 + 1];
648 let true1 = y_true.data()[i * 2];
649 let true2 = y_true.data()[i * 2 + 1];
650
651 println!(
652 " Sample {}: pred=[{:.3}, {:.3}], true=[{:.3}, {:.3}], error=[{:.3}, {:.3}]",
653 i + 1,
654 pred1,
655 pred2,
656 true1,
657 true2,
658 (pred1 - true1).abs(),
659 (pred2 - true2).abs()
660 );
661 }
662
663 Ok(())
664}
665
666/// Demonstrate network serialization
667fn demonstrate_network_serialization() -> Result<(), Box<dyn std::error::Error>> {
668 println!("\n--- Network Serialization ---");
669
670 // Create and train a network
671 let config = FeedForwardConfig {
672 input_size: 2,
673 hidden_sizes: vec![4, 2],
674 output_size: 1,
675 use_bias: true,
676 };
677 let mut original_network = FeedForwardNetwork::new(config.clone(), Some(48));
678
679 // Quick training
680 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
681 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
682
683 let mut optimizer = Adam::with_learning_rate(0.01);
684 let params = original_network.parameters();
685    for param in &params {
686 optimizer.add_parameter(param);
687 }
688
689 for _ in 0..20 {
690 let y_pred = original_network.forward(&x_data);
691 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
692 loss.backward(None);
693
694 let mut params = original_network.parameters();
695 optimizer.step(&mut params);
696 optimizer.zero_grad(&mut params);
697 }
698
699 // Test original network
700 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
701 let original_output = original_network.forward_no_grad(&test_input);
702
703 println!("Original network output: {:?}", original_output.data());
704
705 // Save network
706 original_network.save_json("temp_feedforward_network")?;
707
708 // Load network
709 let loaded_network = FeedForwardNetwork::load_json("temp_feedforward_network", config)?;
710 let loaded_output = loaded_network.forward_no_grad(&test_input);
711
712 println!("Loaded network output: {:?}", loaded_output.data());
713
714 // Verify consistency
715 let match_check = original_output
716 .data()
717 .iter()
718 .zip(loaded_output.data().iter())
719 .all(|(a, b)| (a - b).abs() < 1e-6);
720
721 println!(
722 "Serialization verification: {}",
723 if match_check { "PASSED" } else { "FAILED" }
724 );
725
726 Ok(())
727}
217fn demonstrate_training_loop() -> Result<(), Box<dyn std::error::Error>> {
218 println!("\n--- Training Loop ---");
219
220 // Create layer and training data
221 let mut layer = LinearLayer::new(2, 1, Some(45));
222
223 // Simple regression task: y = 2*x1 + 3*x2 + 1
224 let x_data = Tensor::from_slice(
225 &[
226 1.0, 1.0, // x1=1, x2=1 -> y=6
227 2.0, 1.0, // x1=2, x2=1 -> y=8
228 1.0, 2.0, // x1=1, x2=2 -> y=9
229 2.0, 2.0, // x1=2, x2=2 -> y=11
230 ],
231 vec![4, 2],
232 )
233 .unwrap();
234
235 let y_true = Tensor::from_slice(&[6.0, 8.0, 9.0, 11.0], vec![4, 1]).unwrap();
236
237 println!("Training data:");
238 println!(" X shape: {:?}", x_data.shape().dims);
239 println!(" Y shape: {:?}", y_true.shape().dims);
240 println!(" Target function: y = 2*x1 + 3*x2 + 1");
241
242 // Create optimizer
243 let config = AdamConfig {
244 learning_rate: 0.01,
245 beta1: 0.9,
246 beta2: 0.999,
247 eps: 1e-8,
248 weight_decay: 0.0,
249 amsgrad: false,
250 };
251
252 let mut optimizer = Adam::with_config(config);
253 let params = layer.parameters();
254    for param in &params {
255 optimizer.add_parameter(param);
256 }
257
258 println!("Optimizer setup complete. Starting training...");
259
260 // Training loop
261 let num_epochs = 100;
262 let mut losses = Vec::new();
263
264 for epoch in 0..num_epochs {
265 // Forward pass
266 let y_pred = layer.forward(&x_data);
267
268 // Compute loss: MSE
269 let diff = y_pred.sub_tensor(&y_true);
270 let mut loss = diff.pow_scalar(2.0).mean();
271
272 // Backward pass
273 loss.backward(None);
274
275 // Optimizer step
276 let mut params = layer.parameters();
277 optimizer.step(&mut params);
278 optimizer.zero_grad(&mut params);
279
280 losses.push(loss.value());
281
282 // Print progress
283 if epoch % 20 == 0 || epoch == num_epochs - 1 {
284 println!("Epoch {:3}: Loss = {:.6}", epoch, loss.value());
285 }
286 }
287
288 // Evaluate final model
289 let final_predictions = layer.forward_no_grad(&x_data);
290
291 println!("\nFinal model evaluation:");
292 println!(" Learned weights: {:?}", layer.weight.data());
293 println!(" Learned bias: {:?}", layer.bias.data());
294 println!(" Target weights: [2.0, 3.0]");
295 println!(" Target bias: [1.0]");
296
297 println!(" Predictions vs True:");
298 for i in 0..4 {
299 let pred = final_predictions.data()[i];
300 let true_val = y_true.data()[i];
301 println!(
302 " Sample {}: pred={:.3}, true={:.1}, error={:.3}",
303 i + 1,
304 pred,
305 true_val,
306 (pred - true_val).abs()
307 );
308 }
309
310 // Training analysis
311 let initial_loss = losses[0];
312 let final_loss = losses[losses.len() - 1];
313 let loss_reduction = (initial_loss - final_loss) / initial_loss * 100.0;
314
315 println!("\nTraining Analysis:");
316 println!(" Initial loss: {:.6}", initial_loss);
317 println!(" Final loss: {:.6}", final_loss);
318 println!(" Loss reduction: {:.1}%", loss_reduction);
319
320 Ok(())
321}
322
323/// Demonstrate single vs batch inference
324fn demonstrate_single_vs_batch_inference() {
325 println!("\n--- Single vs Batch Inference ---");
326
327 let layer = LinearLayer::new(4, 3, Some(46));
328
329 // Single inference
330 println!("Single inference:");
331 let single_input = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![1, 4]).unwrap();
332 let single_output = layer.forward_no_grad(&single_input);
333 println!(" Input shape: {:?}", single_input.shape().dims);
334 println!(" Output shape: {:?}", single_output.shape().dims);
335 println!(" Output: {:?}", single_output.data());
336
337 // Batch inference
338 println!("Batch inference:");
339 let batch_input = Tensor::from_slice(
340 &[
341 1.0, 2.0, 3.0, 4.0, // Sample 1
342 5.0, 6.0, 7.0, 8.0, // Sample 2
343 9.0, 10.0, 11.0, 12.0, // Sample 3
344 ],
345 vec![3, 4],
346 )
347 .unwrap();
348 let batch_output = layer.forward_no_grad(&batch_input);
349 println!(" Input shape: {:?}", batch_input.shape().dims);
350 println!(" Output shape: {:?}", batch_output.shape().dims);
351
352 // Verify batch consistency - first sample should match single inference
352    let _first_batch_sample = batch_output.view(vec![3, 3]); // No-op view; output is already [3, 3]
354 let first_sample_data = &batch_output.data()[0..3]; // First 3 elements
355 let single_sample_data = single_output.data();
356
357 println!("Consistency check:");
358 println!(" Single output: {:?}", single_sample_data);
359 println!(" First batch sample: {:?}", first_sample_data);
360 println!(
361 " Match: {}",
362 single_sample_data
363 .iter()
364 .zip(first_sample_data.iter())
365 .all(|(a, b)| (a - b).abs() < 1e-6)
366 );
367}
368
369/// Demonstrate serialization and loading
370fn demonstrate_serialization() -> Result<(), Box<dyn std::error::Error>> {
371 println!("\n--- Serialization ---");
372
373 // Create and train a simple layer
374 let mut original_layer = LinearLayer::new(2, 1, Some(47));
375
376 // Simple training data
377 let x_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], vec![2, 2]).unwrap();
378 let y_true = Tensor::from_slice(&[5.0, 11.0], vec![2, 1]).unwrap();
379
380 let mut optimizer = Adam::with_learning_rate(0.01);
381 let params = original_layer.parameters();
382    for param in &params {
383 optimizer.add_parameter(param);
384 }
385
386 // Train for a few epochs
387 for _ in 0..10 {
388 let y_pred = original_layer.forward(&x_data);
389 let mut loss = (y_pred.sub_tensor(&y_true)).pow_scalar(2.0).mean();
390 loss.backward(None);
391
392 let mut params = original_layer.parameters();
393 optimizer.step(&mut params);
394 optimizer.zero_grad(&mut params);
395 }
396
397 println!("Original layer trained");
398 println!(" Weight: {:?}", original_layer.weight.data());
399 println!(" Bias: {:?}", original_layer.bias.data());
400
401 // Save layer
402 original_layer.save_json("temp_linear_layer")?;
403
404 // Load layer
405 let loaded_layer = LinearLayer::load_json("temp_linear_layer", 2, 1)?;
406
407 println!("Loaded layer");
408 println!(" Weight: {:?}", loaded_layer.weight.data());
409 println!(" Bias: {:?}", loaded_layer.bias.data());
410
411 // Verify consistency
412 let test_input = Tensor::from_slice(&[1.0, 1.0], vec![1, 2]).unwrap();
413 let original_output = original_layer.forward_no_grad(&test_input);
414 let loaded_output = loaded_layer.forward_no_grad(&test_input);
415
416 println!("Consistency check:");
417 println!(" Original output: {:?}", original_output.data());
418 println!(" Loaded output: {:?}", loaded_output.data());
419 println!(
420 " Match: {}",
421 original_output
422 .data()
423 .iter()
424 .zip(loaded_output.data().iter())
425 .all(|(a, b)| (a - b).abs() < 1e-6)
426 );
427
428 println!("Serialization verification: PASSED");
429
430 Ok(())
431}
Sourcepub fn add_parameters(&mut self, parameters: &[&Tensor])
pub fn add_parameters(&mut self, parameters: &[&Tensor])
Add multiple parameters to the optimizer
Links multiple parameters to the optimizer by creating parameter states
indexed by each tensor’s ID. All parameters must have requires_grad set to true.
§Arguments
parameters- Slice of references to tensors to link
§Panics
Panics if any parameter does not have requires_grad set to true
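A minimal sketch of batch linking, using only the constructors and methods documented on this page:

```rust
use train_station::{optimizers::Adam, Tensor};

fn main() {
    // Create parameters that require gradients
    let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
    let bias = Tensor::zeros(vec![2]).with_requires_grad();

    let mut optimizer = Adam::new();
    // Link both parameters in one call instead of repeated add_parameter calls
    optimizer.add_parameters(&[&weight, &bias]);
    assert_eq!(optimizer.parameter_count(), 2);
}
```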
Sourcepub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool
pub fn unlink_parameter(&mut self, parameter: &Tensor) -> bool
Remove a parameter state from the optimizer by tensor ID
Returns true if the parameter was linked and its state was removed, false if no state existed for the tensor’s ID.
Sourcepub fn clear_states(&mut self)
pub fn clear_states(&mut self)
Remove all parameter states from the optimizer
Clears all parameter states, effectively unlinking all parameters. This is useful for resetting the optimizer or preparing for parameter re-linking.
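A sketch of dynamic parameter management, combining unlink_parameter, is_parameter_linked, and clear_states (the bool return of unlink_parameter is assumed to signal whether a state was removed):

```rust
use train_station::{optimizers::Adam, Tensor};

fn main() {
    let weight = Tensor::randn(vec![2, 2], Some(1)).with_requires_grad();
    let bias = Tensor::zeros(vec![2]).with_requires_grad();

    let mut optimizer = Adam::new();
    optimizer.add_parameters(&[&weight, &bias]);

    // Drop the state for a single parameter by its tensor ID...
    assert!(optimizer.unlink_parameter(&bias));
    assert!(!optimizer.is_parameter_linked(&bias));

    // ...or reset the optimizer entirely
    optimizer.clear_states();
    assert_eq!(optimizer.parameter_count(), 0);
}
```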
Sourcepub fn is_parameter_linked(&self, parameter: &Tensor) -> bool
pub fn is_parameter_linked(&self, parameter: &Tensor) -> bool
Check whether a parameter is currently linked to the optimizer
Returns true if a state exists for the tensor’s ID, false otherwise.
Sourcepub fn parameter_count(&self) -> usize
pub fn parameter_count(&self) -> usize
Get the number of linked parameters
Returns the count of parameters currently linked to the optimizer.
§Returns
Number of linked parameters
Examples found in repository?
47fn demonstrate_basic_optimizer_setup() {
48 println!("--- Basic Optimizer Setup ---");
49
50 // Create parameters that require gradients
51 let weight = Tensor::randn(vec![3, 2], Some(42)).with_requires_grad();
52 let bias = Tensor::zeros(vec![2]).with_requires_grad();
53
54 println!("Created parameters:");
55 println!(
56 " Weight: shape {:?}, requires_grad: {}",
57 weight.shape().dims,
58 weight.requires_grad()
59 );
60 println!(
61 " Bias: shape {:?}, requires_grad: {}",
62 bias.shape().dims,
63 bias.requires_grad()
64 );
65
66 // Create Adam optimizer with default configuration
67 let mut optimizer = Adam::new();
68 println!(
69 "Created Adam optimizer with learning rate: {}",
70 optimizer.learning_rate()
71 );
72
73 // Add parameters to optimizer
74 optimizer.add_parameter(&weight);
75 optimizer.add_parameter(&bias);
76 println!(
77 "Added {} parameters to optimizer",
78 optimizer.parameter_count()
79 );
80
81 // Create optimizer with custom configuration
82 let config = AdamConfig {
83 learning_rate: 0.01,
84 beta1: 0.9,
85 beta2: 0.999,
86 eps: 1e-8,
87 weight_decay: 0.0,
88 amsgrad: false,
89 };
90
91 let mut custom_optimizer = Adam::with_config(config);
92 custom_optimizer.add_parameter(&weight);
93 custom_optimizer.add_parameter(&bias);
94
95 println!(
96 "Created custom optimizer with learning rate: {}",
97 custom_optimizer.learning_rate()
98 );
99
100 // Demonstrate parameter linking
101 println!("Parameter linking completed successfully");
102}
More examples
109fn demonstrate_optimizer_serialization() -> Result<(), Box<dyn std::error::Error>> {
110 println!("\n--- Optimizer Serialization ---");
111
112 // Create an optimizer with some parameters
113 let mut weight = Tensor::randn(vec![2, 2], Some(42)).with_requires_grad();
114 let mut bias = Tensor::randn(vec![2], Some(43)).with_requires_grad();
115
116 let config = AdamConfig {
117 learning_rate: 0.001,
118 beta1: 0.9,
119 beta2: 0.999,
120 eps: 1e-8,
121 weight_decay: 0.0,
122 amsgrad: false,
123 };
124
125 let mut optimizer = Adam::with_config(config);
126 optimizer.add_parameter(&weight);
127 optimizer.add_parameter(&bias);
128
129 println!(
130 "Created optimizer with {} parameters",
131 optimizer.parameter_count()
132 );
133 println!("Learning rate: {}", optimizer.learning_rate());
134
135 // Simulate some training steps
136 for _ in 0..3 {
137 let mut loss = weight.sum() + bias.sum();
138 loss.backward(None);
139 optimizer.step(&mut [&mut weight, &mut bias]);
140 optimizer.zero_grad(&mut [&mut weight, &mut bias]);
141 }
142
143 // Save optimizer state
144 let optimizer_path = "temp_optimizer.json";
145 optimizer.save_json(optimizer_path)?;
146 println!("Saved optimizer to: {}", optimizer_path);
147
148 // Load optimizer state
149 let loaded_optimizer = Adam::load_json(optimizer_path)?;
150 println!(
151 "Loaded optimizer with {} parameters",
152 loaded_optimizer.parameter_count()
153 );
154 println!("Learning rate: {}", loaded_optimizer.learning_rate());
155
156 // Verify optimizer state
157 assert_eq!(
158 optimizer.parameter_count(),
159 loaded_optimizer.parameter_count()
160 );
161 assert_eq!(optimizer.learning_rate(), loaded_optimizer.learning_rate());
162 println!("Optimizer serialization verification: PASSED");
163
164 Ok(())
165}
Sourcepub fn relink_parameters(
&mut self,
parameters: &[&Tensor],
) -> Result<(), String>
pub fn relink_parameters( &mut self, parameters: &[&Tensor], ) -> Result<(), String>
Re-link parameters to saved optimizer states in chronological order
After deserializing an optimizer, use this method to restore saved parameter states to new tensors. Parameters must be provided in the same chronological order they were originally added to the optimizer. Shape validation ensures parameter compatibility.
§Arguments
parameters- Slice of parameter references in chronological order
§Returns
Result indicating success or failure with detailed error message
§Panics
Panics if any parameter does not have requires_grad set to true
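A sketch of a serialization round trip followed by re-linking; note the parameters are passed to relink_parameters in the same chronological order they were originally added:

```rust
use train_station::{optimizers::Adam, serialization::Serializable, Tensor};

fn main() {
    let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
    let bias = Tensor::zeros(vec![3]).with_requires_grad();

    let mut optimizer = Adam::new();
    optimizer.add_parameter(&weight);
    optimizer.add_parameter(&bias);

    // Round-trip through JSON; parameter links are not serialized
    let json = optimizer.to_json().unwrap();
    let mut restored = Adam::from_json(&json).unwrap();

    // Re-link in insertion order; shapes are validated against saved states
    restored.relink_parameters(&[&weight, &bias]).unwrap();
}
```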
Sourcepub fn config(&self) -> &AdamConfig
pub fn config(&self) -> &AdamConfig
Get the current optimizer configuration
Returns a reference to the current configuration, allowing inspection of all hyperparameters without modification.
§Returns
Reference to the current Adam configuration
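A small sketch of configuration inspection (the AdamConfig import path alongside Adam is an assumption based on the examples above):

```rust
use train_station::optimizers::{Adam, AdamConfig};

fn main() {
    let optimizer = Adam::with_config(AdamConfig {
        learning_rate: 0.01,
        beta1: 0.9,
        beta2: 0.999,
        eps: 1e-8,
        weight_decay: 0.0,
        amsgrad: false,
    });

    // Inspect hyperparameters without mutating the optimizer
    let config = optimizer.config();
    assert_eq!(config.learning_rate, 0.01);
    assert!(!config.amsgrad);
}
```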
Trait Implementations§
Source§impl FromFieldValue for Adam
impl FromFieldValue for Adam
Source§fn from_field_value(
value: FieldValue,
field_name: &str,
) -> SerializationResult<Self>
fn from_field_value( value: FieldValue, field_name: &str, ) -> SerializationResult<Self>
Source§impl Optimizer for Adam
impl Optimizer for Adam
Source§fn step(&mut self, parameters: &mut [&mut Tensor])
fn step(&mut self, parameters: &mut [&mut Tensor])
Perform a single optimization step
Updates all provided parameters based on their accumulated gradients using the Adam algorithm. Each parameter is updated according to the Adam update rule with bias correction and optional AMSGrad variant if enabled. All parameters must be linked to the optimizer before calling this method.
§Arguments
parameters- Mutable slice of parameter references to update
§Thread Safety
This method is thread-safe as it takes mutable references to parameters, ensuring exclusive access during updates.
§Performance
- Uses SIMD optimization (AVX2) when available for 8x vectorization
- Processes parameters in sequence for optimal cache usage
- Maintains per-parameter state for momentum and velocity estimates
§Panics
Panics if any parameter is not linked to the optimizer
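A minimal forward/backward/update cycle; the Optimizer trait import is an assumption (trait methods like step and zero_grad generally need the trait in scope):

```rust
use train_station::{optimizers::Adam, optimizers::Optimizer, Tensor};

fn main() {
    let mut weight = Tensor::randn(vec![2, 2], Some(7)).with_requires_grad();

    let mut optimizer = Adam::with_learning_rate(0.01);
    // Parameters must be linked before step, or step panics
    optimizer.add_parameter(&weight);

    // One forward/backward/update cycle
    let mut loss = weight.pow_scalar(2.0).mean();
    loss.backward(None);
    optimizer.step(&mut [&mut weight]);
    optimizer.zero_grad(&mut [&mut weight]);
}
```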
Source§fn zero_grad(&mut self, parameters: &mut [&mut Tensor])
fn zero_grad(&mut self, parameters: &mut [&mut Tensor])
Zero out all parameter gradients
Clears accumulated gradients for all provided parameters. This should be called before each backward pass to prevent gradient accumulation across multiple forward/backward passes. Also clears the global autograd gradient map.
§Arguments
parameters- Mutable slice of parameter references to clear gradients for
§Performance
- Efficiently clears gradients using optimized tensor operations
- Clears both per-tensor gradients and global autograd state
- Thread-safe as it takes mutable references to parameters
Source§fn learning_rate(&self) -> f32
fn learning_rate(&self) -> f32
Get the current learning rate
Returns the current learning rate used for parameter updates.
§Returns
Current learning rate as f32
Source§fn set_learning_rate(&mut self, lr: f32)
fn set_learning_rate(&mut self, lr: f32)
Set the learning rate for all parameters
Updates the learning rate for all parameters in the optimizer. This allows dynamic learning rate scheduling during training.
§Arguments
lr- New learning rate value
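A sketch of step-decay scheduling built on learning_rate and set_learning_rate, mirroring the schedule used in the comprehensive-training example above (the Optimizer trait import is an assumption):

```rust
use train_station::optimizers::{Adam, Optimizer};

fn main() {
    let mut optimizer = Adam::with_learning_rate(0.05);

    // Simple step decay: multiply the learning rate by 0.8 every 30 epochs
    for epoch in 0..100 {
        if epoch > 0 && epoch % 30 == 0 {
            let new_lr = optimizer.learning_rate() * 0.8;
            optimizer.set_learning_rate(new_lr);
        }
        // ...forward pass, backward pass, and optimizer.step(...) elided...
    }
}
```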
Source§impl Serializable for Adam
impl Serializable for Adam
Source§fn to_json(&self) -> SerializationResult<String>
fn to_json(&self) -> SerializationResult<String>
Serialize the Adam optimizer to JSON format
This method converts the Adam optimizer into a human-readable JSON string representation that includes all optimizer state, configuration, parameter states, and step counts. The JSON format is suitable for debugging, configuration files, and cross-language interoperability.
§Returns
JSON string representation of the optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let json = optimizer.to_json().unwrap();
assert!(!json.is_empty());
Source§fn from_json(json: &str) -> SerializationResult<Self>
fn from_json(json: &str) -> SerializationResult<Self>
Deserialize an Adam optimizer from JSON format
This method parses a JSON string and reconstructs an Adam optimizer with all
saved state. Parameters must be re-linked after deserialization using
add_parameter or relink_parameters.
§Arguments
json- JSON string containing serialized optimizer
§Returns
The deserialized optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let json = optimizer.to_json().unwrap();
let loaded_optimizer = Adam::from_json(&json).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§fn to_binary(&self) -> SerializationResult<Vec<u8>>
fn to_binary(&self) -> SerializationResult<Vec<u8>>
Serialize the Adam optimizer to binary format
This method converts the optimizer into a compact binary representation optimized for storage and transmission. The binary format provides maximum performance and minimal file sizes compared to JSON.
§Returns
Binary representation of the optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let binary = optimizer.to_binary().unwrap();
assert!(!binary.is_empty());
Source§fn from_binary(data: &[u8]) -> SerializationResult<Self>
fn from_binary(data: &[u8]) -> SerializationResult<Self>
Deserialize an Adam optimizer from binary format
This method parses binary data and reconstructs an Adam optimizer with all
saved state. Parameters must be re-linked after deserialization using
add_parameter or relink_parameters.
§Arguments
data- Binary data containing serialized optimizer
§Returns
The deserialized optimizer on success, or SerializationError on failure
§Examples
use train_station::{Tensor, optimizers::Adam};
use train_station::serialization::Serializable;
let weight = Tensor::ones(vec![2, 3]).with_requires_grad();
let mut optimizer = Adam::new();
optimizer.add_parameter(&weight);
let binary = optimizer.to_binary().unwrap();
let loaded_optimizer = Adam::from_binary(&binary).unwrap();
assert_eq!(loaded_optimizer.saved_parameter_count(), 1);
Source§fn save<P: AsRef<Path>>(
&self,
path: P,
format: Format,
) -> SerializationResult<()>
fn save<P: AsRef<Path>>( &self, path: P, format: Format, ) -> SerializationResult<()>
Source§fn save_to_writer<W: Write>(
&self,
writer: &mut W,
format: Format,
) -> SerializationResult<()>
fn save_to_writer<W: Write>( &self, writer: &mut W, format: Format, ) -> SerializationResult<()>
Source§fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>
fn load<P: AsRef<Path>>(path: P, format: Format) -> SerializationResult<Self>
Source§fn load_from_reader<R: Read>(
reader: &mut R,
format: Format,
) -> SerializationResult<Self>
fn load_from_reader<R: Read>( reader: &mut R, format: Format, ) -> SerializationResult<Self>
Source§impl StructSerializable for Adam
impl StructSerializable for Adam
Source§fn to_serializer(&self) -> StructSerializer
fn to_serializer(&self) -> StructSerializer
Convert Adam to StructSerializer for serialization
Serializes all optimizer state including configuration, parameter states, and global step count. Parameter linking is not serialized and must be done after deserialization.
§Returns
StructSerializer containing all serializable optimizer state
Source§fn from_deserializer(
deserializer: &mut StructDeserializer,
) -> SerializationResult<Self>
fn from_deserializer( deserializer: &mut StructDeserializer, ) -> SerializationResult<Self>
Create Adam from StructDeserializer
Reconstructs Adam optimizer from serialized state. Parameters must be
linked separately using add_parameter or add_parameters.
§Arguments
deserializer- StructDeserializer containing optimizer data
§Returns
Reconstructed Adam instance without parameter links, or error if deserialization fails