//! Checkpoint management
use TrainingMetrics;
use ;
use Uuid;
use ;
/// Checkpoint of training state.
///
/// # Serialization Format
///
/// With the `serde` feature enabled, a checkpoint serialized to JSON has the following structure:
///
/// ```json
/// {
/// "id": "550e8400-e29b-41d4-a716-446655440000",
/// "epoch": 10,
/// "global_step": 5000,
/// "metrics": {
/// "final_train_loss": 0.05,
/// "final_val_loss": 0.06,
/// "epochs_completed": 10,
/// "total_steps": 5000,
/// "best_epoch": 8,
/// "best_val_loss": 0.055
/// },
/// "created_at": "2024-01-15T10:30:00Z",
/// "weights": "<base64-encoded bytes>",
/// "optimizer_state": "<base64-encoded bytes>"
/// }
/// ```
///
/// ## Field Details
///
/// | Field | Type | Description |
/// |-------|------|-------------|
/// | `id` | UUID v4 | Unique identifier for this checkpoint |
/// | `epoch` | usize | Training epoch when created (0-indexed) |
/// | `global_step` | usize | Total batches processed |
/// | `metrics` | TrainingMetrics | Training statistics at checkpoint time |
/// | `created_at` | ISO 8601 | UTC timestamp of creation |
/// | `weights` | bytes | Serialized model parameters |
/// | `optimizer_state` | bytes | Serialized optimizer momentum/state |
///
/// ## Weight Serialization
///
/// The `weights` field contains model parameters serialized as:
/// 1. Little-endian f32 values concatenated
/// 2. Layer order: input-to-output (layer 0 weights, layer 0 biases, layer 1...)
/// 3. Weight matrices in row-major order
///
/// For a network with layers [4->8, 8->2], the weights are laid out as follows
/// (58 values, 232 bytes in total):
/// ```text
/// [layer0_weights: 32 floats][layer0_biases: 8 floats][layer1_weights: 16 floats][layer1_biases: 2 floats]
/// ```
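///
/// The layout above can be sketched as follows. This is a minimal illustration, not the
/// crate's implementation: `encode_f32_le`, `decode_f32_le`, and `param_count` are
/// hypothetical helpers, and layers are assumed to be dense `(inputs, outputs)` pairs.
///
/// ```rust
/// // Hypothetical helper: concatenate f32 values as little-endian bytes.
/// fn encode_f32_le(values: &[f32]) -> Vec<u8> {
///     values.iter().flat_map(|v| v.to_le_bytes()).collect()
/// }
///
/// // Hypothetical helper: decode little-endian bytes back into f32 values.
/// // Returns None if the byte length is not a multiple of 4.
/// fn decode_f32_le(bytes: &[u8]) -> Option<Vec<f32>> {
///     if bytes.len() % 4 != 0 {
///         return None;
///     }
///     Some(
///         bytes
///             .chunks_exact(4)
///             .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
///             .collect(),
///     )
/// }
///
/// // Number of f32 values for dense layers given as (inputs, outputs) pairs:
/// // one weight matrix plus one bias vector per layer.
/// fn param_count(layers: &[(usize, usize)]) -> usize {
///     layers.iter().map(|&(inputs, outputs)| inputs * outputs + outputs).sum()
/// }
/// ```
///
/// Comparing the decoded length against `param_count` before loading is one way to catch
/// a checkpoint that was saved for a different architecture.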
///
/// ## Optimizer State Serialization
///
/// The `optimizer_state` field contains optimizer-specific state:
///
/// - **SGD with momentum**: Velocity vectors matching weight dimensions
/// - **Adam**: First moment (m) + second moment (v) for each parameter
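///
/// The sizes implied by the two cases above can be sketched as follows. `OptimizerKind`
/// and `state_value_count` are hypothetical names used only for illustration, and the
/// sketch assumes the state holds nothing beyond the vectors listed (for example, no
/// step counter); the actual byte encoding is not specified here.
///
/// ```rust
/// // Hypothetical enum; the crate's real optimizer types may differ.
/// #[derive(Clone, Copy, Debug, PartialEq)]
/// enum OptimizerKind {
///     SgdMomentum,
///     Adam,
/// }
///
/// // Number of state values for a model with `param_count` parameters:
/// // one velocity per parameter for SGD with momentum, and two moments
/// // (m and v) per parameter for Adam.
/// fn state_value_count(kind: OptimizerKind, param_count: usize) -> usize {
///     match kind {
///         OptimizerKind::SgdMomentum => param_count,
///         OptimizerKind::Adam => 2 * param_count,
///     }
/// }
/// ```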
///
/// ## Loading Checkpoints
///
/// ```ignore
/// use lattice_tune::Checkpoint;
///
/// // Load the checkpoint from a JSON file
/// let json = std::fs::read_to_string("checkpoint_epoch_0010_step_00005000.json")?;
/// let checkpoint: Checkpoint = serde_json::from_str(&json)?;
///
/// // Restore model weights and optimizer state
/// model.load_weights(&checkpoint.weights);
/// optimizer.load_state(&checkpoint.optimizer_state);
/// ```
///
/// ## Checkpoint Naming Convention
///
/// Recommended file name pattern: `checkpoint_epoch_{epoch:04}_step_{step:08}.json`,
/// with the epoch zero-padded to 4 digits and the global step to 8 digits.
///
/// Example: `checkpoint_epoch_0010_step_00005000.json`
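///
/// A minimal sketch of building and parsing names that follow this convention.
/// `checkpoint_file_name` and `parse_checkpoint_file_name` are hypothetical helpers,
/// not functions provided by this crate.
///
/// ```rust
/// // Hypothetical helper: build a file name following the recommended convention.
/// fn checkpoint_file_name(epoch: usize, global_step: usize) -> String {
///     format!("checkpoint_epoch_{:04}_step_{:08}.json", epoch, global_step)
/// }
///
/// // Hypothetical helper: recover (epoch, global_step) from such a file name.
/// // Returns None if the name does not match the convention.
/// fn parse_checkpoint_file_name(name: &str) -> Option<(usize, usize)> {
///     let rest = name.strip_prefix("checkpoint_epoch_")?.strip_suffix(".json")?;
///     let (epoch, step) = rest.split_once("_step_")?;
///     Some((epoch.parse::<usize>().ok()?, step.parse::<usize>().ok()?))
/// }
/// ```
///
/// Zero-padding keeps a lexicographic directory listing in training order, as long as
/// the epoch and step fit within their padded widths.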