//! Phase-based training with prediction and correction cycles.
//!
//! Full gradient computation is expensive. This module alternates between
//! three kinds of phases:
//! 1. Full training phases (accurate gradients)
//! 2. Predicted phases (fast, approximate gradients)
//! 3. Correction phases (fix accumulated errors)
//!
//! The goal is convergence similar to full training at a significantly lower
//! compute cost.
//!
//! # Training Modes
//!
//! ## Deterministic Mode (Recommended)
//!
//! Uses [`DeterministicPhaseTrainer`] with a weighted least-squares gradient
//! model. It guarantees reproducible predictions and includes residual tracking.
//!
//! ```text
//! WARMUP (W steps) → FULL (N steps) → PREDICT (M steps) → CORRECT → repeat
//! ```
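//!
//! As a minimal sketch of how a step index could map onto the cycle above,
//! the following uses a hypothetical `Phase` enum and `phase_at` function
//! (neither is part of this module's API). It assumes that warmup runs once
//! and that `CORRECT` lasts a single step; the actual trainer may differ.
//!
//! ```rust
//! // Hypothetical phase labels mirroring the diagram; not this module's API.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Phase {
//!     Warmup,
//!     Full,
//!     Predict,
//!     Correct,
//! }
//!
//! // Maps a global step index to a phase.
//! // Assumptions: warmup happens once at the start, and each cycle is
//! // `full` full-gradient steps, `predict` predicted steps, then exactly
//! // one correction step.
//! fn phase_at(step: usize, warmup: usize, full: usize, predict: usize) -> Phase {
//!     if step < warmup {
//!         return Phase::Warmup;
//!     }
//!     let cycle_len = full + predict + 1; // +1 for the single correction step
//!     let pos = (step - warmup) % cycle_len;
//!     if pos < full {
//!         Phase::Full
//!     } else if pos < full + predict {
//!         Phase::Predict
//!     } else {
//!         Phase::Correct
//!     }
//! }
//! ```
//!
//! Keeping the schedule a pure function of the step index is one way to make
//! the phase sequence reproducible across runs.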
//!
//! ## Legacy Mode
//!
//! Uses [`PhaseTrainer`] with momentum-based extrapolation. It is faster, but
//! less accurate and not fully deterministic.
//!
//! # Determinism Guarantees
//!
//! With `DeterministicPhaseTrainer`:
//! - Same seed + same data = identical training trajectory
//! - No stochastic operations in prediction
//! - Residuals ensure convergence to actual gradients
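//!
//! To illustrate why a least-squares prediction involves no stochastic
//! operations, here is a sketch of a weighted linear fit over a scalar
//! series (for example, one gradient statistic per step). The `wls_line` and
//! `predict` helpers are hypothetical and simplified; the model inside
//! `DeterministicPhaseTrainer` may fit different quantities or features.
//!
//! ```rust
//! // Fits y ≈ a + b * t by weighted least squares and returns (a, b).
//! // Returns None for mismatched lengths, non-positive total weight, or
//! // when all t values coincide (slope undefined).
//! // Hypothetical helper; not this module's API.
//! fn wls_line(ts: &[f64], ys: &[f64], ws: &[f64]) -> Option<(f64, f64)> {
//!     if ts.len() != ys.len() || ts.len() != ws.len() {
//!         return None;
//!     }
//!     let sw: f64 = ws.iter().sum();
//!     if sw <= 0.0 {
//!         return None;
//!     }
//!     // Weighted means of t and y.
//!     let mt = ts.iter().zip(ws).map(|(t, w)| w * t).sum::<f64>() / sw;
//!     let my = ys.iter().zip(ws).map(|(y, w)| w * y).sum::<f64>() / sw;
//!     // Weighted variance of t and covariance of (t, y), both unnormalized.
//!     let mut sxx = 0.0;
//!     let mut sxy = 0.0;
//!     for i in 0..ts.len() {
//!         let dt = ts[i] - mt;
//!         sxx += ws[i] * dt * dt;
//!         sxy += ws[i] * dt * (ys[i] - my);
//!     }
//!     if sxx.abs() < f64::EPSILON {
//!         return None;
//!     }
//!     let b = sxy / sxx;
//!     let a = my - b * mt;
//!     Some((a, b))
//! }
//!
//! // Extrapolates the fitted line to step `t`.
//! fn predict(a: f64, b: f64, t: f64) -> f64 {
//!     a + b * t
//! }
//! ```
//!
//! The fit is plain arithmetic evaluated in a fixed order, so identical
//! inputs give identical predictions on the same platform.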
pub use ;
pub use ;