//! Mixed-precision configuration for intermediate calculations.

/// Compute precision for intermediate calculations with reduced-precision types.
///
/// When operating on reduced-precision types (F16, BF16, FP8), values are typically
/// converted to a higher-precision format for computation, then converted back.
/// This allows trading off speed against precision.
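///
/// A minimal sketch of that round trip, assuming the `half` crate for the
/// `bf16` type (the function is illustrative, not part of this API):
///
/// ```ignore
/// use half::bf16;
///
/// // Multiply two bf16 values, computing in f32 so that only the
/// // final result is rounded back down to bf16.
/// fn mul_in_f32(a: bf16, b: bf16) -> bf16 {
///     bf16::from_f32(a.to_f32() * b.to_f32())
/// }
/// ```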
///
/// # Precision Comparison
///
/// | Precision | Decimal Digits | Speed | Use Case |
/// |-----------|----------------|---------|----------|
/// | **F64** | ~15-16 | Slowest | Scientific computing requiring maximum precision |
/// | **F32** | ~7 | Medium | High-precision ML, when BF16 isn't enough |
/// | **BF16** | ~2-3 | Fastest | ML training/inference (default, industry standard) |
///
/// # Applicability
///
/// - **FP8**: Always needs upcasting (8-bit storage, compute in BF16, F32, or F64)
/// - **F16/BF16**: Can optionally upcast to F32/F64 for higher precision
/// - **F32**: Can upcast to F64 for scientific computing (see the sketch below)
/// - **F64**: No upcasting needed (already the highest precision)
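///
/// The F32-to-F64 case needs no special types; for example, accumulating a long
/// sum in `f64` keeps rounding error from growing with the input length (a
/// plain-Rust sketch, independent of this crate):
///
/// ```
/// fn sum_in_f64(xs: &[f32]) -> f32 {
///     // Upcast each element, accumulate in f64, round back once at the end.
///     xs.iter().map(|&x| x as f64).sum::<f64>() as f32
/// }
/// assert_eq!(sum_in_f64(&[1.0; 4]), 4.0);
/// ```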
///
/// # Resolution Order
///
/// `per-operation > tensor-level > client default`
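///
/// A sketch of that resolution (the names below are illustrative, not this
/// crate's exact API):
///
/// ```ignore
/// // `ComputePrecision` stands in for this crate's precision enum.
/// fn resolve(
///     per_op: Option<ComputePrecision>,
///     per_tensor: Option<ComputePrecision>,
///     client_default: ComputePrecision,
/// ) -> ComputePrecision {
///     // The first explicit setting wins, falling back to the client default.
///     per_op.or(per_tensor).unwrap_or(client_default)
/// }
/// ```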
///
/// # Default
///
/// BF16 is the default: it is fast and, because it keeps F32's 8 exponent bits,
/// it has the same dynamic range as F32. This is the industry standard for
/// mixed-precision ML training.
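///
/// That shared dynamic range is what makes BF16 robust in practice: values that
/// overflow F16's maximum of 65504 survive a BF16 round trip. A sketch, again
/// assuming the `half` crate:
///
/// ```ignore
/// use half::{bf16, f16};
///
/// let big = 1.0e20_f32;
/// assert!(bf16::from_f32(big).is_finite());   // within bf16's F32-like range
/// assert!(f16::from_f32(big).is_infinite());  // overflows f16 (max ~65504)
/// ```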