use ;
use crate*;
// ---------------------------------------------------------------------------
// Context tracking (real usage + estimates)
// ---------------------------------------------------------------------------
/// Tracks context size using real token counts from provider responses
/// combined with estimates for messages added after the last response.
///
/// This gives more accurate context size tracking than pure estimation,
/// since providers report actual token counts in their usage data.
///
/// # Example
///
/// ```rust
/// use phi_core::context::ContextTracker;
/// use phi_core::types::Usage;
///
/// let mut tracker = ContextTracker::new();
/// // After receiving an assistant response with usage data:
/// tracker.record_usage(&Usage { input: 1500, output: 200, ..Default::default() }, 3);
/// ```
/*
RUST QUIRK: Using `Option<usize>` for "not yet known" state
`last_usage_tokens: Option<usize>` means "either we have a real token count
(Some(n)), or we haven't received one yet (None)".
This is Rust's way of representing nullable data without null pointers.
There is no `null` in Rust: absence is spelled `None`, a variant of `Option<T>`.
The compiler forces you to handle both cases, which rules out null-pointer dereferences.
Python analogy: last_usage_tokens: Optional[int] = None
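
A minimal sketch of handling both cases. The `describe` and `base_tokens`
helpers are hypothetical illustrations, not part of this crate:

```rust
// Hypothetical helper: report whether a real token count is available.
// `match` must cover both variants or the code does not compile.
fn describe(last_usage_tokens: Option<usize>) -> String {
    match last_usage_tokens {
        Some(n) => format!("provider reported {n} tokens"),
        None => String::from("no usage reported yet"),
    }
}

// Hypothetical helper: fall back to 0 when no real count exists yet.
fn base_tokens(last_usage_tokens: Option<usize>) -> usize {
    last_usage_tokens.unwrap_or(0)
}
```

For example, `describe(None)` takes the `None` arm, and `base_tokens(Some(1700))`
returns the wrapped value unchanged.
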
The hybrid design strategy:
- After each LLM response, record the REAL token count from provider usage data
- For messages added after the last response, ESTIMATE with chars/4
- Combine: real_base + estimated_trailing = context size (exact base, approximate tail)
This beats pure estimation because real token counts account for:
- Unicode characters (multi-byte)
- Special tokens (BOS, EOS, system prompt formatting)
- Provider-specific tokenization differences
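
The hybrid strategy above could be sketched roughly as follows. This is an
illustration under assumptions, not the actual implementation: `HybridEstimate`,
`covered_messages`, `estimate_tokens`, and `context_tokens` are hypothetical
names, and messages are simplified to plain string slices.

```rust
// Hypothetical stand-in for the tracker state (names are assumptions).
struct HybridEstimate {
    // Real token count from the last provider response, if any.
    last_usage_tokens: Option<usize>,
    // How many leading messages that real count already covers.
    covered_messages: usize,
}

// Rough estimate: about four characters per token, rounded up.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

impl HybridEstimate {
    // Real base plus estimated trailing messages, or a pure estimate
    // when no provider usage has been recorded yet.
    fn context_tokens(&self, messages: &[&str]) -> usize {
        match self.last_usage_tokens {
            Some(real) => {
                let trailing: usize = messages
                    .iter()
                    .skip(self.covered_messages)
                    .map(|m| estimate_tokens(m))
                    .sum();
                real + trailing
            }
            None => messages.iter().map(|m| estimate_tokens(m)).sum(),
        }
    }
}
```

With a recorded count of 1700 tokens covering the first two messages, only the
third message is estimated; with no recorded count, every message falls back to
the chars/4 estimate.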
*/