pub fn needs_compression(
    conv: &Conversation,
    system_prompt_tokens: usize,
    token_budget: usize,
) -> bool
Checks whether the conversation context needs compression.
The threshold is derived from auto_compact_threshold: compression fires when
fewer than a buffer's worth of tokens remain (5K for windows ≤100K, 13K for windows >100K).
The buffer scales with the deployment: a self-hosted GLM with a 65K window trips
at 60K (4K runway is plenty for one round), while Anthropic with a 200K window
trips at 187K, matching CC’s behaviour.
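The arithmetic above can be sketched as follows. This is only an illustration of the stated numbers, not the crate's implementation: buffer_tokens and compaction_threshold are hypothetical helper names, and the real auto_compact_threshold may have a different signature and use exact window sizes rather than the rounded figures quoted here.

```rust
/// Hypothetical helper: tokens to keep free for a given context window.
/// Assumes the two tiers quoted in the description (5K up to 100K, 13K above).
fn buffer_tokens(token_budget: usize) -> usize {
    if token_budget <= 100_000 {
        5_000
    } else {
        13_000
    }
}

/// Hypothetical helper: usage level at which compression would trip.
/// saturating_sub keeps the result at 0 for budgets smaller than the buffer.
fn compaction_threshold(token_budget: usize) -> usize {
    token_budget.saturating_sub(buffer_tokens(token_budget))
}
```

Under these assumptions, compaction_threshold(65_000) is 60_000 and compaction_threshold(200_000) is 187_000, which matches the two deployments described above.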
The messages.len() < 12 guard stays: compression is only worthwhile once there
is a non-trivial backlog, and because a single user message can produce 15+
messages, message count (rather than user turns) is the right unit for that check.
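A minimal sketch of how the guard and the buffer check might combine is shown below. Everything here is an assumption made for illustration: Message and Conversation are stand-in types with a made-up per-message tokens field, needs_compression_sketch is a hypothetical name, and the real function may count tokens differently.

```rust
/// Hypothetical stand-in types; the real Conversation is not shown in the docs.
struct Message {
    tokens: usize, // assumed pre-computed token estimate per message
}

struct Conversation {
    messages: Vec<Message>,
}

/// Minimum backlog before compression is considered (from the guard above).
const MIN_MESSAGES: usize = 12;

/// Assumed buffer tiers: 5K for windows up to 100K, 13K above.
fn buffer_tokens(token_budget: usize) -> usize {
    if token_budget <= 100_000 { 5_000 } else { 13_000 }
}

/// Sketch only: returns true when the backlog is large enough and
/// fewer than `buffer_tokens` tokens remain in the window.
fn needs_compression_sketch(
    conv: &Conversation,
    system_prompt_tokens: usize,
    token_budget: usize,
) -> bool {
    // Guard: a short conversation is never compressed, whatever its size.
    if conv.messages.len() < MIN_MESSAGES {
        return false;
    }
    let used = system_prompt_tokens
        + conv.messages.iter().map(|m| m.tokens).sum::<usize>();
    let remaining = token_budget.saturating_sub(used);
    remaining < buffer_tokens(token_budget)
}
```

In this sketch the guard is checked first, so a conversation with 11 very large messages still returns false; whether the real function orders the checks this way is not stated in the description.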