Context window management — smart truncation and token counting.
Context management is the #1 engineering challenge for agents. This module provides:
- Token estimation (fast, no external deps)
- Tiered compaction (tool output truncation → turn summarization → full summary)
- Execution limits (max turns, tokens, duration)
Modeled on Claude Code’s approach: clear old tool outputs first, then summarize the conversation if needed.
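The tier ordering described above (cheap truncation first, heavier reduction only if still over budget) can be sketched as below. All names and thresholds here are illustrative assumptions, not this module's actual API:

```rust
// Illustrative sketch of tiered compaction. Names, thresholds, and
// the String-based message type are hypothetical simplifications.

fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4 // ~4 chars per token, rounded up
}

fn total_tokens(msgs: &[String]) -> usize {
    msgs.iter().map(|m| estimate_tokens(m)).sum()
}

/// Tier 1: clip oversized messages (stands in for clearing old tool
/// outputs). Tier 2: drop messages from the middle (stands in for
/// summarizing old turns). Stop as soon as the budget is met.
fn compact(mut msgs: Vec<String>, budget: usize) -> Vec<String> {
    if total_tokens(&msgs) > budget {
        for m in msgs.iter_mut() {
            if m.chars().count() > 200 {
                *m = m.chars().take(200).collect::<String>() + " [truncated]";
            }
        }
    }
    while total_tokens(&msgs) > budget && msgs.len() > 2 {
        let mid = msgs.len() / 2;
        msgs.remove(mid);
    }
    msgs
}

fn main() {
    let msgs = vec![
        "system prompt".to_string(),
        "x".repeat(4_000), // a huge old tool output
        "latest user message".to_string(),
    ];
    let compacted = compact(msgs, 300);
    assert!(total_tokens(&compacted) <= 300);
}
```

Applying the cheapest tier first preserves conversational structure: the system prompt and the most recent turns survive untouched unless truncation alone cannot meet the budget.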
Structs
- ContextConfig - Configuration for context management
- ContextTracker - Tracks context size using real token counts from provider responses combined with estimates for messages added after the last response.
- DefaultCompaction - Default 3-level compaction: truncate tool outputs → summarize turns → drop middle.
- ExecutionLimits - Execution limits for the agent loop
- ExecutionTracker - Tracks execution state against limits
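A limits/tracker pair like `ExecutionLimits` and `ExecutionTracker` might be used along these lines. Field and method names below are hypothetical, chosen only to show the check-after-each-turn pattern:

```rust
// Hypothetical sketch of execution-limit tracking; the real structs'
// fields and methods may differ.
use std::time::{Duration, Instant};

struct Limits {
    max_turns: u32,
    max_tokens: u64,
    max_duration: Duration,
}

struct Tracker {
    turns: u32,
    tokens: u64,
    started: Instant,
}

impl Tracker {
    fn new() -> Self {
        Tracker { turns: 0, tokens: 0, started: Instant::now() }
    }

    fn record_turn(&mut self, tokens_used: u64) {
        self.turns += 1;
        self.tokens += tokens_used;
    }

    /// The agent loop checks this after every turn and stops early
    /// when any one limit is hit.
    fn exceeded(&self, limits: &Limits) -> bool {
        self.turns >= limits.max_turns
            || self.tokens >= limits.max_tokens
            || self.started.elapsed() >= limits.max_duration
    }
}

fn main() {
    let limits = Limits {
        max_turns: 3,
        max_tokens: 10_000,
        max_duration: Duration::from_secs(300),
    };
    let mut t = Tracker::new();
    t.record_turn(2_000);
    t.record_turn(2_500);
    assert!(!t.exceeded(&limits));
    t.record_turn(1_000); // third turn hits max_turns
    assert!(t.exceeded(&limits));
}
```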
Traits
- CompactionStrategy - Strategy for compacting messages when context exceeds budget.
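A strategy trait of this shape lets callers swap in their own compaction policy (the default being the 3-level one above). The signature and the `KeepRecent` example below are assumptions for illustration, not the trait's real definition:

```rust
// Hypothetical shape of a compaction-strategy trait; the real trait's
// signature in this module may differ.
trait CompactionStrategy {
    /// Reduce `messages` until the estimate fits within `budget` tokens.
    fn compact(&self, messages: Vec<String>, budget: usize) -> Vec<String>;
}

struct KeepRecent;

impl CompactionStrategy for KeepRecent {
    fn compact(&self, mut messages: Vec<String>, budget: usize) -> Vec<String> {
        // Simplistic policy: drop the oldest message until we fit.
        // (A real strategy would clear tool outputs and summarize first.)
        while messages.iter().map(|m| m.len() / 4).sum::<usize>() > budget
            && messages.len() > 1
        {
            messages.remove(0);
        }
        messages
    }
}

fn main() {
    let msgs: Vec<String> = (0..10)
        .map(|i| format!("turn {}: {}", i, "x".repeat(400)))
        .collect();
    let out = KeepRecent.compact(msgs, 300);
    assert!(out.iter().map(|m| m.len() / 4).sum::<usize>() <= 300);
}
```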
Functions
- compact_messages - Compact messages to fit within the token budget using a tiered strategy.
- estimate_tokens - Rough token estimate: ~4 chars per token for English text. Good enough for context budgeting; use tiktoken-rs for precision.
- message_tokens - Estimate tokens for a single message.
- total_tokens - Estimate total tokens for a message list.
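The ~4-chars-per-token heuristic that `estimate_tokens` is documented to use can be sketched in one line; this is an approximation of the described behavior, not the module's actual code:

```rust
// ~4 chars per token is a common rough heuristic for English text.
// Rounding up avoids estimating short non-empty strings at 0 tokens.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

fn main() {
    let s = "The quick brown fox jumps over the lazy dog."; // 44 chars
    assert_eq!(estimate_tokens(s), 11);
}
```

For exact counts against a specific model's BPE vocabulary, a real tokenizer such as tiktoken-rs (mentioned above) is the precise but heavier option.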