Auto-compact: context window management for long conversations.
When the conversation approaches the context window limit, older messages are summarized to free space while preserving essential context.
Structs
- AutoCompactState - Session-level compaction tracking.
- CompactResult - Result of a compaction operation.
- MessageGroup - A semantically coherent group of messages for summarization.
Enums
- CompactTrigger - What triggered the compaction.
- TokenWarningState - Context window fullness level.
Constants
- AUTOCOMPACT_TRIGGER_FRACTION - Fraction of context window that triggers auto-compact.
- CRITICAL_PCT - Critical threshold (95% of context window).
- KEEP_RECENT_MESSAGES - Number of recent messages to always preserve (never compacted).
- MAX_CONSECUTIVE_FAILURES - Max consecutive failures before disabling auto-compact.
- WARNING_PCT - Warning threshold (80% of context window).
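As a rough illustration of how the warning and critical thresholds could map token usage to a fullness level, here is a minimal sketch. Only the 80% and 95% figures come from the constant descriptions above. The `Fullness` enum, its variant names, the `classify` function, and the `f64` representation of the thresholds are all hypothetical stand-ins, not the crate's actual definitions.

```rust
// Hypothetical stand-ins: the real constants' types are not shown on this page.
const WARNING_PCT: f64 = 0.80; // warning threshold (80% of context window)
const CRITICAL_PCT: f64 = 0.95; // critical threshold (95% of context window)

// Hypothetical enum; the real TokenWarningState variants may differ.
#[derive(Debug, PartialEq)]
enum Fullness {
    Ok,
    Warning,
    Critical,
}

// Classify current usage against the two thresholds.
fn classify(used_tokens: usize, context_window: usize) -> Fullness {
    // Treat a zero-sized window as already full to avoid dividing by zero.
    if context_window == 0 {
        return Fullness::Critical;
    }
    let fraction = used_tokens as f64 / context_window as f64;
    if fraction >= CRITICAL_PCT {
        Fullness::Critical
    } else if fraction >= WARNING_PCT {
        Fullness::Warning
    } else {
        Fullness::Ok
    }
}

fn main() {
    // 170_000 of 200_000 tokens is 85% of the window.
    println!("{:?}", classify(170_000, 200_000));
}
```

Checking the critical threshold first keeps the branches mutually exclusive without needing range comparisons.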
Functions
- auto_compact_if_needed - Check and run auto-compact if needed. Returns None if no compaction needed.
- calculate_messages_to_keep_index - Calculate how many messages to keep given a token budget.
- calculate_token_warning_state - Calculate the token warning state given current usage.
- collapse_read_tool_results - Collapse repeated file read results: if the same file is read multiple times, only keep the latest result.
- compact_conversation - Compact the conversation by summarizing older messages.
- context_window_for_model - Get context window size for a model.
- estimate_messages_tokens - Estimate tokens for a list of messages.
- estimate_tokens - Rough token estimate for a message (~4 chars per token).
- format_compact_summary - Format raw compact output into a summary message.
- get_compact_prompt - Build the compaction prompt for the LLM.
- group_messages_for_compact - Group messages into semantically coherent chunks at API-round boundaries. Each group = one assistant response + its tool results.
- should_auto_compact - Check if auto-compact should run (considering state/circuit breaker).
- should_compact - Check if compaction should trigger.
- should_context_collapse - Check if context collapse is needed (emergency, >98%).
- snip_compact - Remove oldest messages, keeping only the newest keep_n. Returns (remaining messages, estimated tokens freed).
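The token estimate and the snip operation described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's implementation: messages are modeled as plain `String`s, the `_sketch` function names and their signatures are hypothetical, and rounding up in the ~4 chars per token estimate is a choice made here.

```rust
// Hypothetical: a message is modeled as a plain String.

// Rough estimate at ~4 characters per token, rounded up.
fn estimate_tokens_sketch(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

// Sum the per-message estimates.
fn estimate_messages_tokens_sketch(messages: &[String]) -> usize {
    messages.iter().map(|m| estimate_tokens_sketch(m)).sum()
}

// Drop the oldest messages, keeping only the newest `keep_n`.
// Returns (remaining messages, estimated tokens freed).
fn snip_compact_sketch(mut messages: Vec<String>, keep_n: usize) -> (Vec<String>, usize) {
    // Index of the first message to keep; 0 when keep_n covers everything.
    let cut = messages.len().saturating_sub(keep_n);
    // After split_off, `messages` holds the dropped prefix and
    // `remaining` holds the newest `keep_n` entries.
    let remaining = messages.split_off(cut);
    let freed = estimate_messages_tokens_sketch(&messages);
    (remaining, freed)
}

fn main() {
    let history: Vec<String> = ["aaaaaaaa", "bbbb", "cc", "d"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let (remaining, freed) = snip_compact_sketch(history, 2);
    println!("kept {} messages, freed ~{} tokens", remaining.len(), freed);
}
```

Using `saturating_sub` means a `keep_n` larger than the history simply keeps everything and frees nothing.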