Module compact


Auto-compact: context window management for long conversations.

When the conversation approaches the context window limit, older messages are summarized to free space while preserving essential context.

Constants§

AUTOCOMPACT_TRIGGER_FRACTION: Fraction of context window that triggers auto-compact.
CRITICAL_PCT: Critical threshold (95% of context window).
KEEP_RECENT_MESSAGES: Number of recent messages to always preserve (never compacted).
MAX_CONSECUTIVE_FAILURES: Max consecutive failures before disabling auto-compact.
WARNING_PCT: Warning threshold (80% of context window).
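The two documented thresholds can be read as a simple classification of current usage. The following is a minimal sketch, not the module's implementation: the `f64` representation of the constants, the `WarningLevel` enum, and the `classify_usage` function are all hypothetical, and treating the thresholds as inclusive is an assumption. Only the 80% and 95% values come from the descriptions above.

```rust
// Assumed representation: the docs give the percentages, not the constant types.
const WARNING_PCT: f64 = 0.80;
const CRITICAL_PCT: f64 = 0.95;

/// Hypothetical warning levels; the module's real state type is not shown here.
#[derive(Debug, PartialEq)]
enum WarningLevel {
    Normal,
    Warning,
    Critical,
}

/// Classify token usage against the documented thresholds.
/// Thresholds are treated as inclusive, which is an assumption.
fn classify_usage(used_tokens: usize, context_window: usize) -> WarningLevel {
    // An empty window cannot hold anything, so report the most severe level.
    if context_window == 0 {
        return WarningLevel::Critical;
    }
    let fraction = used_tokens as f64 / context_window as f64;
    if fraction >= CRITICAL_PCT {
        WarningLevel::Critical
    } else if fraction >= WARNING_PCT {
        WarningLevel::Warning
    } else {
        WarningLevel::Normal
    }
}

fn main() {
    // Usage at 25%, 85%, and 97.5% of a 200k-token window.
    for used in [50_000, 170_000, 195_000] {
        println!("{} tokens: {:?}", used, classify_usage(used, 200_000));
    }
}
```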

Functions§

auto_compact_if_needed: Check whether auto-compact is needed and run it if so. Returns None if no compaction is needed.
calculate_messages_to_keep_index: Calculate how many messages to keep given a token budget.
calculate_token_warning_state: Calculate the token warning state given current usage.
collapse_read_tool_results: Collapse repeated file read results: if the same file is read multiple times, only keep the latest result.
compact_conversation: Compact the conversation by summarizing older messages.
context_window_for_model: Get context window size for a model.
estimate_messages_tokens: Estimate tokens for a list of messages.
estimate_tokens: Rough token estimate for a message (~4 chars per token).
format_compact_summary: Format raw compact output into a summary message.
get_compact_prompt: Build the compaction prompt for the LLM.
group_messages_for_compact: Group messages into semantically coherent chunks at API-round boundaries. Each group consists of one assistant response and its tool results.
should_auto_compact: Check if auto-compact should run (considering state/circuit breaker).
should_compact: Check if compaction should trigger.
should_context_collapse: Check if context collapse is needed (emergency, >98%).
snip_compact: Remove oldest messages, keeping only the newest keep_n. Returns (remaining messages, estimated tokens freed).
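To illustrate how the token estimate and `snip_compact` fit together, here is a minimal sketch under stated assumptions. The `Message` struct, the `msg` helper, and every signature below are hypothetical stand-ins, since this page does not show the real types. Rounding the estimate up is also an assumption; the description only says roughly 4 characters per token.

```rust
/// Hypothetical message type; the module's real message type is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Hypothetical helper for building messages in examples.
fn msg(role: &str, content: &str) -> Message {
    Message { role: role.to_string(), content: content.to_string() }
}

/// Rough estimate at about 4 characters per token, rounded up (assumption).
fn estimate_tokens(message: &Message) -> usize {
    (message.content.chars().count() + 3) / 4
}

/// Sum of the per-message estimates.
fn estimate_messages_tokens(messages: &[Message]) -> usize {
    messages.iter().map(estimate_tokens).sum()
}

/// Drop the oldest messages, keeping only the newest `keep_n`.
/// Returns the remaining messages and the estimated tokens freed.
fn snip_compact(mut messages: Vec<Message>, keep_n: usize) -> (Vec<Message>, usize) {
    // Number of messages to drop from the front; zero if keep_n covers all of them.
    let drop_count = messages.len().saturating_sub(keep_n);
    let freed = estimate_messages_tokens(&messages[..drop_count]);
    // split_off returns the tail [drop_count, len), which is the newest part.
    let remaining = messages.split_off(drop_count);
    (remaining, freed)
}

fn main() {
    let history = vec![
        msg("user", "aaaaaaaa"),
        msg("assistant", "hello"),
        msg("user", "ok"),
    ];
    let (remaining, freed) = snip_compact(history, 1);
    for m in &remaining {
        println!("kept {}: {}", m.role, m.content);
    }
    println!("estimated tokens freed: {}", freed);
}
```

A character-count heuristic like this is cheap enough to run on every turn, at the cost of accuracy compared with a real tokenizer.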