Context compaction module.
Handles automatic context compaction when the conversation gets too long. This includes token threshold detection, summary generation, and message management.
Re-exports§
pub use crate::services::token_estimation::rough_token_count_estimation;
pub use crate::services::token_estimation::rough_token_count_estimation_for_content;
pub use crate::services::token_estimation::rough_token_count_estimation_for_message;
Modules§
- compact_errors - Compact command error messages
Structs§
- CompactCommand - Compact command configuration
- CompactionResult - Compact result containing the new messages after compaction
- FileReadState - Post-compact restore state: tracks recently accessed files for restoration
- PostCompactRestore - Post-compact file restore result
- TokenWarningState - Token warning state. Translated from: calculateTokenWarningState in autoCompact.ts
Constants§
- AUTOCOMPACT_BUFFER_TOKENS - Buffer tokens for auto-compact trigger
- DEFAULT_CONTEXT_WINDOW - Default context window sizes by model (in tokens)
- ERROR_THRESHOLD_BUFFER_TOKENS - Buffer tokens for error threshold
- MANUAL_COMPACT_BUFFER_TOKENS - Manual compact uses smaller buffer (more aggressive)
- MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES - Maximum consecutive auto-compact failures before giving up
- MAX_OUTPUT_TOKENS_FOR_SUMMARY - Reserve tokens for output during compaction. Based on p99.99 of compact summary output.
- POST_COMPACT_MAX_FILES_TO_RESTORE - Post-compaction: max files to restore
- POST_COMPACT_MAX_TOKENS_PER_FILE - Post-compaction: max tokens per file
- POST_COMPACT_MAX_TOKENS_PER_SKILL - Post-compaction: max tokens per skill
- POST_COMPACT_SKILLS_TOKEN_BUDGET - Post-compaction: skills token budget
- POST_COMPACT_TOKEN_BUDGET - Post-compaction: token budget for restored files
- SKILL_TRUNCATION_MARKER - Marker appended when a skill is truncated for post-compact restore.
- WARNING_THRESHOLD_BUFFER_TOKENS - Buffer tokens for warning threshold
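The buffer constants above suggest how a compaction trigger can be derived from a context window. The following is only a sketch under stated assumptions: the numeric values are placeholders (this page does not list the real ones), the helper names are hypothetical, and subtracting the auto-compact buffer from the effective window is an assumed relationship rather than the crate's confirmed logic. Only "effective window = total - output reserve" is stated on this page.

```rust
// Placeholder values for illustration only; the crate's actual constant
// values are not shown on this page.
const MAX_OUTPUT_TOKENS_FOR_SUMMARY: usize = 20_000;
const AUTOCOMPACT_BUFFER_TOKENS: usize = 13_000;

/// Effective window = total window minus the output reserve.
fn effective_context_window(total: usize) -> usize {
    total.saturating_sub(MAX_OUTPUT_TOKENS_FOR_SUMMARY)
}

/// Assumed relationship: the trigger sits one buffer below the effective window.
fn auto_compact_threshold(total: usize) -> usize {
    effective_context_window(total).saturating_sub(AUTOCOMPACT_BUFFER_TOKENS)
}

/// Hypothetical check: compact once usage reaches the threshold.
fn needs_compaction(used_tokens: usize, total: usize) -> bool {
    used_tokens >= auto_compact_threshold(total)
}

fn main() {
    let total = 200_000; // placeholder context window size
    println!("effective window: {}", effective_context_window(total));
    println!("auto-compact threshold: {}", auto_compact_threshold(total));
    println!("compact at 170k used? {}", needs_compaction(170_000, total));
}
```

Saturating subtraction keeps the sketch from underflowing if a configured window is smaller than the reserves.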
Functions§
- calculate_token_warning_state
- collect_read_tool_file_paths - Collect file paths from Read tool results in preserved messages. Returns paths that are already visible and don’t need restoration.
- create_post_compact_file_attachments - Create post-compact file restore attachments.
- create_post_compact_skill_attachments - Create post-compact skill restore attachments.
- estimate_token_count - Estimate token count for messages (rough estimation). Uses 4 chars per token for regular text (matching the original TypeScript) and 2 chars per token for tool results (JSON tokenizes less efficiently). Takes an optional max_output_tokens to ensure room is left for the response.
- get_auto_compact_threshold - Get the auto-compact threshold (when to trigger compaction)
- get_blocking_limit - Get the blocking limit (when to block further input)
- get_compact_command - Get the compact command
- get_compact_prompt - Get the prompt for generating the conversation summary. Translated from: getCompactPrompt in prompt.ts
- get_context_window_for_model - Get context window size for a model
- get_default_context_window - Get default context window from environment or use default
- get_effective_context_window_size - Get effective context window size (total - output reserve). TS: autoCompact.ts getEffectiveContextWindowSize
- should_compact - Check if conversation should be compacted
- strip_images_from_messages - Strip images from messages before sending for compaction. Images are replaced with [image] text markers and documents with [document] markers, to prevent the compaction API from hitting prompt-too-long.
- strip_reinjected_attachments - Strip reinjected attachments (skill_discovery/skill_listing) that will be re-injected post-compaction anyway.
- truncate_messages_for_summary - Truncate messages to fit within a safe token limit for summarization. This is used when the conversation is too large to fit in context. Skips ALL system messages (they contain huge compaction summaries). Returns (truncated_messages, estimated_tokens).
- truncate_to_tokens - Truncate content to roughly max_tokens, keeping the head. rough_token_count_estimation uses ~4 chars/token, so char budget = max_tokens * 4.
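The character-per-token ratios described for estimate_token_count and truncate_to_tokens can be illustrated with a small standalone sketch. The function names below are hypothetical and do not reproduce the crate's signatures; rounding up and counting Unicode scalar values rather than bytes are assumptions made for this example.

```rust
/// Rough estimate for regular text: about 4 characters per token.
/// Rounding up is an assumption of this sketch.
fn rough_text_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Rough estimate for tool results: about 2 characters per token.
fn rough_tool_result_tokens(json: &str) -> usize {
    (json.chars().count() + 1) / 2
}

/// Keep the head of `content`, using a char budget of max_tokens * 4.
/// Iterating over `char`s avoids cutting inside a UTF-8 sequence.
fn truncate_head_to_tokens(content: &str, max_tokens: usize) -> String {
    let char_budget = max_tokens.saturating_mul(4);
    content.chars().take(char_budget).collect()
}

fn main() {
    let text = "a".repeat(1_000);
    println!("as text: {} tokens", rough_text_tokens(&text));
    println!("as tool result: {} tokens", rough_tool_result_tokens(&text));

    let head = truncate_head_to_tokens(&text, 50);
    println!("kept {} chars", head.chars().count());
}
```

Because the same 4 chars/token ratio drives both the estimate and the truncation budget, the truncated head estimates back to at most max_tokens under this sketch.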