pub struct TokenBudget {
pub max_context_tokens: u32,
pub max_output_tokens: u32,
pub strategy: BudgetStrategy,
pub safety_margin: u32,
pub compression_trigger_percent: u8,
pub compression_target_percent: u8,
pub working_reserve_tokens: u32,
pub fallback_trigger_percent: u8,
pub prompt_cache_min_tool_output_chars: u32,
pub prompt_cache_head_chars: u32,
pub prompt_cache_tail_chars: u32,
pub prompt_cache_recent_user_turns: u8,
pub prompt_cache_recent_tool_chains: u8,
pub max_tool_output_tokens: u32,
}
Token budget configuration for a conversation.
Fields
max_context_tokens: u32
Maximum context window size for the model (input + output).
max_output_tokens: u32
Maximum tokens reserved for model output.
strategy: BudgetStrategy
Budget enforcement strategy.
safety_margin: u32
Safety margin for tokenizer estimation errors.
compression_trigger_percent: u8
Proactive compression trigger threshold, as a percentage of context window tokens.
Legacy field: only used when working_reserve_tokens == 0.
compression_target_percent: u8
Compression target threshold, as a percentage of context window tokens.
working_reserve_tokens: u32
Fixed number of tokens reserved for model reasoning and output.
Compression triggers when (max_context_tokens - working_reserve_tokens) is
exceeded. This provides consistent reasoning space regardless of model context
window size (Claude Code uses ~50K).
When working_reserve_tokens == 0, triggering falls back to the percentage-based
compression_trigger_percent for backward compatibility.
fallback_trigger_percent: u8
Fallback percentage trigger, used when the context window is too small for the fixed
reserve (i.e. max_context_tokens < working_reserve_tokens * 2).
prompt_cache_min_tool_output_chars: u32
Minimum tool output length (chars) required before prompt-side cache compaction applies.
prompt_cache_head_chars: u32
Leading excerpt length (chars) kept in cached tool output summaries.
prompt_cache_tail_chars: u32
Trailing excerpt length (chars) kept in cached tool output summaries.
prompt_cache_recent_user_turns: u8
Number of latest user turns protected from prompt-side cache compaction.
prompt_cache_recent_tool_chains: u8
Number of latest tool call chains protected from prompt-side cache compaction.
max_tool_output_tokens: u32
Maximum tokens allowed per single tool output. A value of 0 means only the byte-based fallback is used.
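The three prompt_cache_*_chars fields describe a head-and-tail excerpt of long tool outputs. The sketch below is one plausible reading of that behaviour, not the crate's actual implementation: the function name compact_tool_output, the omission marker text, and the rule for overlapping excerpts are all assumptions made for illustration.

```rust
/// Hypothetical helper (not part of the documented API): keeps the first
/// `head_chars` and last `tail_chars` characters of a long tool output.
fn compact_tool_output(output: &str, min_chars: u32, head_chars: u32, tail_chars: u32) -> String {
    // Count characters, not bytes, since the fields are documented in chars.
    let chars: Vec<char> = output.chars().collect();
    let (min, head, tail) = (min_chars as usize, head_chars as usize, tail_chars as usize);

    // Leave the output untouched if it is below the minimum length, or if
    // the two excerpts would already cover it (assumed rule).
    if chars.len() < min || head + tail >= chars.len() {
        return output.to_string();
    }

    let omitted = chars.len() - head - tail;
    let head_part: String = chars[..head].iter().collect();
    let tail_part: String = chars[chars.len() - tail..].iter().collect();
    // The marker format is invented for this sketch.
    format!("{head_part}\n[... {omitted} chars omitted ...]\n{tail_part}")
}

fn main() {
    let output = format!("{}{}{}", "a".repeat(10), "b".repeat(30), "c".repeat(10));
    println!("{}", compact_tool_output(&output, 40, 10, 10));
}
```

How the real compaction interacts with prompt_cache_recent_user_turns and prompt_cache_recent_tool_chains is not shown here; those fields only say which recent turns and tool chains are exempt.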
Implementations
impl TokenBudget
pub fn new(max_context_tokens: u32, max_output_tokens: u32, strategy: BudgetStrategy) -> TokenBudget
pub fn with_safety_margin(max_context_tokens: u32, max_output_tokens: u32, strategy: BudgetStrategy, safety_margin: u32) -> TokenBudget
pub fn compression_trigger_context_tokens(&self) -> u32
pub fn compression_target_context_tokens(&self) -> u32
pub fn for_model(max_context_tokens: u32) -> TokenBudget
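The field descriptions suggest how compression_trigger_context_tokens and compression_target_context_tokens could be derived. The following is a minimal sketch under that reading only: BudgetSketch, percent_of, trigger_tokens, and target_tokens are hypothetical stand-ins, the numeric values are illustrative rather than the crate's defaults, and the sketch does not account for safety_margin or max_output_tokens because the documentation does not say whether they affect these thresholds.

```rust
/// Stand-in carrying only the fields this sketch needs; not the real `TokenBudget`.
struct BudgetSketch {
    max_context_tokens: u32,
    compression_trigger_percent: u8,
    compression_target_percent: u8,
    working_reserve_tokens: u32,
    fallback_trigger_percent: u8,
}

/// Integer percentage of `total`, clamped to 100% and widened to avoid overflow.
fn percent_of(total: u32, percent: u8) -> u32 {
    (u64::from(total) * u64::from(percent.min(100)) / 100) as u32
}

impl BudgetSketch {
    /// Context size at which compression would start (assumed logic).
    fn trigger_tokens(&self) -> u32 {
        if self.working_reserve_tokens == 0 {
            // Legacy path: percentage-based trigger.
            percent_of(self.max_context_tokens, self.compression_trigger_percent)
        } else if u64::from(self.max_context_tokens) < u64::from(self.working_reserve_tokens) * 2 {
            // Window too small for the fixed reserve: use the fallback percentage.
            percent_of(self.max_context_tokens, self.fallback_trigger_percent)
        } else {
            // Fixed reserve: safe to subtract because max >= 2 * reserve here.
            self.max_context_tokens - self.working_reserve_tokens
        }
    }

    /// Context size that compression would aim for (assumed logic).
    fn target_tokens(&self) -> u32 {
        percent_of(self.max_context_tokens, self.compression_target_percent)
    }
}

fn main() {
    let budget = BudgetSketch {
        max_context_tokens: 200_000,
        compression_trigger_percent: 80,
        compression_target_percent: 50,
        working_reserve_tokens: 50_000,
        fallback_trigger_percent: 75,
    };
    println!("trigger at {} tokens", budget.trigger_tokens());
    println!("compress down to {} tokens", budget.target_tokens());
}
```

With these illustrative values the fixed-reserve branch applies, so the trigger is 200,000 - 50,000 = 150,000 tokens. Setting max_context_tokens to 64,000 would take the fallback branch instead, since 64,000 < 2 * 50,000.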
Trait Implementations
impl Clone for TokenBudget
fn clone(&self) -> TokenBudget
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.