pub struct ContextWindow { /* private fields */ }
A token-budgeted message buffer for managing conversation context.
Tracks messages with their token counts and provides compaction signals when the context approaches capacity.
Implementations§
impl ContextWindow
pub fn push(&mut self, message: ChatMessage, tokens: u32)
Adds a message with its token count.
New messages are compactable by default. Use protect_recent
to mark recent messages as non-compactable.
§Arguments
message - The chat message to add
tokens - Token count for this message (from provider usage or estimation)
pub fn available(&self) -> u32
Returns the number of tokens available for new content.
This is max_tokens - reserved_for_output - total_tokens().
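As a worked illustration of this formula, the sketch below recomputes the budget with plain integers. It is a standalone stand-in, not the crate's code: the free function `available_tokens` is hypothetical, and saturating at zero when the window is over budget is an assumption, since the behavior in that case is not stated here.

```rust
/// Hypothetical stand-in for the documented formula
/// `max_tokens - reserved_for_output - total_tokens()`.
/// Saturating subtraction is an assumption to avoid `u32` underflow.
fn available_tokens(max_tokens: u32, reserved_for_output: u32, total_tokens: u32) -> u32 {
    max_tokens
        .saturating_sub(reserved_for_output)
        .saturating_sub(total_tokens)
}

/// Mirrors a window configured with 1000 max tokens, 200 reserved for
/// output, and 700 tokens already stored.
fn example_available() -> u32 {
    available_tokens(1000, 200, 700)
}
```

With these numbers the input budget is 800 tokens, of which 700 are used, so 100 remain for new content.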
pub fn iter(&self) -> impl Iterator<Item = &ChatMessage>
Returns an iterator over the current messages.
Prefer this over messages to avoid allocation.
pub fn messages(&self) -> Vec<&ChatMessage>
Returns the current messages as a vector of references.
For iteration without allocation, use iter instead.
pub fn messages_owned(&self) -> Vec<ChatMessage>
Returns owned copies of the current messages.
Use this when you need to pass messages to a provider that takes ownership.
pub fn total_tokens(&self) -> u32
Returns the total tokens currently in the window.
pub fn needs_compaction(&self, threshold: f32) -> bool
Checks if compaction is needed based on a threshold.
Returns true if the window is more than threshold full, where fullness is total_tokens() divided by the input budget (max_tokens - reserved_for_output).
§Arguments
threshold - A value between 0.0 and 1.0 (e.g., 0.8 for 80%)
§Example
use llm_stack_core::context::ContextWindow;
use llm_stack_core::ChatMessage;
let mut window = ContextWindow::new(1000, 200);
window.push(ChatMessage::user("Hello"), 700);
// 700 / (1000 - 200) = 87.5% full
assert!(window.needs_compaction(0.8));
assert!(!window.needs_compaction(0.9));
pub fn compact(&mut self) -> Vec<ChatMessage>
Removes and returns compactable messages.
Messages marked as non-compactable (via protect_recent
or system messages) are retained. Returns the removed messages so the
caller can summarize them.
§Returns
A vector of removed messages, in their original order.
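The split between removed and retained messages can be modeled on a plain vector. The sketch below is a simplified stand-in, not the crate's implementation: `Entry`, its fields, and the free functions `compact_entries`, `tokens_in`, and `sample_entries` are hypothetical names introduced for illustration.

```rust
/// Hypothetical stand-in for a stored message and its bookkeeping.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    text: String,
    tokens: u32,
    compactable: bool,
}

/// Removes every compactable entry and returns the removed entries in
/// their original order; protected entries stay in `entries`.
fn compact_entries(entries: &mut Vec<Entry>) -> Vec<Entry> {
    // `partition` preserves relative order on both sides of the split.
    let (removed, kept): (Vec<Entry>, Vec<Entry>) = std::mem::take(entries)
        .into_iter()
        .partition(|entry| entry.compactable);
    *entries = kept;
    removed
}

/// Sums the token counts of a slice of entries.
fn tokens_in(entries: &[Entry]) -> u32 {
    entries.iter().map(|entry| entry.tokens).sum()
}

/// Builds a three-entry window with the first and last entry protected.
fn sample_entries() -> Vec<Entry> {
    vec![
        Entry { text: "system prompt".to_string(), tokens: 50, compactable: false },
        Entry { text: "old question".to_string(), tokens: 300, compactable: true },
        Entry { text: "latest reply".to_string(), tokens: 120, compactable: false },
    ]
}
```

In this model, compacting the sample frees the 300 tokens held by the one compactable entry. A caller might then summarize the returned entries and push the summary back as a single, smaller message.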
pub fn protect_recent(&mut self, n: usize)
Marks the most recent n messages as non-compactable.
This protects recent context from being removed during compaction. Call this after adding messages that should be preserved.
§Arguments
n - Number of recent messages to protect (from the end). If n exceeds the window length, all messages are protected.
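The clamping behavior described for n can be sketched on a slice of flags. This is an illustrative model only: representing protection as a slice of "compactable" booleans and the name `protect_recent_flags` are assumptions, not the crate's internals.

```rust
/// Marks the last `n` flags as non-compactable (`false`).
/// If `n` exceeds the slice length, every flag is cleared.
fn protect_recent_flags(compactable: &mut [bool], n: usize) {
    // `saturating_sub` clamps the start index to 0 when n > len.
    let start = compactable.len().saturating_sub(n);
    for flag in compactable[start..].iter_mut() {
        *flag = false;
    }
}
```

Because the start index is clamped rather than computed with a plain subtraction, an oversized n protects everything instead of panicking.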
pub fn protect(&mut self, index: usize)
Marks a message at the given index as non-compactable.
Useful for protecting specific messages like system prompts.
§Panics
Panics if index >= len().
pub fn is_protected(&self, index: usize) -> bool
Returns whether the message at index is protected from compaction.
§Panics
Panics if index >= len().
pub fn input_budget(&self) -> u32
Returns the input budget (max_tokens - reserved_for_output).
pub fn max_tokens(&self) -> u32
Returns the maximum tokens this window was configured with.
pub fn reserved_for_output(&self) -> u32
Returns the tokens reserved for output.
pub fn token_count(&self, index: usize) -> u32
Returns the token count recorded for the message at the given index.
pub fn update_token_count(&mut self, index: usize, tokens: u32)
Updates the token count for the message at the given index.
Useful when you get accurate token counts from the provider after initially using estimates.
§Panics
Panics if index >= len().
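The estimate-then-correct workflow can be sketched with a plain vector of counts. Everything named here is hypothetical: `estimate_tokens` uses a rough four-bytes-per-token heuristic chosen only for illustration, and `set_token_count` models the documented method on a slice of `u32` rather than on a real ContextWindow.

```rust
/// Rough estimate: about four bytes per token, rounded up.
/// This heuristic is an assumption, not the crate's estimator.
fn estimate_tokens(text: &str) -> u32 {
    (text.len() as u32 + 3) / 4
}

/// Overwrites the stored count at `index`.
/// Slice indexing panics if `index >= counts.len()`, matching the
/// documented contract.
fn set_token_count(counts: &mut [u32], index: usize, tokens: u32) {
    counts[index] = tokens;
}

/// Sums all stored counts, as a stand-in for `total_tokens()`.
fn total(counts: &[u32]) -> u32 {
    counts.iter().sum()
}
```

In this model a message is first recorded with its estimate, and the slot is overwritten once the provider reports actual usage, so the running total reflects the accurate figure from then on.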