Module streaming_context

Streaming Context Generation (Task 1)

This module implements reactive context streaming: chunks are emitted as soon as they are ready, which reduces time-to-first-token (TTFT) and lets the token budget be enforced progressively.

§Design

Instead of materializing all sections before returning, execute_streaming() returns a Stream<Item = SectionChunk> that yields chunks as they become ready.

Priority Queue         Stream Output
┌─────────────┐       ┌───────────────┐
│ P0: USER    │──────►│ SectionHeader │
│ P1: HISTORY │       │ RowBlock      │
│ P2: SEARCH  │       │ RowBlock      │
└─────────────┘       │ SearchResult  │
                      │ ...           │
                      └───────────────┘
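
As a rough illustration of the yield-as-ready idea, the sketch below produces a section header followed by row blocks on demand instead of building the whole section first. It uses a plain `Iterator` from the standard library rather than an async `Stream`, and `SketchChunk` and `SketchSectionIter` are hypothetical stand-ins: the real `SectionChunk` variants and their payloads are not shown on this page.

```rust
/// Hypothetical stand-in for `SectionChunk`; the variant payloads are assumed.
#[derive(Debug, Clone, PartialEq)]
enum SketchChunk {
    SectionHeader(String),
    RowBlock(Vec<String>),
}

/// Lazily yields one header, then row blocks of at most `block_size` rows.
struct SketchSectionIter {
    name: String,
    rows: std::vec::IntoIter<String>,
    block_size: usize,
    header_sent: bool,
}

impl SketchSectionIter {
    fn new(name: &str, rows: Vec<String>, block_size: usize) -> Self {
        Self {
            name: name.to_string(),
            rows: rows.into_iter(),
            // A block size of zero would yield no rows at all, so clamp to 1.
            block_size: block_size.max(1),
            header_sent: false,
        }
    }
}

impl Iterator for SketchSectionIter {
    type Item = SketchChunk;

    fn next(&mut self) -> Option<SketchChunk> {
        // The header is available immediately, before any row is read.
        if !self.header_sent {
            self.header_sent = true;
            return Some(SketchChunk::SectionHeader(self.name.clone()));
        }
        // Pull only as many rows as one block needs.
        let block: Vec<String> = self.rows.by_ref().take(self.block_size).collect();
        if block.is_empty() {
            None
        } else {
            Some(SketchChunk::RowBlock(block))
        }
    }
}
```

A consumer can start processing the header while later blocks have not been produced yet, which is the property that lowers TTFT.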

§Budget Enforcement

A rolling sum is maintained: B = Σ tokens(chunk_i). The stream terminates when B ≥ token_limit.
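
The rule above can be sketched with an atomic counter. This is a minimal illustration, not the actual `RollingBudget` API: `SketchBudget`, `estimate_tokens`, and `take_within_budget` are hypothetical names, the whitespace-based token estimate is an assumption, and so is the choice to still emit the chunk that pushes B to or past the limit.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical thread-safe rolling budget: B is kept in an atomic counter.
struct SketchBudget {
    used: AtomicUsize,
    limit: usize,
}

impl SketchBudget {
    fn new(limit: usize) -> Self {
        Self { used: AtomicUsize::new(0), limit }
    }

    /// Adds a chunk's tokens to B and returns true while B < limit,
    /// i.e. while the stream may continue.
    fn record(&self, tokens: usize) -> bool {
        let total = self.used.fetch_add(tokens, Ordering::SeqCst) + tokens;
        total < self.limit
    }

    /// Current value of the rolling sum B.
    fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

/// Assumed "estimated mode" tokenizer: counts whitespace-separated words.
fn estimate_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Emits chunks in order and stops once B >= limit. The chunk that reaches
/// the limit is still emitted (an assumption; a stricter variant would
/// check the budget before emitting).
fn take_within_budget<'a>(chunks: &[&'a str], budget: &SketchBudget) -> Vec<&'a str> {
    let mut emitted = Vec::new();
    for &chunk in chunks {
        emitted.push(chunk);
        if !budget.record(estimate_tokens(chunk)) {
            break;
        }
    }
    emitted
}
```

With a limit of 5 and chunks of 3, 2, and 4 estimated tokens, the first two chunks are emitted (B = 5) and the stream stops before the third.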

§Complexity

Scheduling: O(log S) per section where S = number of sections
Budget tracking: O(m) where m = total chunks
Tokenization: depends on exact vs estimated mode
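
An O(log S) per-section scheduling cost is what a binary heap provides for push and pop. The sketch below assumes such a heap keyed on priority, with lower numbers scheduled first (P0 before P1 before P2). The page does not state which data structure the module actually uses, and `schedule_order` is a hypothetical helper.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Returns section names in the order a min-priority queue would schedule them.
/// Each push and pop costs O(log S) for S queued sections.
fn schedule_order(sections: &[(u8, &str)]) -> Vec<String> {
    // `BinaryHeap` is a max-heap, so wrap entries in `Reverse` to pop the
    // smallest priority number first. Ties fall back to name order.
    let mut heap = BinaryHeap::new();
    for &(priority, name) in sections {
        heap.push(Reverse((priority, name)));
    }

    let mut order = Vec::with_capacity(sections.len());
    while let Some(Reverse((_, name))) = heap.pop() {
        order.push(name.to_string());
    }
    order
}
```

Calling `schedule_order(&[(2, "SEARCH"), (0, "USER"), (1, "HISTORY")])` returns the sections in the order USER, HISTORY, SEARCH, regardless of insertion order.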

Structs§

RollingBudget: Thread-safe rolling budget tracker for streaming
StreamingConfig: Configuration for streaming context generation
StreamingContextExecutor: Streaming context executor
StreamingContextIter: Iterator over streaming context chunks
StreamingSearchResult: Streaming search result (subset of VectorSearchResult)

Enums§

SectionChunk: A chunk of context output during streaming

Functions§

collect_streaming_chunks: Collect all chunks from a streaming context execution
create_streaming_executor: Create a streaming context executor with default configuration
materialize_context: Materialize streaming chunks into a final context string
