Module streaming_context

Streaming Context Generation (Task 1)

This module implements reactive context streaming for reduced TTFT (time-to-first-token) and progressive budget enforcement.

§Design

Instead of materializing all sections before returning, execute_streaming() returns a Stream<Item = SectionChunk> that yields chunks as they become ready.

Priority Queue         Stream Output
┌─────────────┐       ┌───────────────┐
│ P0: USER    │──────►│ SectionHeader │
│ P1: HISTORY │       │ RowBlock      │
│ P2: SEARCH  │       │ RowBlock      │
└─────────────┘       │ SearchResult  │
                      │ ...           │
                      └───────────────┘
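
The flow in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: it uses a synchronous `Iterator` in place of an async `Stream`, and the `Chunk`, `Section`, and `ChunkIter` types are hypothetical names introduced only for this example. Sections are popped from a min-heap keyed on priority, and each one is emitted as a header followed by its row blocks.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, VecDeque};

/// Hypothetical chunk type, loosely modeled on the diagram above.
#[derive(Debug, Clone, PartialEq)]
enum Chunk {
    SectionHeader(String),
    RowBlock(String),
}

/// Hypothetical section: a priority (0 = highest), a name, and its rows.
struct Section {
    priority: u32,
    name: String,
    rows: Vec<String>,
}

/// Lazily yields chunks one section at a time, lowest priority number first.
struct ChunkIter {
    // Min-heap keyed on (priority, insertion index) so ties keep input order.
    heap: BinaryHeap<Reverse<(u32, usize)>>,
    sections: Vec<Option<Section>>,
    // Chunks of the section currently being drained.
    pending: VecDeque<Chunk>,
}

impl ChunkIter {
    fn new(sections: Vec<Section>) -> Self {
        let heap = sections
            .iter()
            .enumerate()
            .map(|(i, s)| Reverse((s.priority, i)))
            .collect();
        ChunkIter {
            heap,
            sections: sections.into_iter().map(Some).collect(),
            pending: VecDeque::new(),
        }
    }
}

impl Iterator for ChunkIter {
    type Item = Chunk;

    fn next(&mut self) -> Option<Chunk> {
        loop {
            // Drain the current section before scheduling the next one.
            if let Some(chunk) = self.pending.pop_front() {
                return Some(chunk);
            }
            // O(log S) pop; returns None once every section is exhausted.
            let Reverse((_, idx)) = self.heap.pop()?;
            let section = self.sections[idx].take()?;
            self.pending.push_back(Chunk::SectionHeader(section.name));
            self.pending
                .extend(section.rows.into_iter().map(Chunk::RowBlock));
        }
    }
}
```

Because nothing is produced until `next()` is called, a consumer can start using the first header before later sections have been touched, which is the property the design relies on for reduced TTFT.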

§Budget Enforcement

A rolling sum is maintained: B = Σ tokens(chunk_i). The stream terminates when B ≥ token_limit.
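
As a rough illustration of the rule above, the sketch below keeps the running sum B in an atomic counter and stops emitting once B ≥ the limit. The `Budget` type and `take_within_budget` function are hypothetical and are not the module's `RollingBudget` API. One assumption to note: the chunk that pushes B past the limit is still emitted here; the source does not say whether the real implementation emits, truncates, or drops it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a thread-safe rolling budget tracker.
struct Budget {
    used: AtomicUsize,
    limit: usize,
}

impl Budget {
    fn new(limit: usize) -> Self {
        Budget { used: AtomicUsize::new(0), limit }
    }

    /// Adds `tokens` to the running sum B and returns the new total.
    fn record(&self, tokens: usize) -> usize {
        self.used.fetch_add(tokens, Ordering::Relaxed) + tokens
    }

    /// True once B >= limit.
    fn exhausted(&self) -> bool {
        self.used.load(Ordering::Relaxed) >= self.limit
    }
}

/// Emits chunks in order until the budget is exhausted.
/// `count` is a caller-supplied token counter (an assumption of this sketch).
fn take_within_budget<'a>(
    chunks: &[&'a str],
    budget: &Budget,
    count: impl Fn(&str) -> usize,
) -> Vec<&'a str> {
    let mut out = Vec::new();
    for &chunk in chunks {
        // Terminate as soon as B >= limit.
        if budget.exhausted() {
            break;
        }
        budget.record(count(chunk));
        out.push(chunk);
    }
    out
}
```

Each chunk costs one atomic add and one atomic load, so the tracking work grows linearly with the number of chunks emitted.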

§Complexity

  • Scheduling: O(log S) per section, where S = number of sections
  • Budget tracking: O(m) total, where m = total number of chunks
  • Tokenization: depends on exact vs estimated mode
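
To make the tokenization trade-off concrete, here is a hedged sketch of the two modes. The `TokenMode` enum, its variants, and the `bytes_per_token` parameter are hypothetical and are not taken from the module. The estimated mode is a constant-time calculation on the byte length, while the exact mode delegates to a caller-supplied tokenizer whose cost depends on that tokenizer.

```rust
/// Hypothetical token-counting modes; names are illustrative only.
enum TokenMode<'a> {
    /// Constant-time estimate: ceil(byte_len / bytes_per_token).
    Estimated { bytes_per_token: usize },
    /// Delegates to a caller-supplied tokenizer function.
    Exact(&'a dyn Fn(&str) -> usize),
}

impl TokenMode<'_> {
    fn count(&self, text: &str) -> usize {
        match self {
            TokenMode::Estimated { bytes_per_token } => {
                // Guard against a zero divisor, then round up.
                let d = (*bytes_per_token).max(1);
                (text.len() + d - 1) / d
            }
            TokenMode::Exact(tokenize) => tokenize(text),
        }
    }
}
```

An estimate like this can over- or under-count relative to a real tokenizer, so a budget enforced in estimated mode is only approximate.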

Structs§

RollingBudget
Thread-safe rolling budget tracker for streaming
StreamingConfig
Configuration for streaming context generation
StreamingContextExecutor
Streaming context executor
StreamingContextIter
Iterator over streaming context chunks
StreamingSearchResult
Streaming search result (subset of VectorSearchResult)

Enums§

SectionChunk
A chunk of context output during streaming

Functions§

collect_streaming_chunks
Collect all chunks from a streaming context execution
create_streaming_executor
Create a streaming context executor with default configuration
materialize_context
Materialize streaming chunks into a final context string