Performance recording and analysis for streaming LLM responses
This module provides mechanisms to record streaming responses with minimal overhead during collection, then analyze the recorded data for performance insights.
Modules§
- logprobs - Module for recording logprobs from a streaming response.
Structs§
- RecordedStream - Container for recorded streaming responses. This is the core object on which analysis is performed.
- RecordingStream - Recording stream that wraps an AsyncEngineStream and records responses. It follows the pattern of ResponseStream for AsyncEngine compatibility.
- TimestampedResponse - A response wrapper that adds timing information with minimal overhead.
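The relationship between a timestamped wrapper and the recorded container can be illustrated with a minimal, standard-library-only sketch. The names `Timestamped`, `Recorded`, `time_to_first`, and `inter_response_gaps` are hypothetical stand-ins chosen for illustration; the actual fields and methods of `TimestampedResponse` and `RecordedStream` are not shown on this page and may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a timestamped response wrapper:
/// the payload plus the instant at which it was observed.
#[derive(Debug, Clone)]
pub struct Timestamped<T> {
    pub response: T,
    pub sequence: usize,
    pub received_at: Instant,
}

/// Hypothetical stand-in for a recorded stream: the start time of the
/// recording and every response seen, in arrival order.
#[derive(Debug)]
pub struct Recorded<T> {
    pub start: Instant,
    pub responses: Vec<Timestamped<T>>,
}

impl<T> Recorded<T> {
    pub fn new(start: Instant) -> Self {
        Self { start, responses: Vec::new() }
    }

    /// Collection does only a push; all analysis happens later.
    pub fn push(&mut self, response: T, received_at: Instant) {
        let sequence = self.responses.len();
        self.responses.push(Timestamped { response, sequence, received_at });
    }

    /// Time from the start of the recording to the first response.
    pub fn time_to_first(&self) -> Option<Duration> {
        self.responses
            .first()
            .map(|r| r.received_at.duration_since(self.start))
    }

    /// Gaps between consecutive responses, computed after the fact.
    pub fn inter_response_gaps(&self) -> Vec<Duration> {
        self.responses
            .windows(2)
            .map(|w| w[1].received_at.duration_since(w[0].received_at))
            .collect()
    }
}
```

Keeping the recording step to a single `push` and deferring every calculation to later methods is one way to realize the "minimal overhead during collection" goal stated in the module description.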
Enums§
- RecordingMode - Recording mode determines how the recorder behaves with the stream.
Traits§
- CapacityHint - Trait for requests that can provide hints about the expected response count. This enables capacity pre-allocation for better performance.
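As a rough illustration of how a capacity hint can drive pre-allocation, the sketch below defines a hypothetical trait `ExpectedResponses` with a single method. The method name, the `DemoRequest` type, and the `buffer_for` helper are assumptions made for this example; the real `CapacityHint` trait's methods are not listed on this page.

```rust
/// Hypothetical trait in the spirit of a capacity hint: a request may
/// know roughly how many responses it will produce.
pub trait ExpectedResponses {
    fn expected_response_count(&self) -> Option<usize>;
}

/// Hypothetical request type; a token limit bounds the response count.
pub struct DemoRequest {
    pub max_tokens: Option<usize>,
}

impl ExpectedResponses for DemoRequest {
    fn expected_response_count(&self) -> Option<usize> {
        self.max_tokens
    }
}

/// Allocate the recording buffer up front when a hint is available,
/// so that pushes during streaming do not trigger reallocation.
pub fn buffer_for<R: ExpectedResponses, T>(request: &R) -> Vec<T> {
    match request.expected_response_count() {
        Some(n) => Vec::with_capacity(n),
        None => Vec::new(),
    }
}
```

Returning `Option<usize>` lets requests without a meaningful bound opt out, in which case the buffer simply grows on demand.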
Functions§
- record_response_stream - Create a recording stream from a ResponseStream (convenience wrapper).
- record_stream - Create a recording stream that wraps an AsyncEngineStream. Returns a pinned stream and a receiver for the recorded data.
- record_stream_with_capacity - Create a recording stream with a capacity hint.
- record_stream_with_context - Create a recording stream from a raw stream and context. Returns a pinned stream and a receiver for the recorded data.
- record_stream_with_context_and_capacity - Create a recording stream from a raw stream and context, with a capacity hint.
- record_stream_with_request_hint - Create a recording stream with a capacity hint taken from the request.
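The "wrapped stream plus receiver" shape these functions return can be sketched with a synchronous analogue. This example uses a plain `Iterator` and `std::sync::mpsc` instead of an async stream, and it always passes items through to the consumer; `RecordingIter` and `record_iter_with_capacity` are hypothetical names, not part of this module, and the real functions operate on `AsyncEngineStream` values with signatures not shown here.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::time::Instant;

/// Hypothetical adapter: yields items unchanged while keeping a
/// timestamped copy of each one.
pub struct RecordingIter<I: Iterator> {
    inner: I,
    buffer: Vec<(Instant, I::Item)>,
    tx: Option<Sender<Vec<(Instant, I::Item)>>>,
}

impl<I> Iterator for RecordingIter<I>
where
    I: Iterator,
    I::Item: Clone,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        match self.inner.next() {
            Some(item) => {
                // The only work on the hot path: one timestamp, one push.
                self.buffer.push((Instant::now(), item.clone()));
                Some(item)
            }
            None => {
                // On exhaustion, hand the recording to the receiver once.
                if let Some(tx) = self.tx.take() {
                    let _ = tx.send(std::mem::take(&mut self.buffer));
                }
                None
            }
        }
    }
}

/// Wrap `inner` and return the pass-through iterator together with a
/// receiver that yields the recorded data when the source is exhausted.
pub fn record_iter_with_capacity<I>(
    inner: I,
    capacity: usize,
) -> (RecordingIter<I>, Receiver<Vec<(Instant, I::Item)>>)
where
    I: Iterator,
    I::Item: Clone,
{
    let (tx, rx) = channel();
    let recorder = RecordingIter {
        inner,
        buffer: Vec::with_capacity(capacity),
        tx: Some(tx),
    };
    (recorder, rx)
}
```

The caller consumes the returned iterator as usual and reads the recording from the receiver afterwards, which keeps analysis separate from the code that handles the responses.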
Type Aliases§
- RecordedStreamReceiver - Type alias for a receiver of recorded stream data.
- RecordingResult - Type alias for the return type of recording functions.