Module request

ruvllm::serving

Request types for the continuous batching serving engine

This module defines the core request structures used throughout the serving system, including inference requests, running requests, and completed requests.
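To make the request lifecycle concrete, here is a minimal sketch of how an incoming request might become a running request and then a completed one. Only the type names come from this module; every field, variant, and method below (`prompt_tokens`, `max_new_tokens`, `is_eos`, `push`, `finish`, and the two `FinishReason` variants) is a hypothetical stand-in chosen for illustration and may not match the crate's actual definitions.

```rust
// Illustrative sketch only: all fields, variants, and methods are assumptions,
// not the real ruvllm definitions.

/// Unique identifier for a request (assumed to wrap an integer).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct RequestId(pub u64);

/// Reason for request completion (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FinishReason {
    /// The model produced an end-of-sequence token.
    EndOfSequence,
    /// The request used up its generation budget.
    MaxTokens,
}

/// An incoming inference request (hypothetical fields).
#[derive(Debug, Clone)]
pub struct InferenceRequest {
    pub id: RequestId,
    pub prompt_tokens: Vec<u32>,
    pub max_new_tokens: usize,
}

/// Output from a single token generation step (hypothetical fields).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TokenOutput {
    pub token_id: u32,
    pub is_eos: bool,
}

/// A request that is currently being processed.
#[derive(Debug)]
pub struct RunningRequest {
    pub request: InferenceRequest,
    pub generated: Vec<u32>,
}

/// Result of a completed request.
#[derive(Debug, PartialEq, Eq)]
pub struct CompletedRequest {
    pub id: RequestId,
    pub output_tokens: Vec<u32>,
    pub finish_reason: FinishReason,
}

impl RunningRequest {
    /// Admit an incoming request into the running set.
    pub fn new(request: InferenceRequest) -> Self {
        Self { request, generated: Vec::new() }
    }

    /// Record one generation step and report whether the request is done.
    pub fn push(&mut self, out: TokenOutput) -> Option<FinishReason> {
        self.generated.push(out.token_id);
        if out.is_eos {
            Some(FinishReason::EndOfSequence)
        } else if self.generated.len() >= self.request.max_new_tokens {
            Some(FinishReason::MaxTokens)
        } else {
            None
        }
    }

    /// Consume the running request and produce its final result.
    pub fn finish(self, finish_reason: FinishReason) -> CompletedRequest {
        CompletedRequest {
            id: self.request.id,
            output_tokens: self.generated,
            finish_reason,
        }
    }
}
```

In this sketch `finish` takes `self` by value, so a request cannot be completed twice or stepped after completion. Whether the real types enforce this through ownership is not stated on this page.
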

Structs§

CompletedRequest: Result of a completed request
InferenceRequest: An incoming inference request
RequestId: Unique identifier for a request
RunningRequest: A request that is currently being processed
TokenOutput: Output from a single token generation step

Enums§

FinishReason: Reason for request completion
Priority: Priority level for request scheduling
RequestState: State of a request in the serving pipeline
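
As a rough illustration of how the `Priority` and `RequestState` enums could drive scheduling, the sketch below orders waiting requests by priority and then by arrival order, and checks state transitions. The variant names (`Low`, `Normal`, `High`, `Waiting`, `Running`, `Completed`), the `Queued` helper struct, and both functions are assumptions made for this example; the page does not list the real variants or the scheduler's policy.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Priority level for request scheduling (hypothetical variants).
/// Deriving `Ord` orders variants by declaration: Low < Normal < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority {
    Low,
    Normal,
    High,
}

/// State of a request in the serving pipeline (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestState {
    Waiting,
    Running,
    Completed,
}

impl RequestState {
    /// Assumed transition rule: Waiting -> Running -> Completed, nothing else.
    pub fn can_transition_to(self, next: RequestState) -> bool {
        matches!(
            (self, next),
            (RequestState::Waiting, RequestState::Running)
                | (RequestState::Running, RequestState::Completed)
        )
    }
}

/// Hypothetical queue entry: `seq` is the arrival order, `id` the request id.
#[derive(Debug, PartialEq, Eq)]
pub struct Queued {
    pub priority: Priority,
    pub seq: u64,
    pub id: u64,
}

impl Ord for Queued {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority first; within a priority, the earlier arrival
        // (smaller seq) wins, so the seq comparison is reversed for the
        // max-heap. The id comparison keeps `Ord` consistent with `Eq`.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.seq.cmp(&self.seq))
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Queued {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Return request ids in the order this sketch would schedule them.
pub fn drain_in_order(items: Vec<Queued>) -> Vec<u64> {
    let mut heap = BinaryHeap::from(items);
    let mut ids = Vec::with_capacity(heap.len());
    while let Some(entry) = heap.pop() {
        ids.push(entry.id);
    }
    ids
}
```

Strict priority ordering like this can starve low-priority requests under sustained load; a real scheduler might add aging or per-priority quotas, which this page does not describe.
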