Module router


Multi-protocol request router (plan #32).

The design is borrowed from MAX’s serve/router/{openai_routes, kserve_routes, sagemaker_routes, openresponses_routes}.py: one inference engine, with multiple wire protocols layered on top as thin per-protocol adapters. Adding a new protocol means implementing WireProtocol for its raw request type, not editing the hot path.

The only adapter today is OpenAI-shaped (chat completions and embeddings), since crate::mock_requests already defines those structs. KServe, SageMaker, and OpenResponses adapters can slot in by implementing WireProtocol for their respective request types.

All conversion is pure data transformation: no I/O, no async. HTTP parsing happens upstream (in the future serving crate); this module owns only the translation between wire types and the internal RoutedRequest.
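
The adapter pattern described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the type names match the items listed on this page, but every field, variant, associated type, and the `route` method signature are assumptions, and the two request structs are simplified stand-ins for the ones in crate::mock_requests.

```rust
// Illustrative sketch only. Names mirror the items documented on this page;
// all fields, variants, and signatures below are assumptions.

/// Which downstream pipeline should serve the request (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestKind {
    TextGeneration,
    Embedding,
}

/// Canonical internal request; the fields here are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct RoutedRequest {
    pub kind: RequestKind,
    pub model: String,
    pub inputs: Vec<String>,
}

/// Conversion failures (variants assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RouteError {
    MissingModel,
    EmptyInput,
}

/// Adapter trait: one implementation per wire protocol.
/// The associated type and method name are assumptions.
pub trait WireProtocol {
    type Raw;
    fn route(&self, raw: Self::Raw) -> Result<RoutedRequest, RouteError>;
}

/// Simplified stand-ins for the structs in crate::mock_requests.
pub struct ChatCompletionRequest {
    pub model: String,
    pub messages: Vec<String>,
}

pub struct EmbeddingRequest {
    pub model: String,
    pub input: Vec<String>,
}

/// The raw request type the OpenAI adapter accepts.
pub enum OpenAIRequest {
    Chat(ChatCompletionRequest),
    Embedding(EmbeddingRequest),
}

pub struct OpenAIProtocol;

impl WireProtocol for OpenAIProtocol {
    type Raw = OpenAIRequest;

    // Pure data conversion: no I/O, no async.
    fn route(&self, raw: OpenAIRequest) -> Result<RoutedRequest, RouteError> {
        let (kind, model, inputs) = match raw {
            OpenAIRequest::Chat(r) => (RequestKind::TextGeneration, r.model, r.messages),
            OpenAIRequest::Embedding(r) => (RequestKind::Embedding, r.model, r.input),
        };
        if model.is_empty() {
            return Err(RouteError::MissingModel);
        }
        if inputs.is_empty() {
            return Err(RouteError::EmptyInput);
        }
        Ok(RoutedRequest { kind, model, inputs })
    }
}
```

Under these assumptions, a KServe or SageMaker adapter would be another `impl WireProtocol` with its own `Raw` type, and downstream code would match on `RoutedRequest::kind` to pick a pipeline without knowing which protocol the request arrived on.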

Structs§

OpenAIProtocol
OpenAI-style adapter — handles ChatCompletionRequest and EmbeddingRequest from crate::mock_requests.
RoutedRequest
Internal canonical request shape. Every wire protocol parses into this; downstream schedulers / engines consume only this type.

Enums§

OpenAIRequest
RequestKind
What kind of inference the request is asking for. Drives which downstream pipeline serves it (text-gen vs embedding pool, etc.).
RouteError

Traits§

WireProtocol
Adapter trait — implement once per wire protocol.