Multi-protocol request router (plan #32).
Borrowed from MAX’s serve/router/{openai_routes, kserve_routes, sagemaker_routes, openresponses_routes}.py. The shape: one
inference engine, multiple wire protocols layered as thin
per-protocol adapters. Adding a new protocol = implementing
WireProtocol for its raw request type, not editing the hot
path.
Today’s adapter is OpenAI-shaped (chat completions +
embeddings) since crate::mock_requests already defines
those structs. KServe / SageMaker / OpenResponses slot in by
impl’ing WireProtocol for their respective request types.
All conversion is pure-data: no I/O, no async. The actual HTTP
parsing happens upstream (in the future serving crate); this
module owns the translation between wire types and the
internal RoutedRequest.
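
The adapter shape described above can be sketched roughly as follows. This is an illustration under stated assumptions, not this module's actual definitions: the page does not show the signature of WireProtocol, the fields of RoutedRequest, or the variants of RequestKind, RouteError, and OpenAIRequest, so the associated type `Raw`, the method name `route`, every field and variant, the simplified request structs, and the `KServeInferRequest` / `KServeProtocol` types are all hypothetical.

```rust
// Hypothetical sketch. All fields, variants, and signatures here are assumptions.

/// Which downstream pipeline should serve the request (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestKind {
    TextGeneration,
    Embedding,
}

/// Canonical internal shape that every protocol parses into (assumed fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RoutedRequest {
    pub model: String,
    pub kind: RequestKind,
    pub inputs: Vec<String>,
}

/// Conversion failures (assumed variant).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RouteError {
    EmptyInput,
}

/// Adapter trait: pure data in, pure data out. No I/O, no async.
pub trait WireProtocol {
    type Raw;
    fn route(&self, raw: Self::Raw) -> Result<RoutedRequest, RouteError>;
}

// Simplified stand-ins for the structs in crate::mock_requests.
pub struct ChatCompletionRequest {
    pub model: String,
    pub messages: Vec<String>,
}
pub struct EmbeddingRequest {
    pub model: String,
    pub input: Vec<String>,
}

pub enum OpenAIRequest {
    Chat(ChatCompletionRequest),
    Embedding(EmbeddingRequest),
}

pub struct OpenAIProtocol;

impl WireProtocol for OpenAIProtocol {
    type Raw = OpenAIRequest;

    fn route(&self, raw: OpenAIRequest) -> Result<RoutedRequest, RouteError> {
        // Map each wire-level variant onto the canonical fields.
        let (model, kind, inputs) = match raw {
            OpenAIRequest::Chat(r) => (r.model, RequestKind::TextGeneration, r.messages),
            OpenAIRequest::Embedding(r) => (r.model, RequestKind::Embedding, r.input),
        };
        if inputs.is_empty() {
            return Err(RouteError::EmptyInput);
        }
        Ok(RoutedRequest { model, kind, inputs })
    }
}

// A second, hypothetical protocol: only a new impl is added,
// nothing in the OpenAI adapter or in RoutedRequest changes.
pub struct KServeInferRequest {
    pub model_name: String,
    pub inputs: Vec<String>,
}

pub struct KServeProtocol;

impl WireProtocol for KServeProtocol {
    type Raw = KServeInferRequest;

    fn route(&self, raw: KServeInferRequest) -> Result<RoutedRequest, RouteError> {
        if raw.inputs.is_empty() {
            return Err(RouteError::EmptyInput);
        }
        Ok(RoutedRequest {
            model: raw.model_name,
            kind: RequestKind::TextGeneration,
            inputs: raw.inputs,
        })
    }
}
```

In this sketch, a caller that has already parsed the HTTP body would invoke `OpenAIProtocol.route(request)` and hand the resulting RoutedRequest to the scheduler. Because the conversion is a plain synchronous function over owned data, it can be unit-tested without a server or runtime.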
Structs§
- OpenAIProtocol - OpenAI-style adapter: handles ChatCompletionRequest and EmbeddingRequest from crate::mock_requests.
- RoutedRequest - Internal canonical request shape. Every wire protocol parses into this; downstream schedulers / engines consume only this type.
Enums§
- OpenAIRequest
- RequestKind - What kind of inference the request is asking for. Drives which downstream pipeline serves it (text-gen vs embedding pool, etc.).
- RouteError
Traits§
- WireProtocol - Adapter trait: implement once per wire protocol.