Structs§
- EmbeddingEngineAdapter
- EngineDispatcher - Engine that dispatches requests to either OpenAICompletions or OpenAIChatCompletions engine
- MultiNodeConfig
- StreamingEngineAdapter
- ValidateEngine - Validate Engine that verifies request data
Statics§
- TOKEN_ECHO_DELAY - How long to sleep between echoed tokens. The default is 10 ms, which yields 100 tok/s. Can be configured via the DYN_TOKEN_ECHO_DELAY_MS environment variable.
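The relationship between the delay and the token rate can be sketched as follows. This is a minimal illustration, not the crate's actual implementation: the helper names (`parse_echo_delay`, `echo_delay_from_env`, `tokens_per_second`) are hypothetical, and falling back to the default on an unset or unparsable value is an assumption. Only the 10 ms default and the DYN_TOKEN_ECHO_DELAY_MS variable name come from the description above.

```rust
use std::time::Duration;

/// Default delay between echoed tokens, in milliseconds (per the docs above).
const DEFAULT_DELAY_MS: u64 = 10;

/// Hypothetical helper: turn an optional raw string into a delay.
/// Assumption: unset or unparsable values fall back to the default.
fn parse_echo_delay(raw: Option<&str>) -> Duration {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_DELAY_MS);
    Duration::from_millis(ms)
}

/// Hypothetical helper: read the delay from DYN_TOKEN_ECHO_DELAY_MS.
fn echo_delay_from_env() -> Duration {
    let raw = std::env::var("DYN_TOKEN_ECHO_DELAY_MS").ok();
    parse_echo_delay(raw.as_deref())
}

/// Tokens emitted per second when sleeping `delay` between tokens.
/// A zero delay yields infinity, since no sleep bounds the rate.
fn tokens_per_second(delay: Duration) -> f64 {
    1.0 / delay.as_secs_f64()
}
```

With the default of 10 ms, `tokens_per_second` comes out at roughly 100, which matches the figure quoted above; setting the variable to 5 would roughly double the rate.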
Traits§
- EmbeddingEngine - Trait that allows handling embedding requests
- StreamingEngine - Trait that allows handling both completion and chat completion requests
- ValidateRequest - Trait on request types that allows us to validate the data
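The validate-then-forward pattern described by the validation trait and engine can be sketched roughly as below. Everything here is an assumption made for illustration: the `validate` method signature, the `DemoChatRequest` type, and the `ValidatingWrapper` struct are hypothetical and are not taken from the crate, whose real traits are likely asynchronous and may look quite different.

```rust
/// Hypothetical shape of a validation trait; the real signature may differ.
trait ValidateRequest {
    fn validate(&self) -> Result<(), String>;
}

/// Illustrative request type, not part of the crate.
struct DemoChatRequest {
    model: String,
    max_tokens: u32,
}

impl ValidateRequest for DemoChatRequest {
    fn validate(&self) -> Result<(), String> {
        if self.model.is_empty() {
            return Err("model must not be empty".to_string());
        }
        if self.max_tokens == 0 {
            return Err("max_tokens must be greater than zero".to_string());
        }
        Ok(())
    }
}

/// Hypothetical wrapper: validates the request, then forwards it to `inner`.
struct ValidatingWrapper<F> {
    inner: F,
}

impl<F> ValidatingWrapper<F> {
    /// Returns the inner handler's output, or the validation error.
    fn handle<R>(&self, req: R) -> Result<String, String>
    where
        R: ValidateRequest,
        F: Fn(R) -> String,
    {
        req.validate()?; // reject invalid requests before any work is done
        Ok((self.inner)(req))
    }
}
```

Wrapping a handler this way keeps the validation rules on the request type while the inner handler only ever sees requests that passed the check.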