Module engines

Structs§

EmbeddingEngineAdapter
EngineDispatcher: Engine that dispatches each request to either the OpenAICompletions or the OpenAIChatCompletions engine
MultiNodeConfig
StreamingEngineAdapter
ValidateEngine: Engine that validates request data

Statics§

TOKEN_ECHO_DELAY: How long to sleep between echoed tokens. The default is 10 ms, which yields 100 tok/s. Configurable via the DYN_TOKEN_ECHO_DELAY_MS environment variable.
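
The relationship between the delay and the token rate can be sketched as below. This is only an illustration, not the crate's actual implementation: the helper names (`parse_delay`, `delay_from_env`, `tokens_per_second`) are hypothetical, and falling back to the default on a missing or unparsable value is an assumption. Only the environment variable name and the 10 ms default come from the description above.

```rust
use std::env;
use std::time::Duration;

/// Default delay between echoed tokens, in milliseconds (10 ms per the docs).
const DEFAULT_DELAY_MS: u64 = 10;

/// Hypothetical helper: turn an optional raw string into a delay.
/// Assumption: a missing or unparsable value falls back to the default.
fn parse_delay(raw: Option<&str>) -> Duration {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_DELAY_MS);
    Duration::from_millis(ms)
}

/// Hypothetical helper: read the delay from DYN_TOKEN_ECHO_DELAY_MS.
fn delay_from_env() -> Duration {
    parse_delay(env::var("DYN_TOKEN_ECHO_DELAY_MS").ok().as_deref())
}

/// Tokens emitted per second for a given per-token delay.
/// Returns None for a zero delay, where the rate is unbounded.
fn tokens_per_second(delay: Duration) -> Option<f64> {
    if delay.is_zero() {
        None
    } else {
        Some(1.0 / delay.as_secs_f64())
    }
}
```

With the default of 10 ms, `tokens_per_second(delay_from_env())` works out to 1 / 0.010 s = 100 tok/s; under the same arithmetic, setting `DYN_TOKEN_ECHO_DELAY_MS=20` would halve the rate to 50 tok/s.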

Traits§

EmbeddingEngine: Trait that allows handling embedding requests
StreamingEngine: Trait that allows handling both completion and chat completion requests
ValidateRequest: Trait on request types that allows us to validate the data

Functions§

make_echo_engine