pub enum BatchRequest {
    Generate {
        prompt: String,
        max_tokens: usize,
        config: SamplerConfig,
        cache_prompt: bool,
        lora_selection: LoraSelection,
        reply: Sender<Result<(String, UsageStats), String>>,
    },
    GenerateStream {
        prompt: String,
        max_tokens: usize,
        config: SamplerConfig,
        cache_prompt: bool,
        lora_selection: LoraSelection,
        callback: StreamCallback,
        reply: Sender<Result<UsageStats, String>>,
    },
    Embed {
        text: String,
        reply: Sender<Result<Vec<f32>, String>>,
    },
}
A single inference request dispatched to the worker task.
Variants
Generate
Non-streaming generation: prompt → full response string.
Fields
config: SamplerConfig
Per-request sampler configuration.
cache_prompt: bool
Whether to look up and store the prompt’s KV state in the prefix cache. When true (default), the worker checks for a matching cached prefix and skips the redundant prefill if found.
lora_selection: LoraSelection
LoRA adapters to apply for this request. Empty means no LoRA.
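The request/reply pattern behind Generate can be sketched as follows. This is a minimal illustration and not the crate's actual worker: SamplerConfig, LoraSelection and UsageStats are stand-in types (their real fields are not shown on this page), Sender is assumed to be std::sync::mpsc::Sender, and fake_worker and generate_once are hypothetical helpers that simply echo prompt words.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Stand-in types (assumptions): the real definitions live elsewhere in the crate.
#[derive(Default)]
pub struct SamplerConfig;
#[derive(Default)]
pub struct LoraSelection(pub Vec<String>);
#[derive(Debug, PartialEq)]
pub struct UsageStats {
    pub completion_tokens: usize,
}

// Reduced mirror of the enum containing only the Generate variant.
pub enum BatchRequest {
    Generate {
        prompt: String,
        max_tokens: usize,
        config: SamplerConfig,
        cache_prompt: bool,
        lora_selection: LoraSelection,
        reply: Sender<Result<(String, UsageStats), String>>,
    },
}

// Hypothetical worker: "generates" by echoing up to max_tokens prompt words.
fn fake_worker(req: BatchRequest) {
    match req {
        BatchRequest::Generate { prompt, max_tokens, reply, .. } => {
            let words: Vec<&str> = prompt.split_whitespace().take(max_tokens).collect();
            let stats = UsageStats { completion_tokens: words.len() };
            // The caller may have dropped its receiver; ignore the send error.
            let _ = reply.send(Ok((words.join(" "), stats)));
        }
    }
}

// Builds a request, hands it to a worker thread, and blocks on the reply channel.
pub fn generate_once(prompt: &str, max_tokens: usize) -> Result<(String, UsageStats), String> {
    let (reply_tx, reply_rx) = channel();
    let req = BatchRequest::Generate {
        prompt: prompt.to_string(),
        max_tokens,
        config: SamplerConfig::default(),
        cache_prompt: true, // documented default
        lora_selection: LoraSelection::default(), // empty: no LoRA
        reply: reply_tx,
    };
    let handle = thread::spawn(move || fake_worker(req));
    let result = reply_rx.recv().map_err(|e| e.to_string())?;
    handle.join().map_err(|_| "worker panicked".to_string())?;
    result
}
```

Calling generate_once("a b c d", 2) in this sketch yields the text "a b" together with a UsageStats counting two tokens. The point is only the shape of the exchange: the caller owns the receiving half and the request carries the sending half.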
GenerateStream
Streaming generation: invokes callback for every decoded token.
Fields
config: SamplerConfig
Per-request sampler configuration.
lora_selection: LoraSelection
LoRA adapters to apply for this request. Empty means no LoRA.
callback: StreamCallback
Called with each token text inside the blocking worker thread.
reply: Sender<Result<UsageStats, String>>
Channel that receives Ok(UsageStats) once generation is complete, or Err(message) on failure.
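A streaming request might be driven as in the sketch below. It rests on several assumptions: StreamCallback is taken to be a boxed FnMut(&str) + Send closure (its real definition is not shown here), Sender is assumed to be std::sync::mpsc::Sender, UsageStats is a stand-in, the variant is reduced to the fields the sketch needs, and fake_worker and stream_collect are hypothetical helpers.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Assumed shape of the callback type; the crate's actual alias may differ.
pub type StreamCallback = Box<dyn FnMut(&str) + Send + 'static>;

// Stand-in type (assumption).
#[derive(Debug, PartialEq)]
pub struct UsageStats {
    pub completion_tokens: usize,
}

// Reduced mirror of the GenerateStream variant.
pub enum BatchRequest {
    GenerateStream {
        prompt: String,
        max_tokens: usize,
        callback: StreamCallback,
        reply: Sender<Result<UsageStats, String>>,
    },
}

// Hypothetical worker: invokes the callback once per "token", then reports usage.
fn fake_worker(req: BatchRequest) {
    match req {
        BatchRequest::GenerateStream { prompt, max_tokens, mut callback, reply } => {
            let mut n = 0;
            for tok in prompt.split_whitespace().take(max_tokens) {
                callback(tok); // runs on the worker thread
                n += 1;
            }
            let _ = reply.send(Ok(UsageStats { completion_tokens: n }));
        }
    }
}

// Collects streamed tokens into a shared buffer and waits for the final stats.
pub fn stream_collect(prompt: &str, max_tokens: usize) -> Result<(Vec<String>, UsageStats), String> {
    let seen = Arc::new(Mutex::new(Vec::<String>::new()));
    let sink = Arc::clone(&seen);
    let (reply_tx, reply_rx) = channel();
    let req = BatchRequest::GenerateStream {
        prompt: prompt.to_string(),
        max_tokens,
        callback: Box::new(move |tok: &str| sink.lock().unwrap().push(tok.to_string())),
        reply: reply_tx,
    };
    let handle = thread::spawn(move || fake_worker(req));
    // The reply arrives only after every callback invocation has finished.
    let stats = reply_rx.recv().map_err(|e| e.to_string())??;
    handle.join().map_err(|_| "worker panicked".to_string())?;
    let tokens = seen.lock().unwrap().clone();
    Ok((tokens, stats))
}
```

Because the callback runs inside the worker thread, anything it touches has to be safe to move there. The sketch uses an Arc<Mutex<Vec<String>>> for that reason; a channel would work equally well.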
Embed
Embedding computation: text → L2-normalised vector.
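For reference, L2 normalisation divides a vector by its Euclidean norm so that the result has unit length. The sketch below shows the standard definition only; how the worker computes it is not shown on this page. l2_normalise and dot are hypothetical helper names, and returning a zero vector unchanged is an assumption.

```rust
// Scales v to unit Euclidean length: v / sqrt(sum of squares).
pub fn l2_normalise(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        // Assumption: a zero vector cannot be normalised, so return it as-is.
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

// Dot product; for two L2-normalised vectors this equals their cosine similarity.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

One practical consequence is that similarity search over vectors normalised this way can use a plain dot product instead of a full cosine computation.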
Auto Trait Implementations
impl Freeze for BatchRequest
impl !RefUnwindSafe for BatchRequest
impl Send for BatchRequest
impl !Sync for BatchRequest
impl Unpin for BatchRequest
impl UnsafeUnpin for BatchRequest
impl !UnwindSafe for BatchRequest
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
    fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
    fn instrument(self, span: Span) -> Instrumented<Self>
    fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
    fn into_either(self, into_left: bool) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
    fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
    Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.