CompletionRequest

Struct CompletionRequest 

pub struct CompletionRequest {
    pub model: String,
    pub prompt: StringOrArray,
    pub suffix: Option<String>,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub n: Option<u32>,
    pub stream: bool,
    pub stream_options: Option<StreamOptions>,
    pub logprobs: Option<u32>,
    pub echo: bool,
    pub stop: Option<StringOrArray>,
    pub presence_penalty: Option<f32>,
    pub frequency_penalty: Option<f32>,
    pub best_of: Option<u32>,
    pub logit_bias: Option<HashMap<String, f32>>,
    pub user: Option<String>,
    pub seed: Option<i64>,
    pub top_k: Option<i32>,
    pub min_p: Option<f32>,
    pub min_tokens: Option<u32>,
    pub repetition_penalty: Option<f32>,
    pub regex: Option<String>,
    pub ebnf: Option<String>,
    pub json_schema: Option<String>,
    pub stop_token_ids: Option<Vec<u32>>,
    pub no_stop_trim: bool,
    pub ignore_eos: bool,
    pub skip_special_tokens: bool,
    pub lora_path: Option<String>,
    pub session_params: Option<HashMap<String, Value>>,
    pub return_hidden_states: bool,
    pub sampling_seed: Option<u64>,
    pub other: Map<String, Value>,
}

Fields§

§model: String

ID of the model to use (required for OpenAI, optional for some implementations, such as SGLang)

§prompt: StringOrArray

The prompt(s) to generate completions for
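
The field type suggests that a request carries either one prompt or a batch of prompts. The sketch below is a minimal stand-in for that idea, not the crate's actual definition: the name PromptInput, its variant names, and both helper methods are assumptions made for illustration.

```rust
/// Hypothetical stand-in for `StringOrArray`; the real type's variants
/// and methods may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum PromptInput {
    Single(String),
    Batch(Vec<String>),
}

impl PromptInput {
    /// Number of prompts carried by this value.
    pub fn count(&self) -> usize {
        match self {
            PromptInput::Single(_) => 1,
            PromptInput::Batch(items) => items.len(),
        }
    }

    /// Borrow every prompt as a `&str`, regardless of the variant.
    pub fn as_strs(&self) -> Vec<&str> {
        match self {
            PromptInput::Single(s) => vec![s.as_str()],
            PromptInput::Batch(items) => items.iter().map(String::as_str).collect(),
        }
    }
}
```

Treating both shapes through one accessor lets downstream code handle a single prompt as a batch of one.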

§suffix: Option<String>

The suffix that comes after a completion of inserted text

§max_tokens: Option<u32>

The maximum number of tokens to generate

§temperature: Option<f32>

What sampling temperature to use, between 0 and 2

§top_p: Option<f32>

Nucleus sampling threshold, an alternative to sampling with temperature: only the tokens comprising the top_p probability mass are considered
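
To illustrate what a top_p value means, here is a minimal sketch of nucleus filtering over an already normalized probability vector. The function name nucleus_indices is hypothetical, and the backend that actually serves the request may implement this step differently.

```rust
/// Return the indices kept by nucleus (top-p) filtering: the smallest set of
/// highest-probability tokens whose cumulative probability reaches `top_p`.
/// `probs` is assumed to be a normalized distribution.
pub fn nucleus_indices(probs: &[f32], top_p: f32) -> Vec<usize> {
    let mut order: Vec<usize> = (0..probs.len()).collect();
    // Sort token indices by descending probability.
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    let mut kept = Vec::new();
    let mut cumulative = 0.0f32;
    for idx in order {
        kept.push(idx);
        cumulative += probs[idx];
        if cumulative >= top_p {
            break;
        }
    }
    kept
}
```

With probabilities 0.5, 0.3, 0.15, 0.05 and top_p = 0.7, only the first two tokens remain candidates.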

§n: Option<u32>

How many completions to generate for each prompt

§stream: bool

Whether to stream back partial progress

§stream_options: Option<StreamOptions>

Options for streaming response

§logprobs: Option<u32>

Return log probabilities for this many of the most likely tokens at each position

§echo: bool

Echo back the prompt in addition to the completion

§stop: Option<StringOrArray>

Up to 4 sequences where the API will stop generating further tokens

§presence_penalty: Option<f32>

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far

§frequency_penalty: Option<f32>

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far
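
The documented ranges for temperature, stop, presence_penalty, and frequency_penalty can be checked before a request is forwarded. The following is a sketch under stated assumptions: SamplingLimits and validate are hypothetical names, stop is simplified to a list of strings, and nothing here claims that the crate performs this validation itself.

```rust
/// Hypothetical, reduced view of the sampling fields with documented ranges.
#[derive(Debug, Default)]
pub struct SamplingLimits {
    pub temperature: Option<f32>,
    pub presence_penalty: Option<f32>,
    pub frequency_penalty: Option<f32>,
    pub stop: Option<Vec<String>>,
}

/// Check each field that is present against its documented range.
pub fn validate(req: &SamplingLimits) -> Result<(), String> {
    if let Some(t) = req.temperature {
        if !(0.0..=2.0).contains(&t) {
            return Err(format!("temperature {t} is outside 0..=2"));
        }
    }
    for (name, value) in [
        ("presence_penalty", req.presence_penalty),
        ("frequency_penalty", req.frequency_penalty),
    ] {
        if let Some(v) = value {
            if !(-2.0..=2.0).contains(&v) {
                return Err(format!("{name} {v} is outside -2..=2"));
            }
        }
    }
    if let Some(stop) = &req.stop {
        if stop.len() > 4 {
            return Err(format!("{} stop sequences given, at most 4 allowed", stop.len()));
        }
    }
    Ok(())
}
```

Fields left as None are skipped, which matches the optional nature of these parameters.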

§best_of: Option<u32>

Generates best_of completions server-side and returns the “best”

§logit_bias: Option<HashMap<String, f32>>

Modify the likelihood of specified tokens appearing in the completion
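
As an illustration of how such a map could be applied, the sketch below adds each bias to the matching logit. It assumes the map keys are token IDs written as decimal strings, following the OpenAI convention; the function name apply_logit_bias is hypothetical and the serving backend may handle the map differently.

```rust
use std::collections::HashMap;

/// Add each bias to the logit of the token whose ID matches the map key.
/// Keys that do not parse as an index into `logits` are skipped.
pub fn apply_logit_bias(logits: &mut [f32], bias: &HashMap<String, f32>) {
    for (token_id, delta) in bias {
        if let Ok(idx) = token_id.parse::<usize>() {
            if let Some(logit) = logits.get_mut(idx) {
                *logit += delta;
            }
        }
    }
}
```

A large negative bias effectively bans a token, while a large positive bias makes it very likely to be selected.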

§user: Option<String>

A unique identifier representing your end-user

§seed: Option<i64>

If specified, the backend makes a best effort to sample deterministically

§top_k: Option<i32>

Top-k sampling parameter (-1 to disable)
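
The signed type allows -1 to act as an "off" switch. A minimal sketch of that behavior follows; the function name apply_top_k is hypothetical, and treating every non-positive value as disabled is an assumption that goes beyond the documented -1.

```rust
/// Keep the `top_k` highest logits and mask the rest to negative infinity.
/// A value of -1 disables the filter; by assumption, so does any other
/// non-positive value. Logits tied with the k-th highest are also kept.
pub fn apply_top_k(logits: &mut [f32], top_k: i32) {
    if top_k <= 0 || top_k as usize >= logits.len() {
        return;
    }
    let mut sorted: Vec<f32> = logits.to_vec();
    // Sort a copy in descending order to find the k-th highest logit.
    sorted.sort_by(|a, b| b.total_cmp(a));
    let threshold = sorted[top_k as usize - 1];
    for logit in logits.iter_mut() {
        if *logit < threshold {
            *logit = f32::NEG_INFINITY;
        }
    }
}
```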

§min_p: Option<f32>

Min-p sampling parameter: the minimum probability, relative to the most likely token, that a token needs in order to be considered
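
Min-p filtering is commonly defined relative to the most likely token: a token survives if its probability is at least min_p times the highest probability. The sketch below follows that common definition; the function name min_p_indices is hypothetical, and the backend's exact rule is not stated in this documentation.

```rust
/// Return the indices of tokens whose probability is at least
/// `min_p * max(probs)`. `probs` is assumed to be a normalized distribution.
pub fn min_p_indices(probs: &[f32], min_p: f32) -> Vec<usize> {
    let max_prob = probs.iter().copied().fold(0.0f32, f32::max);
    let cutoff = min_p * max_prob;
    probs
        .iter()
        .enumerate()
        .filter(|&(_, &p)| p >= cutoff)
        .map(|(i, _)| i)
        .collect()
}
```

Unlike a fixed top_k, the number of surviving tokens adapts to how peaked the distribution is.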

§min_tokens: Option<u32>

Minimum number of tokens to generate

§repetition_penalty: Option<f32>

Repetition penalty for reducing repetitive text
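
One widely used formulation of a repetition penalty divides the positive logits of already generated tokens by the penalty and multiplies the negative ones by it. The sketch below shows that formulation only as an illustration; apply_repetition_penalty is a hypothetical name, and this page does not say which formulation the backend uses.

```rust
/// Penalize tokens that already appeared. `seen` lists distinct token IDs;
/// a `penalty` above 1.0 lowers their logits, and 1.0 leaves them unchanged.
pub fn apply_repetition_penalty(logits: &mut [f32], seen: &[usize], penalty: f32) {
    for &idx in seen {
        if let Some(logit) = logits.get_mut(idx) {
            *logit = if *logit > 0.0 {
                *logit / penalty
            } else {
                *logit * penalty
            };
        }
    }
}
```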

§regex: Option<String>

Regex constraint for output generation

§ebnf: Option<String>

EBNF grammar constraint for structured output

§json_schema: Option<String>

JSON schema constraint for structured output

§stop_token_ids: Option<Vec<u32>>

Specific token IDs to use as stop conditions

§no_stop_trim: bool

Skip trimming stop tokens from output
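
A rough sketch of the effect of this flag on the returned text is shown below. It assumes that trimming means removing a matched stop string from the end of the output; finish_text is a hypothetical helper, not a function of this crate.

```rust
/// Return the final text, removing a trailing stop string unless
/// `no_stop_trim` is set. Empty stop strings are ignored.
pub fn finish_text(text: &str, stop: &[&str], no_stop_trim: bool) -> String {
    if no_stop_trim {
        return text.to_string();
    }
    for s in stop.iter().filter(|s| !s.is_empty()) {
        if let Some(trimmed) = text.strip_suffix(*s) {
            return trimmed.to_string();
        }
    }
    text.to_string()
}
```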

§ignore_eos: bool

Ignore end-of-sequence tokens during generation

§skip_special_tokens: bool

Skip special tokens during detokenization

§lora_path: Option<String>

Path to LoRA adapter(s) for model customization

§session_params: Option<HashMap<String, Value>>

Session parameters for continual prompting

§return_hidden_states: bool

Return model hidden states

§sampling_seed: Option<u64>

Sampling seed for deterministic outputs

§other: Map<String, Value>

Additional fields, including bootstrap info for PD (prefill-decode disaggregation) routing

Trait Implementations§

impl Clone for CompletionRequest

fn clone(&self) -> CompletionRequest

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for CompletionRequest

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for CompletionRequest

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl GenerationRequest for CompletionRequest

fn is_stream(&self) -> bool

Check if the request is for streaming

fn get_model(&self) -> Option<&str>

Get the model name if specified

fn extract_text_for_routing(&self) -> String

Extract text content for routing decisions
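
To show how the three GenerationRequest methods fit together, here is a minimal sketch that mirrors their signatures on a reduced request type. MiniCompletionRequest is hypothetical, and the method bodies (in particular joining prompts with a space) are assumptions rather than the crate's implementation.

```rust
/// Hypothetical mirror of the trait, using the three listed signatures.
pub trait GenerationRequest {
    fn is_stream(&self) -> bool;
    fn get_model(&self) -> Option<&str>;
    fn extract_text_for_routing(&self) -> String;
}

/// Reduced stand-in for `CompletionRequest` with only the fields used here.
pub struct MiniCompletionRequest {
    pub model: String,
    pub prompts: Vec<String>,
    pub stream: bool,
}

impl GenerationRequest for MiniCompletionRequest {
    fn is_stream(&self) -> bool {
        self.stream
    }

    fn get_model(&self) -> Option<&str> {
        Some(self.model.as_str())
    }

    fn extract_text_for_routing(&self) -> String {
        // Assumption: routing looks at all prompts joined by a space.
        self.prompts.join(" ")
    }
}
```

A router can then work against the trait alone, without knowing which concrete request type it received.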

impl Serialize for CompletionRequest

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,