pub struct ChatRequest {
pub messages: Vec<Message>,
pub model: Option<String>,
pub temperature: Option<f32>,
pub top_p: Option<f32>,
pub n: Option<i32>,
pub stream: Option<bool>,
pub stream_options: Option<StreamOptions>,
pub stop: Option<Vec<String>>,
pub max_completion_tokens: Option<i32>,
pub max_tokens: Option<i32>,
pub presence_penalty: Option<f32>,
pub frequency_penalty: Option<f32>,
pub logit_bias: Option<Value>,
pub logprobs: Option<bool>,
pub top_logprobs: Option<i32>,
pub user: Option<String>,
pub response_format: Option<ResponseFormat>,
pub seed: Option<i32>,
pub tools: Option<Vec<Tool>>,
pub tool_choice: Option<ToolChoice>,
pub parallel_tool_calls: Option<bool>,
pub search_parameters: Option<SearchParameters>,
pub web_search_options: Option<WebSearchOptions>,
pub reasoning_effort: Option<String>,
pub deferred: Option<bool>,
pub bootstrap_host: Option<String>,
pub bootstrap_room: Option<i64>,
}

The chat request body for the /v1/chat/completions endpoint.
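For orientation, here is a hedged sketch of the JSON body this struct corresponds to, assuming the OpenAI-compatible field names shown below serialize unchanged (serde attributes are not visible in this excerpt); the model name is illustrative only.

```rust
use serde_json::{json, Value};

// Hedged sketch of the request body this struct maps onto, assuming the
// field names serialize as written. Model name is illustrative only.
fn example_body() -> Value {
    json!({
        "model": "grok-3",
        "messages": [
            { "role": "user", "content": "Hello!" }
        ],
        "temperature": 0.7,
        "stream": false
    })
}
```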
Fields
messages: Vec<Message> - A list of messages that make up the chat conversation.
model: Option<String> - Name of the model to use.
temperature: Option<f32> - What sampling temperature to use, between 0 and 2.
top_p: Option<f32> - An alternative to sampling with temperature, known as nucleus sampling.
n: Option<i32> - How many chat completion choices to generate for each input message.
stream: Option<bool> - If set, partial message deltas will be sent as server-sent events.
stream_options: Option<StreamOptions> - Options for the streaming response.
stop: Option<Vec<String>> - (Not supported by reasoning models) Up to 4 sequences where the API will stop generating.
max_completion_tokens: Option<i32> - An upper bound on the number of tokens that can be generated for a completion.
max_tokens: Option<i32> - [DEPRECATED] The maximum number of tokens that can be generated.
presence_penalty: Option<f32> - (Not supported by grok-3 and reasoning models) Presence penalty.
frequency_penalty: Option<f32> - (Not supported by reasoning models) Frequency penalty.
logit_bias: Option<Value> - (Unsupported) A JSON object that maps tokens to an associated bias value.
logprobs: Option<bool> - Whether to return log probabilities of the output tokens.
top_logprobs: Option<i32> - Number of most likely tokens to return at each token position.
user: Option<String> - A unique identifier representing your end-user.
response_format: Option<ResponseFormat> - An object specifying the format that the model must output.
seed: Option<i32> - If specified, the system will make a best effort to sample deterministically.
tools: Option<Vec<Tool>> - A list of tools the model may call.
tool_choice: Option<ToolChoice> - Controls which (if any) tool is called by the model.
parallel_tool_calls: Option<bool> - If set to false, the model will perform at most one tool call.
search_parameters: Option<SearchParameters> - Parameters controlling the search data used by the request.
web_search_options: Option<WebSearchOptions> - Options to control the web search.
reasoning_effort: Option<String> - Constrains how hard a reasoning model thinks before responding.
deferred: Option<bool> - If set to true, the request returns a request_id for deferred completion.
bootstrap_host: Option<String> - (Internal) Bootstrap host address for disaggregated prefill/decode.
bootstrap_room: Option<i64> - (Internal) Bootstrap room ID for disaggregated prefill/decode.
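As a usage sketch, the following builds a minimal request. It assumes ChatRequest derives Default and leaves messages empty because Message's shape is not shown in this excerpt; a real request needs at least one message.

```rust
fn minimal_request() -> ChatRequest {
    ChatRequest {
        // Illustrative model name only.
        model: Some("grok-3".to_string()),
        temperature: Some(0.7),
        max_completion_tokens: Some(256),
        stream: Some(false),
        // A real request needs at least one message; `Message`'s fields
        // are not shown in this excerpt, so the list is left empty here.
        messages: Vec::new(),
        // Assumes the struct derives `Default`, which this excerpt does
        // not confirm.
        ..Default::default()
    }
}
```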
Trait Implementations
impl Clone for ChatRequest

fn clone(&self) -> ChatRequest
Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
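Since ChatRequest is Clone and its fields are public, one pattern is to keep a base request as a template and clone it per call. A small sketch, assuming base was constructed elsewhere:

```rust
// Sketch: derive a streaming variant from a base request without
// mutating the original. Assumes `base` was built elsewhere.
fn with_streaming(base: &ChatRequest) -> ChatRequest {
    let mut req = base.clone();
    req.stream = Some(true);
    req
}
```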