pub struct ChatCompletionRequest {
pub messages: Vec<Message>,
pub model: Model,
pub thinking: Option<Thinking>,
pub frequency_penalty: Option<f32>,
pub max_tokens: Option<u32>,
pub presence_penalty: Option<f32>,
pub response_format: Option<ResponseFormat>,
pub stop: Option<Stop>,
pub stream: Option<bool>,
pub stream_options: Option<StreamOptions>,
pub temperature: Option<f32>,
pub top_p: Option<f32>,
pub tools: Option<Vec<Tool>>,
pub tool_choice: Option<ToolChoice>,
pub logprobs: Option<bool>,
pub top_logprobs: Option<u32>,
}

Fields

messages: Vec<Message>
List of messages in the conversation.
model: Model
The model ID to use. Use deepseek-chat for faster responses or deepseek-reasoner for deeper reasoning capabilities.
thinking: Option<Thinking>
Controls switching between reasoning (thinking) and non-reasoning modes.
frequency_penalty: Option<f32>
Possible values: >= -2 and <= 2. Default value: 0.
A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text, reducing the chance of repeated content.
max_tokens: Option<u32>
Maximum number of tokens to generate for the completion in a single request. The combined length of input and output tokens is limited by the model’s context window. See documentation for ranges and defaults.
presence_penalty: Option<f32>
Possible values: >= -2 and <= 2. Default value: 0.
A number between -2.0 and 2.0. Positive values penalize new tokens if they already appear in the text, encouraging the model to introduce new topics.
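To make the two penalties concrete, here is a minimal sketch of one common formulation (illustrative only, not necessarily the exact server-side implementation): a candidate token's logit is reduced by frequency_penalty times the number of times the token has already appeared, plus presence_penalty once if it has appeared at all.

```rust
/// Sketch of one common penalty formulation (illustrative, not the
/// exact server-side implementation): reduce a token's logit by
/// `frequency_penalty * count`, plus `presence_penalty` once if the
/// token has already appeared at least once.
fn penalized_logit(logit: f32, count: u32, frequency_penalty: f32, presence_penalty: f32) -> f32 {
    let presence = if count > 0 { presence_penalty } else { 0.0 };
    logit - frequency_penalty * count as f32 - presence
}

fn main() {
    // A token already seen 3 times, both penalties at 0.5:
    // 2.0 - 0.5 * 3 - 0.5 = 0.0
    println!("{}", penalized_logit(2.0, 3, 0.5, 0.5));
    // An unseen token is not penalized at all.
    println!("{}", penalized_logit(2.0, 0, 0.5, 0.5));
}
```

This also shows why the two knobs differ: frequency_penalty scales with how often a token repeats, while presence_penalty is a flat, one-time cost for any token that has appeared.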
response_format: Option<ResponseFormat>
An object specifying the format the model must output.
Set to { "type": "json_object" } to enable JSON mode, which enforces valid JSON output.
Note: when using JSON mode you must also instruct the model, via a system or user message, to output JSON.
Otherwise the model may emit whitespace until token limits are reached, which can appear to hang.
Also, if finish_reason == "length", the output may have been truncated due to max_tokens or context limits.
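A hypothetical JSON-mode request body might look like the following sketch (field names follow this struct; the message wording is an example). Note the system message explicitly instructs the model to output JSON, as required above:

```rust
/// Hypothetical request body for JSON mode. The system message must
/// instruct the model to produce JSON, or generation may stall.
fn json_mode_body() -> String {
    r#"{
  "model": "deepseek-chat",
  "messages": [
    {"role": "system", "content": "Reply only with a valid JSON object."},
    {"role": "user", "content": "List three primary colors."}
  ],
  "response_format": {"type": "json_object"}
}"#
    .to_string()
}

fn main() {
    println!("{}", json_mode_body());
}
```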
stop: Option<Stop>
A string, or a list of up to 16 strings. Generation will stop when one of these sequences is encountered.
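The effect of stop sequences can be sketched as truncating the output at the earliest match of any configured sequence (an illustrative model of the behavior, not the crate's implementation):

```rust
/// Illustrative sketch of stop-sequence behavior: generation halts at
/// the earliest occurrence of any stop string, so the output is the
/// text before that point.
fn truncate_at_stop(text: &str, stops: &[&str]) -> String {
    let cut = stops
        .iter()
        .filter_map(|s| text.find(s))
        .min()
        .unwrap_or(text.len());
    text[..cut].to_string()
}

fn main() {
    // The newline at byte 5 matches before "END" at byte 6.
    println!("{}", truncate_at_stop("Hello\nEND more", &["END", "\n"]));
}
```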
stream: Option<bool>
If true, the response will be streamed as SSE (server-sent events). The stream ends with data: [DONE].
stream_options: Option<StreamOptions>
Options related to streaming output. Only valid when stream is true.
include_usage: bool
If true, an extra chunk with usage (aggregate token counts) will be sent before the final data: [DONE]. All other chunks also include a usage field, but with a null value.
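A minimal sketch of consuming such a stream (assuming the standard SSE framing described above, where each event line starts with "data: " and "data: [DONE]" terminates the stream; a real client would read these lines from the HTTP response body):

```rust
/// Collect the payloads of an SSE stream. Each event line begins with
/// "data: "; the sentinel "data: [DONE]" ends the stream.
fn collect_sse_payloads(raw: &str) -> Vec<String> {
    let mut payloads = Vec::new();
    for line in raw.lines() {
        if let Some(data) = line.strip_prefix("data: ") {
            if data == "[DONE]" {
                break;
            }
            payloads.push(data.to_string());
        }
    }
    payloads
}

fn main() {
    let raw = "data: {\"a\":1}\n\ndata: {\"b\":2}\n\ndata: [DONE]\n";
    println!("{:?}", collect_sse_payloads(raw));
}
```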
temperature: Option<f32>
Possible values: >= 0 and <= 2. Default value: 1.
Sampling temperature between 0 and 2. Higher values (e.g. 0.8) produce more random output; lower values (e.g. 0.2) make output more focused and deterministic.
It is generally recommended to change either temperature or top_p, but not both.
top_p: Option<f32>
Possible values: <= 1. Default value: 1.
An alternative to temperature that considers only the top p probability mass.
For example, top_p = 0.1 means only the tokens comprising the top 10% probability mass are considered.
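To make the top_p semantics concrete, here is a small illustrative sketch of nucleus (top-p) filtering: tokens are sorted by probability and kept until their cumulative mass reaches p. This models the idea only; the actual sampler is server-side.

```rust
/// Illustrative nucleus (top-p) filter: keep the most probable tokens
/// until their cumulative probability mass reaches `p`.
fn top_p_filter(mut probs: Vec<(char, f64)>, p: f64) -> Vec<char> {
    // Sort descending by probability.
    probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut kept = Vec::new();
    let mut mass = 0.0;
    for (tok, prob) in probs {
        if mass >= p {
            break;
        }
        kept.push(tok);
        mass += prob;
    }
    kept
}

fn main() {
    let probs = vec![('a', 0.5), ('b', 0.3), ('c', 0.2)];
    // With p = 0.5, only 'a' (50% of the mass) survives.
    println!("{:?}", top_p_filter(probs, 0.5));
}
```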
tools: Option<Vec<Tool>>
A list of tools the model may call. Currently only function tools are supported.
Provide a list of functions that accept JSON input. Up to 128 functions are supported.
tool_choice: Option<ToolChoice>
Controls how the model may call tools:
none: the model will not call tools and will produce a normal message.
auto: the model can choose to produce a message or call one or more tools.
required: the model must call one or more tools.
Specifying a particular tool via {"type":"function","function":{"name":"my_function"}} forces the model to call that tool.
The default is none when no tools are present; when tools are present, the default is auto.
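The defaulting rule and the forced-tool form above can be sketched as follows (illustrative helpers, not part of this crate's API; the function name "my_function" is a placeholder):

```rust
/// Sketch of the documented default: "none" when no tools are supplied,
/// "auto" when at least one tool is present.
fn default_tool_choice(tool_count: usize) -> &'static str {
    if tool_count == 0 { "none" } else { "auto" }
}

/// Build the JSON form that forces a specific tool call.
fn force_function(name: &str) -> String {
    format!(r#"{{"type":"function","function":{{"name":"{}"}}}}"#, name)
}

fn main() {
    println!("{}", default_tool_choice(0)); // none
    println!("{}", default_tool_choice(2)); // auto
    println!("{}", force_function("my_function"));
}
```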
logprobs: Option<bool>
Whether to return log-probabilities of the output tokens. If true, the log-probability of each output token is returned.
top_logprobs: Option<u32>
Possible values: <= 20.
An integer N between 0 and 20 specifying how many of the most likely tokens, with their log-probabilities, to return at each output position.
When specifying this parameter, logprobs must be true.
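The constraints on these two fields can be captured in a small client-side validation sketch (an illustrative helper, not provided by this crate):

```rust
/// Sketch of the documented constraints: top_logprobs must be at most
/// 20, and setting it requires logprobs == true.
fn validate_logprob_options(logprobs: Option<bool>, top_logprobs: Option<u32>) -> Result<(), String> {
    if let Some(n) = top_logprobs {
        if n > 20 {
            return Err(format!("top_logprobs must be <= 20, got {}", n));
        }
        if logprobs != Some(true) {
            return Err("top_logprobs requires logprobs to be true".to_string());
        }
    }
    Ok(())
}

fn main() {
    println!("{:?}", validate_logprob_options(Some(true), Some(5)));
    println!("{:?}", validate_logprob_options(None, Some(5)));
}
```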