pub struct ChatCompletionRequest {
    pub model: String,
    pub temperature: Option<Option<f64>>,
    pub top_p: Option<f64>,
    pub max_tokens: Option<Option<i32>>,
    pub stream: Option<bool>,
    pub stop: Option<Box<Stop>>,
    pub random_seed: Option<Option<i32>>,
    pub metadata: Option<Option<HashMap<String, Value>>>,
    pub messages: Vec<MessagesInner>,
    pub response_format: Option<Box<ResponseFormat>>,
    pub tools: Option<Option<Vec<Tool>>>,
    pub tool_choice: Option<Box<ToolChoice>>,
    pub presence_penalty: Option<f64>,
    pub frequency_penalty: Option<f64>,
    pub n: Option<Option<i32>>,
    pub prediction: Option<Box<Prediction>>,
    pub parallel_tool_calls: Option<bool>,
    pub prompt_mode: Option<Option<MistralPromptMode>>,
    pub guardrails: Option<Option<Vec<GuardrailConfig>>>,
    pub safe_prompt: Option<bool>,
}
Fields§
§model: String
ID of the model to use. You can use the List Available Models API to see all of your available models, or see our Model overview for model descriptions.
§temperature: Option<Option<f64>>
§top_p: Option<f64>
Nucleus sampling, where the model considers only the tokens that make up the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both.
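The nucleus filtering described for top_p can be sketched as follows. This is a minimal illustration of the general technique, not the server's implementation; the function name `nucleus_filter` and the assumption that `probs` is already a normalized distribution are hypothetical.

```rust
/// Returns the indices of the tokens kept by nucleus (top-p) filtering,
/// ordered from most to least probable.
/// Assumption: `probs` is a normalized probability distribution over token ids.
fn nucleus_filter(probs: &[f64], top_p: f64) -> Vec<usize> {
    // Sort token indices by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    // Keep tokens until their cumulative mass reaches top_p.
    let mut kept = Vec::new();
    let mut mass = 0.0;
    for i in order {
        kept.push(i);
        mass += probs[i];
        if mass >= top_p {
            break;
        }
    }
    kept
}
```

With probabilities `[0.5, 0.3, 0.15, 0.05]`, a top_p of 0.1 keeps only the first token, while 0.9 keeps the first three.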
§max_tokens: Option<Option<i32>>
§stream: Option<bool>
Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, and the stream is terminated by a data: [DONE] message. Otherwise, the server holds the request open until the timeout or until completion, and the response contains the full result as JSON.
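A consumer of the streamed response has to read `data:` lines and stop at the `[DONE]` sentinel. The sketch below shows that loop over text that has already been received; the helper name `collect_sse_data` is hypothetical, and a real client would read from an HTTP stream and decode each payload as JSON rather than keep raw strings.

```rust
/// Collects the payloads of `data:` lines from a server-sent-events body,
/// stopping at the `[DONE]` sentinel.
/// Returns the payloads and whether the sentinel was seen.
fn collect_sse_data(body: &str) -> (Vec<String>, bool) {
    let mut chunks = Vec::new();
    for line in body.lines() {
        // Blank lines separate events; non-data lines are ignored in this sketch.
        if let Some(payload) = line.strip_prefix("data:") {
            let payload = payload.trim();
            if payload == "[DONE]" {
                return (chunks, true);
            }
            chunks.push(payload.to_string());
        }
    }
    // The input ended without the terminating sentinel.
    (chunks, false)
}
```

Returning the flag alongside the chunks lets the caller distinguish a cleanly terminated stream from one that was cut off.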
§stop: Option<Box<Stop>>
§random_seed: Option<Option<i32>>
§metadata: Option<Option<HashMap<String, Value>>>
§messages: Vec<MessagesInner>
The prompt(s) to generate completions for, encoded as a list of messages, each with a role and content.
§response_format: Option<Box<ResponseFormat>>
§tools: Option<Option<Vec<Tool>>>
§tool_choice: Option<Box<ToolChoice>>
§presence_penalty: Option<f64>
The presence_penalty determines how much the model penalizes the repetition of words or phrases. A higher presence penalty encourages the model to use a wider variety of words and phrases, making the output more diverse and creative.
§frequency_penalty: Option<f64>
The frequency_penalty penalizes the repetition of words based on their frequency in the generated text. A higher frequency penalty discourages the model from repeating words that have already appeared frequently in the output, promoting diversity and reducing repetition.
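To make the difference between the two penalties concrete, here is one common formulation: a flat cost once a token has appeared (presence) plus a cost proportional to how often it has appeared (frequency). This is an assumption for illustration only; the source does not state the exact formula the API applies, and `apply_penalties` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Applies presence and frequency penalties to raw logits in place.
/// `counts` maps a token id to how often it has already been generated.
/// Assumed formula (not confirmed by the API docs):
///   logit -= presence_penalty + frequency_penalty * count, for count > 0.
fn apply_penalties(
    logits: &mut [f64],
    counts: &HashMap<usize, u32>,
    presence_penalty: f64,
    frequency_penalty: f64,
) {
    for (&token, &count) in counts {
        // Skip tokens that have not appeared or are out of range.
        if count == 0 || token >= logits.len() {
            continue;
        }
        // Presence: flat cost once a token has appeared at all.
        // Frequency: cost that grows with each repetition.
        logits[token] -= presence_penalty + frequency_penalty * f64::from(count);
    }
}
```

Under this formulation, a token seen three times is penalized more by the frequency term than a token seen once, while the presence term treats both the same.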
§n: Option<Option<i32>>
§prediction: Option<Box<Prediction>>
Lets users specify expected results, optimizing response times by leveraging known or predictable content. This approach is especially effective for updating text documents or code files with minimal changes, reducing latency while maintaining high-quality results.
§parallel_tool_calls: Option<bool>
Whether to enable parallel function calling during tool use. When enabled, the model can call multiple tools in parallel.
§prompt_mode: Option<Option<MistralPromptMode>>
§guardrails: Option<Option<Vec<GuardrailConfig>>>
§safe_prompt: Option<bool>
Whether to inject a safety prompt before all conversations.
Implementations§
impl ChatCompletionRequest
pub fn new(model: String, messages: Vec<MessagesInner>) -> ChatCompletionRequest
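The constructor takes only the two required fields, so the optional ones are presumably set afterwards. The sketch below illustrates that pattern with stand-in types: `Message` and `Request` are hypothetical and are not the crate's definitions, and it is an assumption that `new` leaves every optional field as `None` and that the doubly optional fields distinguish an omitted field from an explicit null, as generated OpenAPI clients commonly do.

```rust
// Stand-in types: NOT the crate's definitions, only enough to show the pattern.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

#[derive(Clone, Debug, Default, PartialEq)]
struct Request {
    model: String,
    messages: Vec<Message>,
    temperature: Option<Option<f64>>,
    top_p: Option<f64>,
    stream: Option<bool>,
}

impl Request {
    /// Assumed behavior: required fields are set, optional fields start as `None`.
    fn new(model: String, messages: Vec<Message>) -> Request {
        Request { model, messages, ..Default::default() }
    }
}

/// Names the three states a doubly optional field can take.
fn describe(field: Option<Option<f64>>) -> &'static str {
    match field {
        None => "omitted",
        Some(None) => "explicit null",
        Some(Some(_)) => "value",
    }
}

/// Builds a request, then fills in optional fields after construction.
fn build_example() -> Request {
    let messages = vec![Message {
        role: "user".to_string(),
        content: "Hello".to_string(),
    }];
    // "some-model" is a placeholder, not a real model ID.
    let mut request = Request::new("some-model".to_string(), messages);
    request.temperature = Some(Some(0.2));
    request.stream = Some(false);
    request
}
```

Against the real type, the same shape would apply with `ChatCompletionRequest::new(model, messages)` followed by assignments to the public fields.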
Trait Implementations§
impl Clone for ChatCompletionRequest
fn clone(&self) -> ChatCompletionRequest
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.