pub struct RequestBody {
    pub extra_body: Option<ExtraBody>,
    pub extra_body_map: Option<Map<String, Value>>,
    pub frequency_penalty: Option<f32>,
    pub logprobs: Option<bool>,
    pub max_completion_tokens: Option<u32>,
    pub max_tokens: Option<u32>,
    pub messages: Vec<Message>,
    pub metadata: Option<HashMap<String, String>>,
    pub modalities: Option<Vec<Modality>>,
    pub model: String,
    pub n: Option<u32>,
    pub parallel_tool_calls: Option<bool>,
    pub prediction: Option<ChatCompletionPredictionContentParam>,
    pub presence_penalty: Option<f32>,
    pub prompt_cache_key: Option<String>,
    pub reasoning_effort: Option<String>,
    pub response_format: Option<ResponseFormat>,
    pub safety_identifier: Option<String>,
    pub seed: Option<i64>,
    pub service_tier: Option<ServiceTier>,
    pub stop: Option<StopKeywords>,
    pub store: Option<bool>,
    pub stream: bool,
    pub stream_options: Option<StreamOptions>,
    pub temperature: Option<f32>,
    pub tool_choice: Option<ToolChoice>,
    pub tools: Option<Vec<RequestTool>>,
    pub top_logprobs: Option<u32>,
    pub top_p: Option<f32>,
    pub user: Option<String>,
    pub verbosity: Option<LowMediumHighEnum>,
    pub web_search: Option<WebSearchOptions>,
}
Creates a model response for the given chat conversation.
§Example
use std::sync::LazyLock;

use futures_util::StreamExt;
use openai_interface::chat::create::request::{Message, RequestBody};
use openai_interface::rest::post::PostStream;

// `static` rather than `const`: a `const` LazyLock would be re-created
// (and re-initialized) at every use site.
static DEEPSEEK_API_KEY: LazyLock<&str> =
    LazyLock::new(|| include_str!("../../../keys/deepseek_domestic_key").trim());
const DEEPSEEK_CHAT_URL: &str = "https://api.deepseek.com/v1";
const DEEPSEEK_MODEL: &str = "deepseek-chat";

#[tokio::main]
async fn main() {
    // Only the required fields are set; every other field keeps its default.
    let request = RequestBody {
        messages: vec![
            Message::System {
                content: "This is a request of test purpose. Reply briefly".to_string(),
                name: None,
            },
            Message::User {
                content: "What's your name?".to_string(),
                name: None,
            },
        ],
        model: DEEPSEEK_MODEL.to_string(),
        stream: true,
        ..Default::default()
    };

    // Send the request and receive the reply as a stream of string chunks.
    let mut response = request
        .get_stream_response_string(DEEPSEEK_CHAT_URL, *DEEPSEEK_API_KEY)
        .await
        .unwrap();

    // Print each chunk as soon as it arrives.
    while let Some(chunk) = response.next().await {
        println!("{}", chunk.unwrap());
    }
}
Fields§
extra_body: Option<ExtraBody>
Additional request fields that are not part of the standard OpenAI API.
extra_body_map: Option<Map<String, Value>>
Additional request fields that are neither part of the standard OpenAI API nor covered by the ExtraBody struct.
frequency_penalty: Option<f32>
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
logprobs: Option<bool>
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
max_completion_tokens: Option<u32>
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
max_tokens: Option<u32>
The maximum number of tokens that can be generated in the chat completion. According to OpenAI’s Python SDK, this field is deprecated in favour of max_completion_tokens.
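As a rough illustration of the deprecation note, the sketch below moves a legacy max_tokens value over to max_completion_tokens. TokenLimits and prefer_max_completion_tokens are hypothetical stand-ins written for this example; they are not part of the crate, and the real RequestBody has many more fields.

```rust
/// Hypothetical stand-in for the two token-limit fields of `RequestBody`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct TokenLimits {
    pub max_completion_tokens: Option<u32>,
    pub max_tokens: Option<u32>,
}

/// Hypothetical helper: carries a legacy `max_tokens` value over to
/// `max_completion_tokens` unless the newer field is already set, then
/// clears the deprecated field so only one limit is sent.
pub fn prefer_max_completion_tokens(mut limits: TokenLimits) -> TokenLimits {
    if limits.max_completion_tokens.is_none() {
        limits.max_completion_tokens = limits.max_tokens;
    }
    limits.max_tokens = None;
    limits
}
```

Some OpenAI-compatible servers may still expect max_tokens, so a migration like this is worth checking against the provider you actually call.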
messages: Vec<Message>
A list of messages comprising the conversation so far.
metadata: Option<HashMap<String, String>>
Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
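The limits above can be checked locally before a request is sent. The following is a minimal sketch: validate_metadata is a hypothetical helper, not a crate function, and it assumes that "characters" means Unicode scalar values, which the API may count differently.

```rust
use std::collections::HashMap;

// Limits quoted from the field description.
const MAX_PAIRS: usize = 16;
const MAX_KEY_CHARS: usize = 64;
const MAX_VALUE_CHARS: usize = 512;

/// Hypothetical helper: checks a metadata map against the documented limits.
/// Assumption: lengths are counted in Unicode scalar values.
pub fn validate_metadata(metadata: &HashMap<String, String>) -> Result<(), String> {
    if metadata.len() > MAX_PAIRS {
        return Err(format!(
            "too many pairs: {} (maximum {})",
            metadata.len(),
            MAX_PAIRS
        ));
    }
    for (key, value) in metadata {
        if key.chars().count() > MAX_KEY_CHARS {
            return Err(format!("key {:?} exceeds {} characters", key, MAX_KEY_CHARS));
        }
        if value.chars().count() > MAX_VALUE_CHARS {
            return Err(format!(
                "value for key {:?} exceeds {} characters",
                key, MAX_VALUE_CHARS
            ));
        }
    }
    Ok(())
}
```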
modalities: Option<Vec<Modality>>
Output types that you would like the model to generate. Most models are capable of generating text, which is the default:
["text"]
The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:
["text", "audio"]
model: String
Name of the model to use to generate the response.
n: Option<u32>
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
parallel_tool_calls: Option<bool>
Whether to enable parallel function calling during tool use.
prediction: Option<ChatCompletionPredictionContentParam>
Static predicted output content, such as the content of a text file that is being regenerated.
presence_penalty: Option<f32>
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
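frequency_penalty and presence_penalty share the same documented range, so one check can cover both. The sketch below uses a hypothetical check_penalty helper (not part of the crate) and assumes the bounds -2.0 and 2.0 are inclusive, which the description does not state explicitly.

```rust
/// Hypothetical helper: accepts `None` or a value in [-2.0, 2.0].
/// Assumption: both bounds are inclusive. NaN is rejected because it
/// compares as outside every range.
pub fn check_penalty(name: &str, value: Option<f32>) -> Result<(), String> {
    match value {
        None => Ok(()),
        Some(v) if (-2.0..=2.0).contains(&v) => Ok(()),
        Some(v) => Err(format!(
            "{} must be between -2.0 and 2.0, got {}",
            name, v
        )),
    }
}
```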
prompt_cache_key: Option<String>
Used by OpenAI to cache responses for similar requests to optimize your cache hit rates. Replaces the user field.
reasoning_effort: Option<String>
Constrains effort on reasoning for reasoning models. Currently supported values are minimal, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
response_format: Option<ResponseFormat>
An object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
safety_identifier: Option<String>
A stable identifier used to help detect users of your application that may be violating OpenAI’s usage policies. The IDs should be a string that uniquely identifies each user. It is recommended to hash their username or email address, in order to avoid sending any identifying information.
seed: Option<i64>
If specified, the system will make a best effort to sample deterministically. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
service_tier: Option<ServiceTier>
Specifies the processing type used for serving the request.
- If set to ‘auto’, then the request will be processed with the service tier configured in the Project settings. Unless otherwise configured, the Project will use ‘default’.
- If set to ‘default’, then the request will be processed with the standard pricing and performance for the selected model.
- If set to ‘flex’ or ‘priority’, then the request will be processed with the corresponding service tier.
- When not set, the default behavior is ‘auto’.
When the service_tier parameter is set, the response body will include the service_tier value based on the processing mode actually used to serve the request. This response value may be different from the value set in the parameter.
stop: Option<StopKeywords>
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
store: Option<bool>
Whether or not to store the output of this chat completion request for use in OpenAI’s model distillation or evals products.
Supports text and image inputs. Note: image inputs over 8MB will be dropped.
stream: bool
Whether the response is streamed. Although the field is optional in the API, set it explicitly so that the response arrives in the form you expect.
stream_options: Option<StreamOptions>
Options for the streaming response. Only set this when you set stream: true.
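One way to respect the "only when stream: true" rule is to drop the options whenever streaming is off. This is a minimal sketch: normalize_stream_options is a hypothetical helper, generic over the options type so that it does not assume anything about the fields of the crate's StreamOptions.

```rust
/// Hypothetical helper: keeps `stream_options` only when `stream` is true,
/// so a non-streaming request never carries streaming options.
pub fn normalize_stream_options<T>(stream: bool, stream_options: Option<T>) -> Option<T> {
    if stream {
        stream_options
    } else {
        None
    }
}
```

Whether to drop the options silently, as here, or to return an error instead is a design choice; dropping is the more forgiving of the two.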
temperature: Option<f32>
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. It is generally recommended to alter this or top_p but not both.
tool_choice: Option<ToolChoice>
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
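The four forms described above map onto two JSON shapes: a bare string for none, auto, and required, and an object for a named function. The sketch below models this with a hypothetical ToolChoiceSketch enum. It is not the crate's ToolChoice type, whose variants may differ, and it builds the JSON by hand only to stay dependency-free. It also assumes the function name needs no JSON escaping.

```rust
/// Hypothetical stand-in for the documented `tool_choice` forms.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoiceSketch {
    None,
    Auto,
    Required,
    /// Forces a call to the named function.
    Function(String),
}

impl ToolChoiceSketch {
    /// Renders the value as it would appear in the request JSON.
    /// Assumption: the function name contains no characters that
    /// require JSON escaping (for example quotes or backslashes).
    pub fn to_json(&self) -> String {
        match self {
            Self::None => "\"none\"".to_string(),
            Self::Auto => "\"auto\"".to_string(),
            Self::Required => "\"required\"".to_string(),
            Self::Function(name) => format!(
                r#"{{"type":"function","function":{{"name":"{}"}}}}"#,
                name
            ),
        }
    }
}
```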
tools: Option<Vec<RequestTool>>
A list of tools the model may call.
top_logprobs: Option<u32>
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
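The dependency between top_logprobs and logprobs can be sketched as a small pre-flight check. check_top_logprobs is a hypothetical helper, not a crate function; it takes the two field values directly rather than a whole RequestBody.

```rust
/// Hypothetical helper: enforces the documented rules for `top_logprobs`:
/// the value must be at most 20 (a `u32` is never negative), and
/// `logprobs` must be `Some(true)` whenever `top_logprobs` is set.
pub fn check_top_logprobs(
    logprobs: Option<bool>,
    top_logprobs: Option<u32>,
) -> Result<(), String> {
    match top_logprobs {
        None => Ok(()),
        Some(n) if n > 20 => Err(format!(
            "top_logprobs must be between 0 and 20, got {}",
            n
        )),
        Some(_) if logprobs != Some(true) => {
            Err("top_logprobs requires logprobs to be set to true".to_string())
        }
        Some(_) => Ok(()),
    }
}
```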
top_p: Option<f32>
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
It is generally recommended to alter this or temperature but not both.
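Because setting both temperature and top_p is discouraged rather than forbidden, a lint-style check that returns warnings may fit better than a hard error. The sketch below uses a hypothetical sampling_warnings helper (not part of the crate) and assumes the temperature bounds 0 and 2 are inclusive.

```rust
/// Hypothetical helper: returns human-readable warnings instead of failing.
/// Assumption: the documented temperature range 0 to 2 is inclusive.
pub fn sampling_warnings(temperature: Option<f32>, top_p: Option<f32>) -> Vec<String> {
    let mut warnings = Vec::new();
    if let Some(t) = temperature {
        if !(0.0..=2.0).contains(&t) {
            warnings.push(format!(
                "temperature {} is outside the documented range 0 to 2",
                t
            ));
        }
    }
    if temperature.is_some() && top_p.is_some() {
        warnings.push(
            "both temperature and top_p are set; altering only one is recommended"
                .to_string(),
        );
    }
    warnings
}
```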
user: Option<String>
This field is being replaced by safety_identifier and prompt_cache_key. Use prompt_cache_key instead to maintain caching optimizations. A stable identifier for your end-users. Used to boost cache hit rates by better bucketing similar requests and to help OpenAI detect and prevent abuse.
verbosity: Option<LowMediumHighEnum>
Constrains the verbosity of the model’s response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are low, medium, and high.
web_search: Option<WebSearchOptions>
This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
Trait Implementations§
impl Clone for RequestBody
fn clone(&self) -> RequestBody
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.