pub struct RequestConfig {
pub temperature: Option<f64>,
pub max_tokens: Option<u32>,
pub top_p: Option<f64>,
pub top_k: Option<u32>,
pub min_p: Option<f64>,
pub presence_penalty: Option<f64>,
pub response_format: Option<ResponseFormat>,
pub tools: Vec<Tool>,
pub tool_choice: Option<ToolChoice>,
pub user_id: Option<String>,
pub session_id: Option<String>,
pub llm_path: Option<String>,
}
Configuration for a single LLM request.
Overrides the provider’s default settings on a per-request basis. All fields are optional; unset fields fall back to the provider’s defaults.
§Basic Usage
use multi_llm::RequestConfig;
let config = RequestConfig {
temperature: Some(0.7),
max_tokens: Some(1000),
..Default::default()
};§With Tools
use multi_llm::{RequestConfig, Tool, ToolChoice};
let weather_tool = Tool {
name: "get_weather".to_string(),
description: "Get weather for a city".to_string(),
parameters: serde_json::json!({"type": "object", "properties": {}}),
};
let config = RequestConfig {
tools: vec![weather_tool],
tool_choice: Some(ToolChoice::Auto),
..Default::default()
};

§Sampling Parameters
| Parameter | Range | Effect |
|---|---|---|
| temperature | 0.0 to 2.0 | Randomness (0 = deterministic, 2 = very random) |
| top_p | 0.0 to 1.0 | Nucleus sampling threshold |
| top_k | 1+ | Limit vocabulary to the top K tokens |
| presence_penalty | -2.0 to 2.0 | Discourage repetition |
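To make the table concrete, here is a minimal, self-contained sketch of how temperature, top_k, top_p, and min_p typically interact when filtering a token distribution. The function names and logic are illustrative assumptions, not this crate's implementation; real providers apply these parameters server-side.

```rust
/// Apply temperature to raw logits, then softmax into probabilities.
fn softmax_with_temperature(logits: &[f64], temperature: f64) -> Vec<f64> {
    let t = temperature.max(1e-6); // guard against division by zero
    let scaled: Vec<f64> = logits.iter().map(|l| l / t).collect();
    let max = scaled.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scaled.iter().map(|l| (l - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Indices of tokens that survive top-k, top-p, and min-p filtering,
/// ordered from most to least likely.
fn filter_tokens(probs: &[f64], top_k: usize, top_p: f64, min_p: f64) -> Vec<usize> {
    // Sort token indices by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    let top_prob = probs[order[0]];
    let mut kept = Vec::new();
    let mut cumulative = 0.0;
    for &i in order.iter().take(top_k) {            // top-k: at most K tokens
        if probs[i] < min_p * top_prob { break; }   // min-p: cutoff relative to top token
        kept.push(i);
        cumulative += probs[i];
        if cumulative >= top_p { break; }           // top-p: nucleus cutoff
    }
    kept
}

fn main() {
    let logits = [2.0, 1.0, 0.2, -1.0];
    let probs = softmax_with_temperature(&logits, 0.7);
    // Temperature below 1.0 sharpens the distribution toward the top token.
    assert!(probs[0] > 0.5);
    let kept = filter_tokens(&probs, 3, 0.95, 0.05);
    // The most likely token always survives filtering.
    assert_eq!(kept[0], 0);
    println!("kept {} of {} tokens", kept.len(), logits.len());
}
```

Note how the filters compose: lowering temperature, lowering top_p, or lowering top_k each shrinks the candidate set, which is why combining aggressive values for all of them can make output nearly deterministic.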
§Fields
temperature: Option<f64>

Temperature for response randomness.

- 0.0: Deterministic (always pick the most likely token)
- 0.7: Balanced (a good default for most tasks)
- 1.0+: More creative/random

Range: 0.0 to 2.0 (provider-dependent)
max_tokens: Option<u32>

Maximum tokens to generate in the response.
Limits response length. The actual response may be shorter if the model completes its thought naturally.
top_p: Option<f64>

Top-p (nucleus) sampling parameter.

Sample only from the smallest set of tokens whose cumulative probability exceeds this threshold. Lower values = more focused, higher values = more diverse. Range: 0.0 to 1.0 (typically 0.9 to 0.95)
top_k: Option<u32>

Top-k sampling parameter.
Only consider the top K most likely tokens at each step. Lower values = more focused. Not all providers support this.
min_p: Option<f64>

Min-p sampling parameter.
Filter tokens below this probability relative to the top token. Range: 0.0 to 1.0. Not all providers support this.
presence_penalty: Option<f64>

Presence penalty to discourage repetition.
Positive values reduce likelihood of repeating tokens that have appeared. Range: -2.0 to 2.0 (typically 0.0 to 1.0)
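As a rough sketch of the mechanism (illustrative names and logic, not this crate's or any provider's actual implementation): a positive presence penalty subtracts a flat amount from the logit of every token that has already appeared in the output, regardless of how many times it appeared.

```rust
use std::collections::HashSet;

/// Subtract `penalty` from the logit of each token already seen in the output.
/// The penalty is flat: it applies once per distinct token, not per occurrence.
fn apply_presence_penalty(logits: &mut [f64], seen: &HashSet<usize>, penalty: f64) {
    for (token, logit) in logits.iter_mut().enumerate() {
        if seen.contains(&token) {
            *logit -= penalty;
        }
    }
}

fn main() {
    let mut logits = vec![3.0, 2.0, 1.0];
    // Token 0 has already been generated in this response.
    let seen: HashSet<usize> = [0].into_iter().collect();
    apply_presence_penalty(&mut logits, &seen, 1.5);
    // Token 0's logit dropped from 3.0 to 1.5, so token 1 is now the favorite.
    assert_eq!(logits, vec![1.5, 2.0, 1.0]);
}
```

A negative penalty inverts the effect, boosting tokens that have already appeared, which is why the range extends down to -2.0.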
response_format: Option<ResponseFormat>

Response format for structured output.
When set, the model attempts to return JSON matching the schema.
Use with LlmProvider::execute_structured_llm() for best results.
tools: Vec<Tool>

Tools available for this request.
Define functions the LLM can call. See Tool for details.
tool_choice: Option<ToolChoice>

Strategy for tool selection.
Controls whether tools are optional, required, or disabled.
See ToolChoice for options.
user_id: Option<String>

User ID for analytics and cache analysis.
Helps track cache hit rates per user and debug user-specific issues.
session_id: Option<String>

Session ID for session-level analytics.
Track cache performance and behavior within a conversation session.
llm_path: Option<String>

LLM path context for distinguishing call types.
Useful when your application has multiple LLM call paths (e.g., “chat”, “analysis”, “summarization”).
§Trait Implementations

impl Clone for RequestConfig

fn clone(&self) -> RequestConfig

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.