Struct endpoints::rag::RagChatCompletionRequestBuilder
pub struct RagChatCompletionRequestBuilder { /* private fields */ }
Request builder for creating a new RAG chat completion request.
Implementations§
impl RagChatCompletionRequestBuilder
pub fn new(
    messages: Vec<ChatCompletionRequestMessage>,
    qdrant_url: impl Into<String>,
    qdrant_collection_name: impl Into<String>,
    limit: u64,
) -> Self
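Creates a new builder from the conversation messages and the Qdrant retrieval settings.
§Arguments
- messages - A list of messages comprising the conversation so far.
- qdrant_url - The URL of the Qdrant server.
- qdrant_collection_name - The name of the Qdrant collection to query.
- limit - The maximum number of results to retrieve.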
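The builder methods on this page take self by value and return Self, so calls can be chained. The following is a minimal sketch of that pattern using simplified stand-ins: SketchBuilder, Message, their fields, the default for stream, and the URL and collection name are all hypothetical and are not the crate's definitions. Only the parameter list of new mirrors the signature above.

```rust
// Hypothetical stand-in for a chat message; not the crate's
// `ChatCompletionRequestMessage`.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

// Hypothetical stand-in for the builder; the real struct has private fields.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchBuilder {
    pub messages: Vec<Message>,
    pub qdrant_url: String,
    pub qdrant_collection_name: String,
    pub limit: u64,
    pub stream: bool,
}

impl SketchBuilder {
    // Same parameter list as `new` above. `impl Into<String>` lets callers
    // pass either a `&str` or a `String`.
    pub fn new(
        messages: Vec<Message>,
        qdrant_url: impl Into<String>,
        qdrant_collection_name: impl Into<String>,
        limit: u64,
    ) -> Self {
        Self {
            messages,
            qdrant_url: qdrant_url.into(),
            qdrant_collection_name: qdrant_collection_name.into(),
            limit,
            stream: false, // assumed default, for illustration only
        }
    }

    // Consumes the builder and returns it, which is what enables chaining.
    pub fn with_stream(mut self, flag: bool) -> Self {
        self.stream = flag;
        self
    }
}

// Builds a request description with placeholder values.
pub fn example() -> SketchBuilder {
    let messages = vec![Message {
        role: "user".to_string(),
        content: "What is RAG?".to_string(),
    }];
    SketchBuilder::new(messages, "http://localhost:6333", String::from("docs"), 3)
        .with_stream(true)
}
```

Because each method consumes self, a builder that is configured step by step has to be rebound after every call (let b = b.with_stream(true);) rather than mutated in place.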
pub fn with_sampling(self, sampling: ChatCompletionRequestSampling) -> Self
pub fn with_n_choices(self, n: u64) -> Self
Sets the number of chat completion choices to generate for each input message.
§Arguments
n - How many chat completion choices to generate for each input message. If n is less than 1, it is set to 1.
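The rule for n can be sketched as a small function. normalize_n_choices is a hypothetical helper written for illustration; it is not part of the crate, and the builder may implement the rule differently.

```rust
// Hypothetical helper illustrating the documented rule for `n`.
pub fn normalize_n_choices(n: u64) -> u64 {
    // `n` is unsigned, so "less than 1" can only mean 0.
    if n < 1 {
        1
    } else {
        n
    }
}
```

Under this rule, passing 0 behaves the same as passing 1, and every other value is kept unchanged.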
pub fn with_stream(self, flag: bool) -> Self
pub fn with_stop(self, stop: Vec<String>) -> Self
pub fn with_max_tokens(self, max_tokens: u64) -> Self
Sets the maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.
§Argument
max_tokens - The maximum number of tokens to generate in the chat completion. If max_tokens is less than 1, it is set to 16.
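The two constraints described for max_tokens can be sketched as follows. Both functions and the constant are hypothetical helpers for illustration and are not part of the crate. In particular, the documentation does not say whether the builder itself checks the context length, so fits_in_context only restates the limit as a predicate.

```rust
// Fallback value named in the documentation above.
pub const FALLBACK_MAX_TOKENS: u64 = 16;

// Hypothetical helper: applies the documented fallback for `max_tokens`.
pub fn normalize_max_tokens(max_tokens: u64) -> u64 {
    // `max_tokens` is unsigned, so "less than 1" can only mean 0.
    if max_tokens < 1 {
        FALLBACK_MAX_TOKENS
    } else {
        max_tokens
    }
}

// Hypothetical helper: true when the input tokens plus the generated tokens
// stay within the model's context length.
pub fn fits_in_context(prompt_tokens: u64, max_tokens: u64, context_length: u64) -> bool {
    // `checked_add` avoids overflow for very large inputs.
    match prompt_tokens.checked_add(max_tokens) {
        Some(total) => total <= context_length,
        None => false,
    }
}
```

For example, a 100-token prompt with max_tokens of 16 fits in a 128-token context, while a 120-token prompt with the same setting does not.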
pub fn with_presence_penalty(self, penalty: f64) -> Self
Sets the presence penalty. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
pub fn with_frequency_penalty(self, penalty: f64) -> Self
Sets the frequency penalty. Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
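Both penalties are described as numbers between -2.0 and 2.0. The documentation does not say whether the builder validates or clamps the value, so a caller may want to check the range before passing it in. The sketch below uses a hypothetical helper, is_valid_penalty, and assumes the endpoints are inclusive.

```rust
// Hypothetical helper: checks that a penalty lies in the documented range.
// Treating -2.0 and 2.0 as valid is an assumption.
pub fn is_valid_penalty(penalty: f64) -> bool {
    // NaN compares false against both bounds, so it is rejected.
    (-2.0..=2.0).contains(&penalty)
}
```

The same check applies to the arguments of with_presence_penalty and with_frequency_penalty, since both document the same range.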