Struct endpoints::rag::RagChatCompletionRequestBuilder

pub struct RagChatCompletionRequestBuilder { /* private fields */ }

Request builder for creating a new RAG chat completion request.

Implementations

impl RagChatCompletionRequestBuilder

pub fn new( messages: Vec<ChatCompletionRequestMessage>, qdrant_url: impl Into<String>, qdrant_collection_name: impl Into<String>, limit: u64, ) -> Self

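Creates a new builder from the conversation messages and the Qdrant retrieval settings.

§Arguments

messages - A list of messages comprising the conversation so far.
qdrant_url - The URL of the Qdrant server to query.
qdrant_collection_name - The name of the Qdrant collection to search.
limit - The maximum number of results to retrieve from the collection.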
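
The `impl Into<String>` parameters in the signature of `new` mean that either a `&str` or an owned `String` can be passed for the Qdrant settings. The sketch below illustrates this with a hypothetical stand-in type, `SketchRagBuilder`, which is not the crate's real builder; it uses `Vec<String>` in place of `Vec<ChatCompletionRequestMessage>`, and the URL and collection name are placeholder values.

```rust
// Hypothetical stand-in that mirrors only the shape of `new`.
// It is NOT the real RagChatCompletionRequestBuilder.
struct SketchRagBuilder {
    messages: Vec<String>, // assumption: plain strings instead of ChatCompletionRequestMessage
    qdrant_url: String,
    qdrant_collection_name: String,
    limit: u64,
}

impl SketchRagBuilder {
    fn new(
        messages: Vec<String>,
        qdrant_url: impl Into<String>,
        qdrant_collection_name: impl Into<String>,
        limit: u64,
    ) -> Self {
        Self {
            messages,
            // `.into()` converts either a &str or a String into an owned String.
            qdrant_url: qdrant_url.into(),
            qdrant_collection_name: qdrant_collection_name.into(),
            limit,
        }
    }
}

fn main() {
    let collection = String::from("docs"); // placeholder collection name
    let builder = SketchRagBuilder::new(
        vec![String::from("What is RAG?")],
        "http://localhost:6333", // placeholder URL, passed as &str
        collection,              // passed as an owned String
        3,
    );
    println!(
        "{} message(s), url={}, collection={}, limit={}",
        builder.messages.len(),
        builder.qdrant_url,
        builder.qdrant_collection_name,
        builder.limit
    );
}
```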

pub fn with_sampling(self, sampling: ChatCompletionRequestSampling) -> Self

pub fn with_n_choices(self, n: u64) -> Self

Sets the number of chat completion choices to generate for each input message.

§Arguments

n - How many chat completion choices to generate for each input message. If n is less than 1, it is set to 1.
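
The lower bound on `n` can be pictured as a simple clamp. The helper below, `clamp_n_choices`, is a hypothetical name used only for illustration; it shows the documented rule, not the crate's actual implementation.

```rust
/// Hypothetical helper mirroring the documented rule:
/// values below 1 are replaced with 1, all others pass through.
fn clamp_n_choices(n: u64) -> u64 {
    if n < 1 { 1 } else { n }
}

fn main() {
    for n in [0u64, 1, 4] {
        println!("requested n = {}, effective n = {}", n, clamp_n_choices(n));
    }
}
```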

pub fn with_stream(self, flag: bool) -> Self

pub fn with_stop(self, stop: Vec<String>) -> Self

pub fn with_max_tokens(self, max_tokens: u64) -> Self

Sets the maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.

§Argument

max_tokens - The maximum number of tokens to generate in the chat completion. If max_tokens is less than 1, it is set to 16.
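
Because each setter takes `self` and returns `Self`, calls can be chained. The sketch below shows that consuming-builder style together with the documented fallback of 16 for `max_tokens`. `SketchBuilder`, `SketchRequest`, the initial values in `new`, and the `build` method are all assumptions made for illustration and are not taken from the crate.

```rust
// Hypothetical request type holding only the two fields used in this sketch.
#[derive(Debug)]
struct SketchRequest {
    max_tokens: u64,
    stream: bool,
}

// Hypothetical builder; NOT the real RagChatCompletionRequestBuilder.
struct SketchBuilder {
    req: SketchRequest,
}

impl SketchBuilder {
    fn new() -> Self {
        // Assumed starting values for this sketch only.
        Self { req: SketchRequest { max_tokens: 16, stream: false } }
    }

    // Consumes the builder and returns it, which is what allows chaining.
    fn with_max_tokens(mut self, max_tokens: u64) -> Self {
        // Documented rule: values below 1 fall back to 16.
        self.req.max_tokens = if max_tokens < 1 { 16 } else { max_tokens };
        self
    }

    fn with_stream(mut self, flag: bool) -> Self {
        self.req.stream = flag;
        self
    }

    // Assumed finishing method that hands back the assembled request.
    fn build(self) -> SketchRequest {
        self.req
    }
}

fn main() {
    let request = SketchBuilder::new()
        .with_max_tokens(0) // below 1, so the fallback applies
        .with_stream(true)
        .build();
    println!("{:?}", request);
}
```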

pub fn with_presence_penalty(self, penalty: f64) -> Self

Sets the presence penalty. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.

pub fn with_frequency_penalty(self, penalty: f64) -> Self

Sets the frequency penalty. Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
