pub trait Llm<T: Config>:
    Default
    + Sized
    + PartialEq
    + Eq
    + Clone
    + Debug
    + Copy
    + Send
    + Sync
{
    type Tokens: Copy + ToString + FromStr + Display + Debug + Serialize + DeserializeOwned + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send;
    type Request: Clone + From<ContextMessage<T>> + Display + Send;
    type Response: Clone + Into<Option<String>> + Send;
    type Parameters: Debug + Clone + Send + Sync;

    const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _;

    // Required methods
    fn max_context_length(&self) -> Self::Tokens;
    fn name(&self) -> &'static str;
    fn alias(&self) -> &'static str;
    fn count_tokens(content: &str) -> Result<Self::Tokens>;
    fn prompt<'life0, 'life1, 'async_trait>(
        &'life0 self,
        is_summarizing: bool,
        prompt_tokens: Self::Tokens,
        msgs: Vec<Self::Request>,
        params: &'life1 Self::Parameters,
        max_tokens: Self::Tokens,
    ) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait,
        'life1: 'async_trait;
    fn compute_cost(
        &self,
        prompt_tokens: Self::Tokens,
        response_tokens: Self::Tokens,
    ) -> f64;

    // Provided methods
    fn get_max_prompt_token_limit(&self) -> Self::Tokens { ... }
    fn get_max_completion_token_limit(&self) -> Option<Self::Tokens> { ... }
    fn ctx_msgs_to_prompt_requests(
        &self,
        msgs: &[ContextMessage<T>],
    ) -> Vec<Self::Request> { ... }
    fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens { ... }
}
Provided Associated Constants
const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _
Token to word ratio.
Defaults to 75%.
Required Associated Types
type Tokens: Copy + ToString + FromStr + Display + Debug + Serialize + DeserializeOwned + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send
Tokens are an LLM concept which represents pieces of words. For example, each ChatGPT token represents roughly 75% of a word.
This type is used primarily for tracking the number of tokens in a TapestryFragment and for counting the number of tokens in a string.
This type is configurable to allow for different types of tokens to be used. For example, u16 can be used to represent the number of tokens in a string.
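For example, a sketch of the kind of token bookkeeping this type enables, assuming a hypothetical implementor that picks u16 as its Tokens type (the values are illustrative):

// Hypothetical: an implementor chose `type Tokens = u16;`.
let max_context_length: u16 = 4_096;
let used_tokens: u16 = 3_900;
// Saturating arithmetic (required by the trait bounds) avoids underflow
// once the token budget is exhausted.
let remaining = max_context_length.saturating_sub(used_tokens);
assert_eq!(remaining, 196);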
Required Methods
fn max_context_length(&self) -> Self::Tokens
The maximum number of tokens that can be processed at once by an LLM model.
fn name(&self) -> &'static str
Get the model name.
This is used for logging purposes, but can also be used to fetch a specific model based on &self. For example, the model passed to [Loom::weave] can be represented as an enum with a multitude of variants, each representing a different model.
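For illustration, a hypothetical model enum of that shape (the variants and the returned model names are assumptions, not part of this crate):

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum OpenAiModel {
    #[default]
    Gpt4,
    Gpt35Turbo,
}

impl OpenAiModel {
    // Mirrors the role of `Llm::name` for this hypothetical enum.
    fn name(&self) -> &'static str {
        match self {
            OpenAiModel::Gpt4 => "gpt-4",
            OpenAiModel::Gpt35Turbo => "gpt-3.5-turbo",
        }
    }
}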
fn alias(&self) -> &'static str
Alias for the model.
Can be used for any unforeseen use cases where the model name is not sufficient.
fn count_tokens(content: &str) -> Result<Self::Tokens>
Calculates the number of tokens in a string.
This may vary depending on the type of tokens used by the LLM. In the case of ChatGPT, this can be calculated using the tiktoken-rs crate.
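A minimal sketch of such a count using tiktoken-rs, assuming the cl100k_base encoding and a u16 token type; the free-standing function and the anyhow error handling are illustrative, not the crate's actual Result alias:

use tiktoken_rs::cl100k_base;

// Hypothetical stand-alone version of `count_tokens` for illustration.
fn count_tokens(content: &str) -> anyhow::Result<u16> {
    let bpe = cl100k_base()?;
    // Number of BPE tokens the encoder produces for this string.
    let token_count = bpe.encode_with_special_tokens(content).len();
    Ok(u16::try_from(token_count)?)
}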
fn prompt<'life0, 'life1, 'async_trait>(
    &'life0 self,
    is_summarizing: bool,
    prompt_tokens: Self::Tokens,
    msgs: Vec<Self::Request>,
    params: &'life1 Self::Parameters,
    max_tokens: Self::Tokens,
) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
Prompt LLM with the supplied messages and parameters.
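A hypothetical call site, sketched under the assumption that the requests and parameters are already prepared and that Result below is the crate's own Result alias used throughout this trait:

async fn ask<T: Config, L: Llm<T>>(
    llm: &L,
    msgs: Vec<L::Request>,
    params: &L::Parameters,
) -> Result<L::Response> {
    // Count the tokens in the outgoing requests (Request: Display).
    let joined: String = msgs.iter().map(ToString::to_string).collect();
    let prompt_tokens = L::count_tokens(&joined)?;

    // Prefer the model's completion limit if it defines one, otherwise
    // fall back to the prompt token limit.
    let max_tokens = llm
        .get_max_completion_token_limit()
        .unwrap_or_else(|| llm.get_max_prompt_token_limit());

    llm.prompt(false, prompt_tokens, msgs, params, max_tokens).await
}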
fn compute_cost(
    &self,
    prompt_tokens: Self::Tokens,
    response_tokens: Self::Tokens,
) -> f64
Compute cost of a message based on model.
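A minimal sketch of one way an implementor might price a call; the per-1,000-token prices are illustrative, not real pricing:

fn compute_cost(prompt_tokens: u32, response_tokens: u32) -> f64 {
    // Hypothetical prices in USD per 1,000 tokens.
    const PROMPT_PRICE_PER_1K: f64 = 0.0005;
    const COMPLETION_PRICE_PER_1K: f64 = 0.0015;

    (prompt_tokens as f64 / 1_000.0) * PROMPT_PRICE_PER_1K
        + (response_tokens as f64 / 1_000.0) * COMPLETION_PRICE_PER_1K
}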
Provided Methods
fn get_max_prompt_token_limit(&self) -> Self::Tokens
Calculate the upper bound of tokens allowed for the current Config::PromptModel before a summary is generated.
This is calculated by multiplying the maximum context length (tokens) for the current Config::PromptModel by the Config::TOKEN_THRESHOLD_PERCENTILE and dividing by 100.
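For example, assuming (purely for illustration) a maximum context length of 4,096 tokens and a Config::TOKEN_THRESHOLD_PERCENTILE of 85:

let max_context_length: u32 = 4_096;
let token_threshold_percentile: u32 = 85; // hypothetical value
// 4_096 * 85 / 100 = 3_481 tokens before a summary is triggered.
let max_prompt_tokens = max_context_length * token_threshold_percentile / 100;
assert_eq!(max_prompt_tokens, 3_481);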
fn get_max_completion_token_limit(&self) -> Option<Self::Tokens>
Get optional max completion token limit.
fn ctx_msgs_to_prompt_requests(
    &self,
    msgs: &[ContextMessage<T>],
) -> Vec<Self::Request>
ContextMessages to Llm::Request conversion.
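One plausible shape for this conversion, relying on the Self::Request: From<ContextMessage<T>> bound and assuming ContextMessage<T> is Clone; a sketch, not necessarily the provided default body:

fn ctx_msgs_to_prompt_requests(&self, msgs: &[ContextMessage<T>]) -> Vec<Self::Request> {
    // Each ContextMessage is cloned and converted via the From/Into bound.
    msgs.iter().cloned().map(Into::into).collect()
}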
fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens
Convert tokens to words.
In the case of ChatGPT, each token represents roughly 75% of a word.
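Illustrative arithmetic with the default TOKEN_WORD_RATIO of 75: 1,000 tokens correspond to roughly 750 words.

let tokens: u32 = 1_000;
let token_word_ratio: u32 = 75; // percent, the default TOKEN_WORD_RATIO
let words = tokens * token_word_ratio / 100;
assert_eq!(words, 750);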
Dyn Compatibility
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.