Trait llm_weaver::Llm
pub trait Llm<T: Config>: Default + Sized + PartialEq + Eq + Clone + Debug + Copy + Send + Sync {
type Tokens: Copy + ToRedisArgs + FromStr + Display + Debug + ToString + Serialize + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send;
type Request: Clone + From<ContextMessage<T>> + Display + Send;
type Response: Clone + Into<Option<String>> + Send;
type Parameters: Debug + Clone + Send + Sync;
const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _;
// Required methods
fn max_context_length(&self) -> Self::Tokens;
fn name(&self) -> &'static str;
fn alias(&self) -> &'static str;
fn count_tokens(content: &str) -> Result<Self::Tokens>;
fn prompt<'life0, 'life1, 'async_trait>(
&'life0 self,
is_summarizing: bool,
prompt_tokens: Self::Tokens,
msgs: Vec<Self::Request>,
params: &'life1 Self::Parameters,
max_tokens: Self::Tokens
) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
where Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait;
// Provided methods
fn get_max_prompt_token_limit(&self) -> Self::Tokens { ... }
fn get_max_completion_token_limit(&self) -> Option<Self::Tokens> { ... }
fn ctx_msgs_to_prompt_requests(
&self,
msgs: &[ContextMessage<T>]
) -> Vec<Self::Request> { ... }
fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens { ... }
}
Required Associated Types§
type Tokens: Copy + ToRedisArgs + FromStr + Display + Debug + ToString + Serialize + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send
Tokens are an LLM concept which represents pieces of words. For example, each ChatGPT token represents roughly 75% of a word.
This type is used primarily for tracking the number of tokens in a TapestryFragment and for counting the number of tokens in a string.
This type is configurable to allow different token representations; for example, u16 can be used to represent the number of tokens in a string.
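The arithmetic bounds on Tokens (CheckedAdd, SaturatingAdd, and so on) matter precisely because a narrow type like u16 can overflow. A minimal dependency-free sketch, not tied to any llm_weaver API:

```rust
// Hypothetical sketch: with a narrow token type such as u16, checked
// arithmetic lets callers accumulate token counts without silently wrapping.
fn accumulate(counts: &[u16]) -> Option<u16> {
    // checked_add returns None on overflow instead of wrapping around.
    counts.iter().try_fold(0u16, |acc, &c| acc.checked_add(c))
}

fn main() {
    assert_eq!(accumulate(&[100, 200, 300]), Some(600));
    // u16::MAX is 65_535, so adding 1 more overflows and yields None.
    assert_eq!(accumulate(&[u16::MAX, 1]), None);
    println!("ok");
}
```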
Provided Associated Constants§
const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _
Token-to-word ratio. Defaults to 75%.
Required Methods§
fn max_context_length(&self) -> Self::Tokens
The maximum number of tokens that can be processed at once by an LLM model.
fn name(&self) -> &'static str
Get the model name.
This is used for logging purposes, but can also be used to fetch a specific model based on &self. For example, the model passed to Loom::weave can be represented as an enum with a multitude of variants, each representing a different model.
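The enum-of-models pattern described above can be sketched as follows. The enum and variant names here are illustrative, not part of llm_weaver:

```rust
// Hypothetical sketch: modeling the available LLMs as an enum so that
// `name` can dispatch on the variant. llm_weaver's Llm trait would be
// implemented on this enum; only the name method is shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyModel {
    Small,
    Large,
}

impl MyModel {
    // Returns a static model identifier, e.g. for logging or API routing.
    fn name(&self) -> &'static str {
        match self {
            MyModel::Small => "my-model-small",
            MyModel::Large => "my-model-large",
        }
    }
}

fn main() {
    assert_eq!(MyModel::Large.name(), "my-model-large");
    println!("{}", MyModel::Small.name());
}
```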
fn alias(&self) -> &'static str
Alias for the model.
Can be used for any unforeseen use cases where the model name is not sufficient.
fn count_tokens(content: &str) -> Result<Self::Tokens>
Calculates the number of tokens in a string.
This may vary depending on the type of tokens used by the LLM; in the case of ChatGPT, it can be calculated using the tiktoken-rs crate.
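For ChatGPT an exact count requires a real tokenizer such as tiktoken-rs. As a dependency-free sketch, one can estimate the count from whitespace-separated words and the ~75% token-word ratio mentioned above (each token is roughly three quarters of a word, so tokens ≈ words × 100 / 75); this is an approximation, not llm_weaver's implementation:

```rust
// Hypothetical sketch: a rough, dependency-free stand-in for count_tokens.
// Real ChatGPT counts come from a BPE tokenizer (e.g. the tiktoken-rs crate);
// here we estimate from the word count and the ~75% token-word ratio.
fn approx_count_tokens(content: &str) -> usize {
    let words = content.split_whitespace().count();
    // tokens ≈ words / 0.75, rounded up so we never under-budget.
    (words * 100).div_ceil(75)
}

fn main() {
    // 3 words → 300 / 75 = 4 estimated tokens.
    assert_eq!(approx_count_tokens("hello world foo"), 4);
    println!("ok");
}
```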
fn prompt<'life0, 'life1, 'async_trait>(
    &'life0 self,
    is_summarizing: bool,
    prompt_tokens: Self::Tokens,
    msgs: Vec<Self::Request>,
    params: &'life1 Self::Parameters,
    max_tokens: Self::Tokens
) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
Prompt LLM with the supplied messages and parameters.
Provided Methods§
fn get_max_prompt_token_limit(&self) -> Self::Tokens
Calculate the upper bound of tokens allowed for the current Config::PromptModel before a summary is generated.
This is calculated by multiplying the maximum context length (tokens) for the current Config::PromptModel by Config::TOKEN_THRESHOLD_PERCENTILE and dividing by 100.
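The calculation described above can be sketched in plain integer arithmetic. The concrete numbers below are illustrative, not llm_weaver defaults:

```rust
// Hypothetical sketch of the arithmetic behind get_max_prompt_token_limit:
// the model's maximum context length scaled by a threshold percentile,
// then divided by 100.
fn max_prompt_token_limit(max_context_length: u32, threshold_percentile: u32) -> u32 {
    max_context_length * threshold_percentile / 100
}

fn main() {
    // With a 4096-token context and an 85th-percentile threshold, a summary
    // would be triggered once the prompt exceeds 3481 tokens.
    assert_eq!(max_prompt_token_limit(4096, 85), 3481);
    println!("ok");
}
```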
fn get_max_completion_token_limit(&self) -> Option<Self::Tokens>
Get optional max completion token limit.
fn ctx_msgs_to_prompt_requests(
    &self,
    msgs: &[ContextMessage<T>]
) -> Vec<Self::Request>
Converts ContextMessages to Llm::Requests.
fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens
Convert tokens to words.
In the case of ChatGPT, each token represents roughly 75% of a word.
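The conversion follows directly from TOKEN_WORD_RATIO. A minimal sketch, assuming the default 75% ratio (the function signature below is illustrative, not the trait method itself):

```rust
// Hypothetical sketch of convert_tokens_to_words with TOKEN_WORD_RATIO = 75:
// each token corresponds to roughly 0.75 words, so words ≈ tokens * 75 / 100.
fn convert_tokens_to_words(tokens: u32, token_word_ratio: u32) -> u32 {
    tokens * token_word_ratio / 100
}

fn main() {
    // 1000 tokens at the default 75% ratio ≈ 750 words.
    assert_eq!(convert_tokens_to_words(1000, 75), 750);
    println!("ok");
}
```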