pub trait Llm<T: Config>:
    Default
    + Sized
    + PartialEq
    + Eq
    + Clone
    + Debug
    + Copy
    + Send
    + Sync
{
    type Tokens: Copy + ToString + FromStr + Display + Debug + Serialize + DeserializeOwned + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send;
    type Request: Clone + From<ContextMessage<T>> + Display + Send;
    type Response: Clone + Into<Option<String>> + Send;
    type Parameters: Debug + Clone + Send + Sync;

    const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _;

    // Required methods
    fn max_context_length(&self) -> Self::Tokens;
    fn name(&self) -> &'static str;
    fn alias(&self) -> &'static str;
    fn count_tokens(content: &str) -> Result<Self::Tokens>;
    fn prompt<'life0, 'life1, 'async_trait>(
        &'life0 self,
        is_summarizing: bool,
        prompt_tokens: Self::Tokens,
        msgs: Vec<Self::Request>,
        params: &'life1 Self::Parameters,
        max_tokens: Self::Tokens,
    ) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait,
        'life1: 'async_trait;
    fn compute_cost(
        &self,
        prompt_tokens: Self::Tokens,
        response_tokens: Self::Tokens,
    ) -> f64;

    // Provided methods
    fn get_max_prompt_token_limit(&self) -> Self::Tokens { ... }
    fn get_max_completion_token_limit(&self) -> Option<Self::Tokens> { ... }
    fn ctx_msgs_to_prompt_requests(
        &self,
        msgs: &[ContextMessage<T>],
    ) -> Vec<Self::Request> { ... }
    fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens { ... }
}
Provided Associated Constants
const TOKEN_WORD_RATIO: BoundedU8<0, 100> = _
Token to word ratio.
Defaults to 75%.
Required Associated Types
type Tokens: Copy + ToString + FromStr + Display + Debug + Serialize + DeserializeOwned + Default + TryFrom<usize> + Unsigned + FromPrimitive + ToPrimitive + Sum + CheckedAdd + CheckedSub + SaturatingAdd + SaturatingSub + SaturatingMul + CheckedDiv + CheckedMul + Ord + Sync + Send
Tokens are an LLM concept which represents pieces of words. For example, each ChatGPT token represents roughly 75% of a word.
This type is used primarily for tracking the number of tokens in a TapestryFragment and for counting the number of tokens in a string.
This type is configurable to allow for different types of tokens to be used. For example, u16 can be used to represent the number of tokens in a string.
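For example, a sketch of the kind of token bookkeeping this type enables, assuming a hypothetical implementor that picks u16 as its Tokens type (the values are illustrative):

// Hypothetical: an implementor chose `type Tokens = u16;`.
let max_context_length: u16 = 4_096;
let used_tokens: u16 = 3_900;
// Saturating arithmetic (required by the trait bounds) avoids underflow
// once the token budget is exhausted.
let remaining = max_context_length.saturating_sub(used_tokens);
assert_eq!(remaining, 196);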
Required Methods
fn max_context_length(&self) -> Self::Tokens
The maximum number of tokens that can be processed at once by an LLM model.
fn name(&self) -> &'static str
Get the model name.
This is used for logging purposes, but can also be used to fetch a specific model based on &self. For example, the model passed to [Loom::weave] can be represented as an enum with a multitude of variants, each representing a different model.
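For illustration, a hypothetical model enum of that shape (the variants and the returned model names are assumptions, not part of this crate):

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum OpenAiModel {
    #[default]
    Gpt4,
    Gpt35Turbo,
}

impl OpenAiModel {
    // Mirrors the role of `Llm::name` for this hypothetical enum.
    fn name(&self) -> &'static str {
        match self {
            OpenAiModel::Gpt4 => "gpt-4",
            OpenAiModel::Gpt35Turbo => "gpt-3.5-turbo",
        }
    }
}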
fn alias(&self) -> &'static str
Alias for the model.
Can be used for any unforeseen use cases where the model name is not sufficient.
fn count_tokens(content: &str) -> Result<Self::Tokens>
Calculates the number of tokens in a string.
This may vary depending on the type of tokens used by the LLM. In the case of ChatGPT, this can be calculated using the tiktoken-rs crate.
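A minimal sketch of such a count using tiktoken-rs, assuming the cl100k_base encoding and a u16 token type; the free-standing function and the anyhow error handling are illustrative, not the crate's actual Result alias:

use tiktoken_rs::cl100k_base;

// Hypothetical stand-alone version of `count_tokens` for illustration.
fn count_tokens(content: &str) -> anyhow::Result<u16> {
    let bpe = cl100k_base()?;
    // Number of BPE tokens the encoder produces for this string.
    let token_count = bpe.encode_with_special_tokens(content).len();
    Ok(u16::try_from(token_count)?)
}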
fn prompt<'life0, 'life1, 'async_trait>(
    &'life0 self,
    is_summarizing: bool,
    prompt_tokens: Self::Tokens,
    msgs: Vec<Self::Request>,
    params: &'life1 Self::Parameters,
    max_tokens: Self::Tokens,
) -> Pin<Box<dyn Future<Output = Result<Self::Response>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
Prompt LLM with the supplied messages and parameters.
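A hypothetical call site, sketched under the assumption that the requests and parameters are already prepared and that Result below is the crate's own Result alias used throughout this trait:

async fn ask<T: Config, L: Llm<T>>(
    llm: &L,
    msgs: Vec<L::Request>,
    params: &L::Parameters,
) -> Result<L::Response> {
    // Count the tokens in the outgoing requests (Request: Display).
    let joined: String = msgs.iter().map(ToString::to_string).collect();
    let prompt_tokens = L::count_tokens(&joined)?;

    // Prefer the model's completion limit if it defines one, otherwise
    // fall back to the prompt token limit.
    let max_tokens = llm
        .get_max_completion_token_limit()
        .unwrap_or_else(|| llm.get_max_prompt_token_limit());

    llm.prompt(false, prompt_tokens, msgs, params, max_tokens).await
}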
fn compute_cost(
    &self,
    prompt_tokens: Self::Tokens,
    response_tokens: Self::Tokens,
) -> f64
Compute cost of a message based on model.
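A minimal sketch of one way an implementor might price a call; the per-1,000-token prices are illustrative, not real pricing:

fn compute_cost(prompt_tokens: u32, response_tokens: u32) -> f64 {
    // Hypothetical prices in USD per 1,000 tokens.
    const PROMPT_PRICE_PER_1K: f64 = 0.0005;
    const COMPLETION_PRICE_PER_1K: f64 = 0.0015;

    (prompt_tokens as f64 / 1_000.0) * PROMPT_PRICE_PER_1K
        + (response_tokens as f64 / 1_000.0) * COMPLETION_PRICE_PER_1K
}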
Provided Methods
fn get_max_prompt_token_limit(&self) -> Self::Tokens
Calculate the upper bound of tokens allowed for the current Config::PromptModel before a summary is generated.
This is calculated by multiplying the maximum context length (tokens) for the current Config::PromptModel by the Config::TOKEN_THRESHOLD_PERCENTILE and dividing by 100.
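For example, assuming (purely for illustration) a maximum context length of 4,096 tokens and a Config::TOKEN_THRESHOLD_PERCENTILE of 85:

let max_context_length: u32 = 4_096;
let token_threshold_percentile: u32 = 85; // hypothetical value
// 4_096 * 85 / 100 = 3_481 tokens before a summary is triggered.
let max_prompt_tokens = max_context_length * token_threshold_percentile / 100;
assert_eq!(max_prompt_tokens, 3_481);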
fn get_max_completion_token_limit(&self) -> Option<Self::Tokens>
Get optional max completion token limit.
fn ctx_msgs_to_prompt_requests(
    &self,
    msgs: &[ContextMessage<T>],
) -> Vec<Self::Request>
ContextMessages to Llm::Request conversion.
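One plausible shape for this conversion, relying on the Self::Request: From<ContextMessage<T>> bound and assuming ContextMessage<T> is Clone; a sketch, not necessarily the provided default body:

fn ctx_msgs_to_prompt_requests(&self, msgs: &[ContextMessage<T>]) -> Vec<Self::Request> {
    // Each ContextMessage is cloned and converted via the From/Into bound.
    msgs.iter().cloned().map(Into::into).collect()
}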
fn convert_tokens_to_words(&self, tokens: Self::Tokens) -> Self::Tokens
Convert tokens to words.
In the case of ChatGPT, each token represents roughly 75% of a word.
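Illustrative arithmetic with the default TOKEN_WORD_RATIO of 75: 1,000 tokens correspond to roughly 750 words.

let tokens: u32 = 1_000;
let token_word_ratio: u32 = 75; // percent, the default TOKEN_WORD_RATIO
let words = tokens * token_word_ratio / 100;
assert_eq!(words, 750);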
Dyn Compatibility
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.