pub struct Llama { /* private fields */ }
The LLaMA model. Ref: Introducing LLaMA
Safety
This type implements Send and Sync because it is immutable after construction.
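For example, a loaded model can be shared across worker threads behind an Arc. The sketch below is illustrative only; the llm::models::Llama and llm::KnownModel import paths are assumptions about how this crate re-exports its types.

use std::sync::Arc;
use std::thread;

// Sketch only: Send + Sync let one loaded model back several threads.
fn share_across_threads(model: llm::models::Llama) {
    use llm::KnownModel;

    let model = Arc::new(model);
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let model = Arc::clone(&model);
            thread::spawn(move || {
                // Read-only access is safe from every thread.
                let _context_size = model.n_context_tokens();
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}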
Implementations
impl Llama
pub fn load(
    path: &Path,
    params: ModelParameters,
    load_progress_callback: impl FnMut(LoadProgress)
) -> Result<Llama, LoadError>
Load a LLaMA model from the given path and configure it per the params. The status of the loading process is reported through load_progress_callback. This is a helper function on top of llm_base::load.
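For example (a sketch, not a verbatim recipe: the file path is hypothetical, and the import paths and ModelParameters::default() are assumptions):

use std::path::Path;
use llm::models::Llama;
use llm::{LoadError, ModelParameters};

fn load_model() -> Result<Llama, LoadError> {
    // Hypothetical path to GGML-format LLaMA weights.
    let path = Path::new("models/llama-7b-q4_0.bin");
    // Assumes ModelParameters implements Default; override fields
    // (such as the context size) as needed for your hardware.
    let params = ModelParameters::default();
    // Discard progress updates here; a real application would surface
    // them to the user via the callback.
    Llama::load(path, params, |_progress| {})
}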
Trait Implementations
impl KnownModel for Llama
fn start_session(&self, config: InferenceSessionConfig) -> InferenceSession
Starts a new InferenceSession for this model.
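For example (a sketch; InferenceSessionConfig::default() and the import paths are assumptions):

use llm::models::Llama;
use llm::{InferenceSession, InferenceSessionConfig, KnownModel};

// Each session owns its own mutable inference state, so a single
// immutable model can serve many sessions concurrently.
fn new_session(model: &Llama) -> InferenceSession {
    model.start_session(InferenceSessionConfig::default())
}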
fn vocabulary(&self) -> &Vocabulary
Returns the vocabulary used by this model.
type Hyperparameters = Hyperparameters
Hyperparameters for the model
fn new<E>(
    hyperparameters: <Llama as KnownModel>::Hyperparameters,
    params: ModelParameters,
    vocabulary: Vocabulary,
    tensor_loader: impl TensorLoader<E>
) -> Result<Llama, E>
where
    E: Error,
Creates a new model from the provided hyperparameters and ModelParameters, reading tensors through the given tensor_loader. This function is called by the load function.
fn evaluate(
    &self,
    session: &mut InferenceSession,
    params: &InferenceParameters,
    input_tokens: &[i32],
    output_request: &mut OutputRequest
)
This function is called by the provided InferenceSession; it will use this model and the InferenceParameters to generate output by evaluating the input_tokens. The OutputRequest is used to specify additional data to fetch from the model.
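For example (a sketch; OutputRequest::default() and the import paths are assumptions):

use llm::models::Llama;
use llm::{InferenceParameters, InferenceSession, KnownModel, OutputRequest};

// Feed a batch of token IDs through the model, updating the session
// state in place.
fn step(
    model: &Llama,
    session: &mut InferenceSession,
    params: &InferenceParameters,
    tokens: &[i32],
) {
    // Assumes OutputRequest implements Default; here no extra data
    // (such as embeddings) is requested from the model.
    let mut output = OutputRequest::default();
    model.evaluate(session, params, tokens, &mut output);
}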
fn n_context_tokens(&self) -> usize
Get the context size (configured with ModelParameters::n_context_tokens) used by
this model.
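For example, a caller can validate a prompt against the context window (a sketch; import paths are assumptions):

use llm::models::Llama;
use llm::KnownModel;

// Reject prompts that would overflow the context window.
fn fits_in_context(model: &Llama, prompt_tokens: &[i32]) -> bool {
    prompt_tokens.len() <= model.n_context_tokens()
}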
fn bot_token_id(&self) -> Option<i32>
Get the beginning of text/beginning of string token ID, if available. This value is defined by model implementers.
fn eot_token_id(&self) -> i32
Get the end of text/end of string token ID. This value is defined by model implementers.
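Together with bot_token_id, this is typically used to frame and terminate generation, for example (a sketch; import paths are assumptions):

use llm::models::Llama;
use llm::KnownModel;

// Prepend the beginning-of-text token when the model defines one.
fn with_bot_prefix(model: &Llama, tokens: &[i32]) -> Vec<i32> {
    let mut framed = Vec::new();
    if let Some(bot) = model.bot_token_id() {
        framed.push(bot);
    }
    framed.extend_from_slice(tokens);
    framed
}

// Sampling the end-of-text token signals that generation is complete.
fn is_finished(model: &Llama, next_token: i32) -> bool {
    next_token == model.eot_token_id()
}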
fn inference_parameters(&self) -> &InferenceParameters
Get the default InferenceParameters for this model (used by
InferenceSession::infer). This value is configured through
ModelParameters::inference_parameters.
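For example (a sketch; import paths are assumptions):

use llm::models::Llama;
use llm::{InferenceParameters, KnownModel};

// Prefer caller-supplied parameters, falling back to the defaults
// configured at load time via ModelParameters.
fn effective_params<'a>(
    model: &'a Llama,
    overrides: Option<&'a InferenceParameters>,
) -> &'a InferenceParameters {
    overrides.unwrap_or_else(|| model.inference_parameters())
}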
Auto Trait Implementations
impl Send for Llama
impl Sync for Llama
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<H, M> Model for M
where
    H: Hyperparameters,
    M: KnownModel<Hyperparameters = H>,
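This blanket impl is what lets any KnownModel be driven through the object-safe Model trait, for example behind a trait object (a sketch; the llm::Model import path is an assumption):

use llm::Model;

// Code written against `dyn Model` accepts Llama, or any other
// KnownModel, thanks to the blanket impl above.
fn context_size(model: &dyn Model) -> usize {
    model.n_context_tokens()
}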
fn start_session(&self, config: InferenceSessionConfig) -> InferenceSession
Starts a new InferenceSession for this model.
fn evaluate(
    &self,
    session: &mut InferenceSession,
    params: &InferenceParameters,
    input_tokens: &[i32],
    output_request: &mut OutputRequest
)
This function is called by the provided InferenceSession; it will use this model and the InferenceParameters to generate output by evaluating the input_tokens. The OutputRequest is used to specify additional data to fetch from the model.
fn vocabulary(&self) -> &Vocabulary
Get the vocabulary (loaded from the GGML file) for this model.
fn n_context_tokens(&self) -> usize
Get the context size (configured with ModelParameters::n_context_tokens) used by
this model.
fn bot_token_id(&self) -> Option<i32>
Get the beginning of text/beginning of string token ID, if available. This value is defined by model implementers.
fn eot_token_id(&self) -> i32
Get the end of text/end of string token ID. This value is defined by model implementers.
fn inference_parameters(&self) -> &InferenceParameters
Get the default InferenceParameters for this model (used by
InferenceSession::infer). This value is configured through
ModelParameters::inference_parameters.