pub struct LlamaModel { /* private fields */ }
A safe wrapper around llama_model.
Implementations

impl LlamaModel
pub fn get_vocab(&self) -> LlamaVocab
Retrieves the vocabulary associated with the current Llama model.
This method fetches the vocabulary from the underlying model using an unsafe
FFI call. The returned LlamaVocab struct contains a non-null pointer to
the vocabulary data, which is wrapped in a NonNull for safety.
§Safety
This method uses an unsafe block to call a C function (llama_model_get_vocab),
which is assumed to return a valid pointer to the vocabulary. The caller should
ensure that the model object is properly initialized and valid before calling
this method, as dereferencing invalid pointers can lead to undefined behavior.
§Returns
A LlamaVocab struct containing the vocabulary of the model.
§Panics
Panics if the underlying C function returns a null pointer.
§Example
let vocab = model.get_vocab();

pub fn n_ctx_train(&self) -> u32
Get the size of the context (in tokens) that the model was trained on.
This function returns the training-time context length, represented as a u32.
§Panics
This function will panic if the training context size does not fit into a u32.
This should be impossible on most platforms, since llama.cpp returns a c_int (i32 on most platforms),
which is expected to be non-negative.
pub fn tokens(
    &self,
    special: Special,
) -> impl Iterator<Item = (LlamaToken, Result<String, TokenToStringError>)> + '_
Get all tokens in the model.
This function returns an iterator over all the tokens in the model. Each item in the iterator is a tuple
containing a LlamaToken and its corresponding string representation (or an error if the conversion fails).
§Parameters
special: The Special value that determines how special tokens (like BOS, EOS, etc.) are handled.
pub fn token_bos(&self) -> LlamaToken
Get the beginning of stream token.
This function returns the token that represents the beginning of a stream (BOS token).
pub fn token_eos(&self) -> LlamaToken
Get the end of stream token.
This function returns the token that represents the end of a stream (EOS token).
pub fn token_nl(&self) -> LlamaToken
Get the newline token.
This function returns the token that represents a newline character.
pub fn is_eog_token(&self, token: LlamaToken) -> bool
Check if a token represents the end of generation (end of turn, end of sequence, etc.).
This function returns true if the provided token signifies the end of generation or end of sequence,
such as EOS or other special tokens.
§Parameters
token: The LlamaToken to check.
§Returns
true if the token is an end-of-generation token, otherwise false.
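Conceptually, this is a membership test against the model's set of end-of-generation token ids. A standalone sketch of that idea with hypothetical ids (2 for EOS, 32000 for EOT; real ids depend on the model), not the library's actual implementation:

```rust
use std::collections::HashSet;

fn main() {
    // Hypothetical end-of-generation token ids; real ids come from the model's vocabulary.
    let eog: HashSet<i32> = HashSet::from([2, 32000]); // e.g. EOS, EOT
    let is_eog_token = |token: i32| eog.contains(&token);

    assert!(is_eog_token(2)); // an EOS-like token ends generation
    assert!(!is_eog_token(42)); // an ordinary token does not
}
```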
pub fn token_cls(&self) -> LlamaToken
Get the classification token.
pub fn token_eot(&self) -> LlamaToken
Get the end-of-turn token.
pub fn token_pad(&self) -> LlamaToken
Get the padding token.
pub fn token_sep(&self) -> LlamaToken
Get the separator token.
pub fn token_fim_pre(&self) -> LlamaToken
Get the fill-in-the-middle prefix token.
pub fn token_fim_suf(&self) -> LlamaToken
Get the fill-in-the-middle suffix token.
pub fn token_fim_mid(&self) -> LlamaToken
Get the fill-in-the-middle middle token.
pub fn token_fim_pad(&self) -> LlamaToken
Get the fill-in-the-middle padding token.
pub fn token_fim_rep(&self) -> LlamaToken
Get the fill-in-the-middle repository token.
pub fn token_fim_sep(&self) -> LlamaToken
Get the fill-in-the-middle separator token.
pub fn token_is_control(&self, token: LlamaToken) -> bool
Check if a token is a control token.
pub fn token_get_score(&self, token: LlamaToken) -> f32
Get the score of a token.
pub fn token_get_text(
    &self,
    token: LlamaToken,
) -> Result<&str, StringFromModelError>
Get the text of a token.
pub fn add_bos_token(&self) -> bool
Check if a BOS token should be added when tokenizing.
pub fn add_eos_token(&self) -> bool
Check if an EOS token should be added when tokenizing.
pub fn decode_start_token(&self) -> LlamaToken
Get the decoder start token.
This function returns the token used to signal the start of decoding (i.e., the token used at the start of a sequence generation).
pub fn token_to_str(
    &self,
    token: LlamaToken,
    special: Special,
) -> Result<String, TokenToStringError>
Convert a single token to a string.
This function converts a LlamaToken into its string representation.
§Errors
This function returns an error if the token cannot be converted to a string. For more details, refer to
TokenToStringError.
§Parameters
token: The LlamaToken to convert.
special: The Special value used to handle special tokens.
pub fn token_to_bytes(
    &self,
    token: LlamaToken,
    special: Special,
) -> Result<Vec<u8>, TokenToStringError>
Convert a single token to bytes.
This function converts a LlamaToken into a byte representation.
§Errors
This function returns an error if the token cannot be converted to bytes. For more details, refer to
TokenToStringError.
§Parameters
token: The LlamaToken to convert.
special: The Special value used to handle special tokens.
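One reason byte output exists alongside string output: a single token's bytes need not form valid UTF-8 on their own, for example when they hold only part of a multi-byte character. A standalone sketch using only the standard library:

```rust
fn main() {
    let euro = "€".as_bytes(); // [0xE2, 0x82, 0xAC], a 3-byte UTF-8 sequence
    let partial = &euro[..2]; // bytes a single token might carry

    // Converting the partial bytes alone fails ...
    assert!(String::from_utf8(partial.to_vec()).is_err());

    // ... but accumulating bytes across tokens and converting once succeeds.
    let mut buf = Vec::new();
    buf.extend_from_slice(partial);
    buf.extend_from_slice(&euro[2..]);
    assert_eq!(String::from_utf8(buf).unwrap(), "€");
}
```

This is why streaming decoders typically accumulate token bytes and convert to a string only at valid UTF-8 boundaries.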
pub fn tokens_to_str(
    &self,
    tokens: &[LlamaToken],
    special: Special,
) -> Result<String, TokenToStringError>
Convert a vector of tokens to a single string.
This function takes a slice of LlamaTokens and converts them into a single string, concatenating their
string representations.
§Errors
This function returns an error if any token cannot be converted to a string. For more details, refer to
TokenToStringError.
§Parameters
tokens: A slice of LlamaTokens to convert.
special: The Special value used to handle special tokens.
pub fn str_to_token(
    &self,
    str: &str,
    add_bos: AddBos,
) -> Result<Vec<LlamaToken>, StringToTokenError>
Convert a string to a vector of tokens.
This function converts a string into a vector of LlamaTokens. The function will tokenize the string
and return the corresponding tokens.
§Errors
- This function will return an error if the input string contains a null byte.
§Panics
- This function will panic if the number of tokens exceeds usize::MAX.
§Example
use llama_cpp_4::model::LlamaModel;
use std::path::Path;
use llama_cpp_4::model::AddBos;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &Default::default())?;
let tokens = model.str_to_token("Hello, World!", AddBos::Always)?;

pub fn token_attr(&self, LlamaToken: LlamaToken) -> LlamaTokenAttrs
Get the attributes of a token.
This function retrieves the attributes associated with a given token. The attributes are typically used to understand whether the token represents a special type of token (e.g., beginning-of-sequence (BOS), end-of-sequence (EOS), control tokens, etc.).
§Panics
- This function will panic if the token type is unknown or cannot be converted to a valid LlamaTokenAttrs.
§Example
use llama_cpp_4::model::{LlamaModel, LlamaToken, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let token = LlamaToken(42);
let token_attrs = model.token_attr(token);

pub fn detokenize(
    &self,
    tokens: &[LlamaToken],
    remove_special: bool,
    unparse_special: bool,
) -> Result<String, StringFromModelError>
Detokenize a slice of tokens into a string.
This is the inverse of str_to_token.
§Parameters
tokens: The tokens to detokenize.
remove_special: If true, special tokens are removed from the output.
unparse_special: If true, special tokens are rendered as their text representation.
§Errors
Returns an error if the detokenized text is not valid UTF-8.
pub fn token_to_str_with_size(
    &self,
    token: LlamaToken,
    buffer_size: usize,
    special: Special,
) -> Result<String, TokenToStringError>
Convert a token to a string with a specified buffer size.
This function allows you to convert a token into a string, with the ability to specify a buffer size for the operation.
It is generally recommended to use LlamaModel::token_to_str instead, as 8 bytes is typically sufficient for most tokens,
and the extra buffer size doesn’t usually matter.
§Errors
- If the token type is unknown, an error will be returned.
- If the resultant token string exceeds the provided buffer_size, an error will occur.
- If the token string returned by llama-cpp is not valid UTF-8, an error will be returned.
§Panics
- This function will panic if the buffer_size does not fit into a c_int.
- It will also panic if the size returned from llama-cpp does not fit into a usize, which should typically never happen.
§Example
use llama_cpp_4::model::{LlamaModel, LlamaToken, Special, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let token = LlamaToken(42);
let token_string = model.token_to_str_with_size(token, 32, Special::Plaintext)?;

pub fn token_to_bytes_with_size(
    &self,
    token: LlamaToken,
    buffer_size: usize,
    special: Special,
    lstrip: Option<NonZeroU16>,
) -> Result<Vec<u8>, TokenToStringError>
Convert a token to bytes with a specified buffer size.
Generally you should use LlamaModel::token_to_bytes instead as 8 bytes is enough for most words and
the extra bytes do not really matter.
§Errors
- if the token type is unknown
- if the resultant token is larger than buffer_size
§Panics
- This function will panic if buffer_size cannot fit into a c_int.
- It will also panic if the size returned from llama-cpp cannot be converted to a usize (which should not happen).
§Example
use llama_cpp_4::model::{LlamaModel, LlamaToken, Special, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let token = LlamaToken(42);
let token_bytes = model.token_to_bytes_with_size(token, 32, Special::Plaintext, None)?;

pub fn n_vocab(&self) -> i32
Get the number of tokens in the model's vocabulary.
This function returns the vocabulary size as an i32, matching the c_int returned by the underlying llama-cpp library.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let n_vocab = model.n_vocab();

pub fn vocab_type(&self) -> VocabType
The type of vocab the model was trained on.
This function returns the type of vocabulary used by the model, such as whether it is based on byte-pair encoding (BPE), word-level tokens, or another tokenization scheme.
§Panics
- This function will panic if llama-cpp emits a vocab type that is not recognized or is invalid for this library.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let vocab_type = model.vocab_type();

pub fn n_embd(&self) -> c_int
Returns the number of embedding dimensions for the model.
This function retrieves the number of embeddings (or embedding dimensions) used by the model. It is typically used for analyzing model architecture and setting up context parameters or other model configuration aspects.
§Panics
- This function may panic if the underlying llama-cpp library returns an invalid embedding dimension value.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let n_embd = model.n_embd();

pub fn n_embd_inp(&self) -> c_int
Get the input embedding size of the model.
pub fn n_embd_out(&self) -> c_int
Get the output embedding size of the model.
pub fn n_swa(&self) -> c_int
Get the sliding window attention size of the model. Returns 0 if the model does not use sliding window attention.
pub fn rope_freq_scale_train(&self) -> f32
Get the RoPE frequency scale used during training.
pub fn model_size(&self) -> u64
Get the model size in bytes.
pub fn cls_label(&self, index: u32) -> Result<&str, StringFromModelError>
Get the classification label for the given index.
§Errors
Returns an error if the label is null or not valid UTF-8.
pub fn meta_count(&self) -> c_int
Get the number of metadata key-value pairs.
pub fn desc(&self, buf_size: usize) -> Result<String, StringFromModelError>
Get a model description string.
The buf_size parameter specifies the maximum buffer size for the description.
A default of 256 bytes is usually sufficient.
§Errors
Returns an error if the description could not be retrieved or is not valid UTF-8.
pub fn meta_key_by_index(
    &self,
    index: i32,
    buf_size: usize,
) -> Result<String, StringFromModelError>
Get a metadata key by index.
The buf_size parameter specifies the maximum buffer size for the key.
A default of 256 bytes is usually sufficient.
§Errors
Returns an error if the index is out of range or the key is not valid UTF-8.
pub fn meta_val_str_by_index(
    &self,
    index: i32,
    buf_size: usize,
) -> Result<String, StringFromModelError>
Get a metadata value string by index.
The buf_size parameter specifies the maximum buffer size for the value.
Values can be large (e.g. chat templates, token lists), so 4096+ may be needed.
§Errors
Returns an error if the index is out of range or the value is not valid UTF-8.
pub fn meta_val_str(
    &self,
    key: &str,
    buf_size: usize,
) -> Result<String, StringFromModelError>
Get a metadata value by key name.
This is more convenient than iterating metadata by index when you know the key.
The buf_size parameter specifies the maximum buffer size for the value.
§Errors
Returns an error if the key is not found, contains a null byte, or the value is not valid UTF-8.
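These buf_size parameters follow the common C pattern of writing into a caller-supplied buffer: if the value may be larger than the buffer, grow the buffer and retry. A standalone sketch of that idiom with a mock C-style getter (the data and the return convention are illustrative, not the crate's actual code):

```rust
// Mock of a C-style getter: writes up to buf.len() bytes and reports
// the full length it wanted to write (hypothetical, for illustration).
fn c_style_get(buf: &mut [u8]) -> usize {
    let data = b"hypothetical metadata value";
    let n = data.len().min(buf.len());
    buf[..n].copy_from_slice(&data[..n]);
    data.len()
}

fn main() {
    let mut small = [0u8; 8];
    let needed = c_style_get(&mut small);
    assert!(needed > small.len()); // buffer too small: grow and retry

    let mut big = vec![0u8; needed];
    c_style_get(&mut big);
    assert_eq!(big, b"hypothetical metadata value");
}
```

This is why small defaults (256 bytes) suffice for keys while large values such as chat templates call for a bigger buf_size.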
pub fn metadata(&self) -> Result<Vec<(String, String)>, StringFromModelError>
Get all metadata as a list of (key, value) pairs.
This is a convenience method that iterates over all metadata entries.
Keys use a buffer of 256 bytes and values use 4096 bytes.
For values that may be larger (e.g. token lists), use
meta_val_str_by_index directly with a larger buffer.
§Errors
Returns an error if any key or value cannot be read or is not valid UTF-8.
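metadata() amounts to collecting fallible per-index reads into a single Result. A standalone sketch of that collect pattern with mock entries (the GGUF-style key names are illustrative, not read from any model):

```rust
fn main() {
    // Mock metadata entries; real keys and values come from the model file.
    let raw = ["general.architecture=llama", "general.name=demo"];

    // Each read can fail; collecting into Result stops at the first error.
    let pairs: Result<Vec<(String, String)>, String> = raw
        .iter()
        .map(|e| {
            e.split_once('=')
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .ok_or_else(|| format!("malformed entry: {e}"))
        })
        .collect();

    let pairs = pairs.unwrap();
    assert_eq!(pairs.len(), 2);
    assert_eq!(pairs[0].1, "llama");
}
```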
pub fn has_encoder(&self) -> bool
Check if the model has an encoder.
pub fn has_decoder(&self) -> bool
Check if the model has a decoder.
pub fn is_recurrent(&self) -> bool
Check if the model is recurrent (e.g. Mamba, RWKV).
pub fn is_diffusion(&self) -> bool
Check if the model is a diffusion model.
pub fn get_chat_template(
    &self,
    buf_size: usize,
) -> Result<String, ChatTemplateError>
Get chat template from model.
§Errors
- If the model does not have a chat template, it will return an error.
- If the chat template is not a valid CString, it will return an error.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let chat_template = model.get_chat_template(1024)?;

pub fn load_from_file(
    _: &LlamaBackend,
    path: impl AsRef<Path>,
    params: &LlamaModelParams,
) -> Result<Self, LlamaModelLoadError>
Loads a model from a file.
This function loads a model from a specified file path and returns the corresponding LlamaModel instance.
§Errors
- If the path cannot be converted to a string or if the model file does not exist, it will return an error.
- If the model cannot be loaded (e.g., due to an invalid or corrupted model file), it will return a
LlamaModelLoadError.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;

pub fn load_from_splits(
    _: &LlamaBackend,
    paths: &[impl AsRef<Path>],
    params: &LlamaModelParams,
) -> Result<Self, LlamaModelLoadError>
Load a model from multiple split files.
This function loads a model that has been split across multiple files. This is useful for very large models that exceed filesystem limitations or need to be distributed across multiple storage devices.
§Arguments
paths - A slice of paths to the split model files
params - The model parameters
§Errors
Returns an error if:
- Any of the paths cannot be converted to a C string
- The model fails to load from the splits
- Any path doesn’t exist or isn’t accessible
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use llama_cpp_4::llama_backend::LlamaBackend;
let backend = LlamaBackend::init()?;
let params = LlamaModelParams::default();
let paths = vec![
"model-00001-of-00003.gguf",
"model-00002-of-00003.gguf",
"model-00003-of-00003.gguf",
];
let model = LlamaModel::load_from_splits(&backend, &paths, &params)?;

pub unsafe fn load_from_file_ptr(
    file: *mut FILE,
    params: &LlamaModelParams,
) -> Result<Self, LlamaModelLoadError>
pub unsafe fn init_from_user(
    metadata: *mut gguf_context,
    set_tensor_data: llama_model_set_tensor_data_t,
    set_tensor_data_ud: *mut c_void,
    params: &LlamaModelParams,
) -> Result<Self, LlamaModelLoadError>
pub fn save_to_file(&self, path: impl AsRef<Path>)
pub fn chat_builtin_templates() -> Vec<String>
Get the list of built-in chat templates.
Returns the names of all chat templates that are built into llama.cpp.
§Panics
Panics if any template name is not valid UTF-8.
pub fn lora_adapter_init(
    &self,
    path: impl AsRef<Path>,
) -> Result<LlamaLoraAdapter, LlamaLoraAdapterInitError>
Initializes a LoRA adapter from a file.
This function initializes a LoRA adapter, a model extension used to adapt or fine-tune the existing model for a specific domain or task. The adapter file is typically a serialized file that can be applied to the model for improved performance on specialized tasks.
§Errors
- If the adapter file path cannot be converted to a string or if the adapter cannot be initialized, it will return an error.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let adapter = model.lora_adapter_init("path/to/lora/adapter")?;

pub fn new_context(
    &self,
    _: &LlamaBackend,
    params: LlamaContextParams,
) -> Result<LlamaContext<'_>, LlamaContextLoadError>
Create a new context from this model.
This function creates a new context for the model, which is used to manage and perform computations for inference, including token generation, embeddings, and other tasks that the model can perform. The context allows fine-grained control over model parameters for a specific task.
§Errors
- There are various potential failures, such as invalid parameters or a failure to allocate the context. See LlamaContextLoadError for more detailed error descriptions.
§Example
use llama_cpp_4::model::{LlamaModel, params::LlamaModelParams};
use llama_cpp_4::LlamaContextParams;
use llama_cpp_4::llama_backend::LlamaBackend;
use std::path::Path;
let backend = LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let context = model.new_context(&backend, LlamaContextParams::default())?;

pub fn apply_chat_template(
    &self,
    tmpl: Option<&str>,
    chat: &[LlamaChatMessage],
    add_ass: bool,
) -> Result<String, ApplyChatTemplateError>
Apply the model’s chat template to a sequence of messages.
This function applies the model’s chat template to the provided chat messages, formatting them accordingly. The chat
template determines the structure or style of conversation between the system and user, such as token formatting,
role separation, and more. The template can be customized by providing an optional template string, or if None
is provided, the default template used by llama.cpp will be applied.
For more information on supported templates, visit: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template
§Arguments
tmpl: An optional custom template string. If None, the model's default template will be used.
chat: A slice of LlamaChatMessage instances representing the conversation between the system, user, and assistant.
add_ass: A boolean flag indicating whether to append the prompt that begins the assistant's response.
§Errors
There are several possible points of failure when applying the chat template:
- Insufficient buffer size to hold the formatted chat (this will return ApplyChatTemplateError::BuffSizeError).
- If the template or messages cannot be processed properly, various errors from ApplyChatTemplateError may occur.
§Example
use llama_cpp_4::model::{LlamaModel, LlamaChatMessage, params::LlamaModelParams};
use std::path::Path;
let backend = llama_cpp_4::llama_backend::LlamaBackend::init()?;
let model = LlamaModel::load_from_file(&backend, Path::new("path/to/model"), &LlamaModelParams::default())?;
let chat = vec![
    LlamaChatMessage::new("user", "Hello!"),
    LlamaChatMessage::new("assistant", "Hi! How can I assist you today?"),
];
let formatted_chat = model.apply_chat_template(None, &chat, true)?;
§Notes
The provided buffer is twice the length of the messages by default, which is recommended by the llama.cpp documentation.
§Panics
Panics if the buffer length exceeds i32::MAX.
pub fn split_path(path_prefix: &str, split_no: i32, split_count: i32) -> String
Build a split GGUF file path for a specific chunk.
This utility function creates the standardized filename for a split model chunk
following the pattern: {prefix}-{split_no:05d}-of-{split_count:05d}.gguf
§Arguments
path_prefix - The base path and filename prefix
split_no - The split number (1-indexed)
split_count - The total number of splits
§Returns
Returns the formatted split path as a String
§Example
use llama_cpp_4::model::LlamaModel;
let path = LlamaModel::split_path("/models/llama", 2, 4);
assert_eq!(path, "/models/llama-00002-of-00004.gguf");
§Panics
Panics if the path prefix contains a null byte.
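The naming convention above can be reproduced with standard formatting. A standalone sketch of the pattern, not the crate's implementation:

```rust
// Zero-padded five-digit chunk numbers, per the
// {prefix}-{split_no:05}-of-{split_count:05}.gguf pattern.
fn split_path(prefix: &str, split_no: i32, split_count: i32) -> String {
    format!("{prefix}-{split_no:05}-of-{split_count:05}.gguf")
}

fn main() {
    assert_eq!(
        split_path("/models/llama", 2, 4),
        "/models/llama-00002-of-00004.gguf"
    );
}
```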
pub fn split_prefix(
    split_path: &str,
    split_no: i32,
    split_count: i32,
) -> Option<String>
Extract the path prefix from a split filename.
This function extracts the base path prefix from a split model filename,
but only if the split_no and split_count match the pattern in the filename.
§Arguments
split_path - The full path to the split file
split_no - The expected split number
split_count - The expected total number of splits
§Returns
Returns the path prefix if the pattern matches, or None if it doesn’t
§Example
use llama_cpp_4::model::LlamaModel;
let prefix = LlamaModel::split_prefix("/models/llama-00002-of-00004.gguf", 2, 4);
assert_eq!(prefix, Some("/models/llama".to_string()));
§Panics
Panics if the split path contains a null byte.