llm_models: Load and Download LLM Models, Metadata, and Tokenizers
This crate is part of the llm_client crate.
- Loads GGUFs from local storage or Hugging Face
- Parses model metadata from the GGUF file
- Includes limited support for loading the tokenizer from the GGUF file
- Also supports loading the metadata and tokenizer from their respective files
LocalLlmModel
Everything you need for GGUF models. The `GgufLoader`
wraps the individual loaders for convenience. All loaders return a `LocalLlmModel`,
which contains the tokenizer, metadata, chat template, and anything else that can be extracted from the GGUF.
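For example (a sketch: the `model_base.tokenizer` path is shown in the tokenizer section below, while the chat-template field name is an assumption):

```rust
let model: LocalLlmModel = GgufLoader::default()
    .local_quant_file_path("path/to/model.gguf") // placeholder path
    .load()?;

let tokenizer = &model.model_base.tokenizer; // tokenizer extracted from the GGUF
let chat_template = &model.chat_template;    // assumed field name
```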
GgufPresetLoader
- Presets for popular models like Llama 3, Phi, Mistral/Mixtral, and more
- Loads the best quantized model by calculating the largest quant that will fit in your VRAM
```rust
let model: LocalLlmModel = GgufLoader::default()
    .llama3_1_8b_instruct()
    .preset_with_available_vram_gb(48) // Load the largest quant that will fit in your VRAM (example value)
    .load()?;
```
GgufHfLoader
GGUF models from Hugging Face.
```rust
let model: LocalLlmModel = GgufLoader::default()
    .hf_quant_file_url("https://huggingface.co/<repo>/blob/main/<file>.gguf") // placeholder URL
    .load()?;
```
GgufLocalLoader
GGUF models from local storage.
```rust
let model: LocalLlmModel = GgufLoader::default()
    .local_quant_file_path("path/to/model.gguf") // placeholder path
    .load()?;
```
ApiLlmModel
- Supports OpenAI, Anthropic, Perplexity, and adding your own API models
- Supports prompting, tokenization, and price estimation
```rust
// Reconstructed sketch: the preset constructor and `model_id` field
// name are assumptions.
let model = ApiLlmModel::gpt_4_o();
assert_eq!(model.model_base.model_id, "gpt-4o");
```
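Because pricing metadata travels with the model, estimating a request's cost is simple arithmetic. A minimal sketch, assuming a `cost_per_m_in_tokens` field holding dollars per million input tokens:

```rust
// `cost_per_m_in_tokens` is an assumed field name (dollars per 1M input tokens).
let input_tokens = 1_500u64;
let est_cost_usd = input_tokens as f64 * model.cost_per_m_in_tokens / 1_000_000.0;
println!("estimated input cost: ${est_cost_usd:.6}");
```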
LlmTokenizer
- Simple API for encoding and decoding that allows for uniform LLM consumption across multiple architectures.
- Uses Hugging Face's Tokenizer library for local models and Tiktoken-rs for OpenAI and Anthropic (Anthropic doesn't have a publicly available tokenizer).
```rust
let tok = LlmTokenizer::new_tiktoken("gpt-4o"); // Get a Tiktoken tokenizer (model name illustrative)
let tok = LlmTokenizer::new_from_tokenizer_json("path/to/tokenizer.json"); // From local path
let tok = LlmTokenizer::new_from_hf_repo("org/repo"); // From repo (arguments illustrative; an HF token may also be required)

// From LocalLlmModel or ApiLlmModel
let tok = model.model_base.tokenizer;
```
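A hypothetical round trip with any of the tokenizers above; `encode` and `decode` are stand-in names for the crate's encode/decode methods, not verified signatures:

```rust
// Stand-in method names for the abstract encode/decode API.
let ids = tok.encode("The quick brown fox");
let text = tok.decode(&ids);
assert_eq!(text, "The quick brown fox");
```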
Setter Traits
- All setter traits are public, so you can integrate them into your own projects if you wish.
- For example, the setter traits for loading models: `OpenAiModelTrait`, `GgufLoaderTrait`, `AnthropicModelTrait`, and `HfTokenTrait` (see the sketch below).
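A minimal sketch of that integration, assuming the import paths and the return type of `load` (both unverified): bringing a setter trait into scope makes its builder methods available in your own code.

```rust
use llm_models::{GgufLoader, GgufLoaderTrait, LocalLlmModel}; // import paths assumed

fn load_gguf(path: &str) -> anyhow::Result<LocalLlmModel> {
    // Same builder methods as the loader examples above; the trait
    // import is what brings them into scope here. Return type assumed.
    GgufLoader::default().local_quant_file_path(path).load()
}
```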