llm_models: Load and download LLM models, metadata, and tokenizers
The llm_models crate is a workspace member of the llm_client project.
Features
- GGUFs from local storage or Hugging Face
- Parses model metadata from the GGUF file
- Includes limited support for loading the tokenizer from the GGUF file
- Also supports loading the metadata and tokenizer from their own separate files
- API models from OpenAI, Anthropic, and Perplexity
- Tokenizer abstraction for Hugging Face's Tokenizer and Tiktoken
LocalLlmModel
Everything you need for GGUF models. The GgufLoader wraps the loaders for convenience. All loaders return a LocalLlmModel, which contains the tokenizer, metadata, chat template, and anything else that can be extracted from the GGUF.
GgufPresetLoader
- Presets for popular models like Llama 3, Phi, Mistral/Mixtral, and more
- Loads the best quantized model by calculating the largest quant that will fit in your VRAM
// `vram_gb` is a placeholder for your available VRAM in GB
let model: LocalLlmModel = GgufLoader::default()
    .llama3_1_8b_instruct()
    .preset_with_available_vram_gb(vram_gb) // Load the largest quant that will fit in your VRAM
    .load()?;
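To make the "largest quant that fits" idea concrete, here is a minimal sketch of that selection step. It is not the crate's implementation: the `QuantFile` struct, its fields, and `largest_fitting_quant` are hypothetical names invented for illustration, and a real loader would likely also account for context size and runtime overhead.

```rust
/// One quantized variant of a model (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct QuantFile {
    /// Quantization level in bits, e.g. 4 for a Q4 file.
    pub bits: u8,
    /// Estimated memory needed to load this file, in bytes.
    pub required_bytes: u64,
}

/// Returns the highest-bit quant whose memory estimate fits within
/// `available_vram_gb`, or `None` when no quant fits.
pub fn largest_fitting_quant(
    quants: &[QuantFile],
    available_vram_gb: u32,
) -> Option<&QuantFile> {
    // Convert the budget from GiB to bytes.
    let budget_bytes = u64::from(available_vram_gb) * 1024 * 1024 * 1024;
    quants
        .iter()
        .filter(|q| q.required_bytes <= budget_bytes)
        .max_by_key(|q| q.bits)
}
```

Given candidate files for one model, calling `largest_fitting_quant(&quants, 6)` picks the highest-bit file whose estimate is at most 6 GiB.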
GgufHfLoader
GGUF models from Hugging Face.
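// `quant_file_url` is a placeholder for the URL of a GGUF quant file on Hugging Face
let model: LocalLlmModel = GgufLoader::default()
    .hf_quant_file_url(quant_file_url)
    .load()?;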
GgufLocalLoader
GGUF models from local storage.
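// `quant_file_path` is a placeholder for the path to a GGUF quant file on disk
let model: LocalLlmModel = GgufLoader::default()
    .local_quant_file_path(quant_file_path)
    .load()?;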
ApiLlmModel
- Supports OpenAI, Anthropic, Perplexity, and adding your own API models
- Supports prompting, tokenization, and price estimation
assert_eq!
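Price estimation generally comes down to multiplying token counts by per-token rates. The sketch below shows that arithmetic only. `ModelPricing`, its fields, and `estimate_cost` are hypothetical names and do not reflect how ApiLlmModel stores or exposes pricing.

```rust
/// Hypothetical pricing record; the field names are illustrative only.
pub struct ModelPricing {
    /// Cost in dollars per one million input (prompt) tokens.
    pub cost_per_m_input_tokens: f64,
    /// Cost in dollars per one million output (completion) tokens.
    pub cost_per_m_output_tokens: f64,
}

/// Estimates the dollar cost of a request from its token counts.
pub fn estimate_cost(pricing: &ModelPricing, input_tokens: u64, output_tokens: u64) -> f64 {
    let input_cost = input_tokens as f64 / 1_000_000.0 * pricing.cost_per_m_input_tokens;
    let output_cost = output_tokens as f64 / 1_000_000.0 * pricing.cost_per_m_output_tokens;
    input_cost + output_cost
}
```

In practice the input token count would come from the model's tokenizer, which is why tokenization and price estimation sit together on the same model type.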
LlmTokenizer
- A simple abstract API for encoding and decoding, so the same code can consume LLMs across multiple architectures
- Uses Hugging Face's Tokenizer library for local models and Tiktoken-rs for OpenAI and Anthropic models (Anthropic doesn't have a publicly available tokenizer, so Tiktoken is used in its place)
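// `model_id`, `tokenizer_json_path`, `hf_token`, and `repo_id` are placeholders for your own values
// Get a Tiktoken tokenizer
let tok = LlmTokenizer::new_tiktoken(model_id);
// From local path
let tok = LlmTokenizer::new_from_tokenizer_json(tokenizer_json_path);
// From repo
let tok = LlmTokenizer::new_from_hf_repo(hf_token, repo_id);
// From LocalLlmModel or ApiLlmModel
let tok = model.model_base.tokenizer;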
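The value of a tokenizer abstraction is that calling code depends on a single encode/decode interface, whichever backend sits behind it. The sketch below illustrates that pattern with two toy backends. The `Tokenize` trait, `ByteTokenizer`, `CharTokenizer`, and `count_tokens` are hypothetical stand-ins and not the actual LlmTokenizer API; real backends would wrap Hugging Face's Tokenizer or Tiktoken instead.

```rust
/// Minimal encode/decode interface shared by every backend (hypothetical).
pub trait Tokenize {
    fn encode(&self, text: &str) -> Vec<u32>;
    fn decode(&self, ids: &[u32]) -> String;
}

/// Toy backend: one token per UTF-8 byte.
pub struct ByteTokenizer;

impl Tokenize for ByteTokenizer {
    fn encode(&self, text: &str) -> Vec<u32> {
        text.bytes().map(u32::from).collect()
    }

    fn decode(&self, ids: &[u32]) -> String {
        // Ids produced by `encode` are always in 0..=255, so the cast is lossless here.
        let bytes: Vec<u8> = ids.iter().map(|&id| id as u8).collect();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}

/// Toy backend: one token per Unicode scalar value.
pub struct CharTokenizer;

impl Tokenize for CharTokenizer {
    fn encode(&self, text: &str) -> Vec<u32> {
        text.chars().map(u32::from).collect()
    }

    fn decode(&self, ids: &[u32]) -> String {
        // Ids that are not valid scalar values are skipped.
        ids.iter().filter_map(|&id| char::from_u32(id)).collect()
    }
}

/// Works with any backend: the caller never needs to know which one it holds.
pub fn count_tokens(tokenizer: &dyn Tokenize, text: &str) -> usize {
    tokenizer.encode(text).len()
}
```

Because `count_tokens` takes `&dyn Tokenize`, swapping the backend changes the token count but not the calling code.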
Setter Traits
- All setter traits are public, so you can integrate them into your own projects if you wish
- Examples include OpenAiModelTrait, GgufLoaderTrait, AnthropicModelTrait, and HfTokenTrait for loading models
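For readers unfamiliar with the setter-trait pattern, here is a minimal sketch of how such a trait can add builder-style methods to your own types. `TokenSetter`, `token_mut`, `with_token`, and `MyLoader` are hypothetical names and do not mirror the signatures of the crate's traits.

```rust
/// Setter trait (hypothetical): implementors expose one mutable field and
/// receive the builder-style setter as a default method.
pub trait TokenSetter: Sized {
    /// The only method an implementor has to write.
    fn token_mut(&mut self) -> &mut Option<String>;

    /// Builder-style setter provided by the trait.
    fn with_token(mut self, token: &str) -> Self {
        *self.token_mut() = Some(token.to_owned());
        self
    }
}

/// A user-defined type that opts in to the setter.
#[derive(Default)]
pub struct MyLoader {
    pub token: Option<String>,
}

impl TokenSetter for MyLoader {
    fn token_mut(&mut self) -> &mut Option<String> {
        &mut self.token
    }
}
```

With this in place, `MyLoader::default().with_token("abc")` returns a loader whose `token` field is `Some("abc")`, and any other type can gain the same setter by implementing the single accessor.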