axonml-llm - Large Language Model Architectures
This crate provides implementations of popular transformer-based language models including BERT, GPT-2, LLaMA, Mistral, and Phi, along with building blocks for custom LLM architectures.
§Key Features
- BERT (Bidirectional Encoder Representations from Transformers)
- GPT-2 (Generative Pre-trained Transformer 2)
- LLaMA (Large Language Model Meta AI) with RoPE and SwiGLU
- Mistral with sliding window attention
- Phi with partial rotary embeddings
- KV-cache for efficient autoregressive generation
- Transformer building blocks (attention, feed-forward, positional encoding)
- Text generation utilities (see the sketch after this list)
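A minimal sketch of the generation utilities listed above, assuming TextGenerator wraps a model together with a GenerationConfig and drives KV-cache-backed decoding. The constructor and method names used here (GenerationConfig::default, TextGenerator::new, generate) are assumptions for illustration; consult the generation module for the actual API.

use axonml_llm::{GPT2, GPT2Config, GenerationConfig, TextGenerator};

// Build a small GPT-2 (mirrors the example below).
let model = GPT2::new(&GPT2Config::small());

// Hypothetical: default sampling settings and a generator that owns the model.
let gen_config = GenerationConfig::default();
let generator = TextGenerator::new(model, gen_config);

// Hypothetical generate signature: prompt token IDs in, continuation IDs out.
let prompt_ids: Vec<u32> = vec![50256];
let output_ids = generator.generate(&prompt_ids);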
§Example
use axonml_llm::{GPT2, GPT2Config};
use axonml_tensor::Tensor;
// Create a GPT-2 model
let config = GPT2Config::small();
let model = GPT2::new(&config);
// Run a forward pass on the prompt (50256 is GPT-2's end-of-text token)
let input_ids = Tensor::from_vec(vec![50256u32], &[1, 1]).unwrap();
let output = model.forward(&input_ids);

@version 0.2.0
@author AutomataNexus Development Team
Re-exports§
pub use error::{LLMError, LLMResult};
pub use config::{BertConfig, GPT2Config, TransformerConfig};
pub use attention::{CausalSelfAttention, FlashAttention, FlashAttentionConfig, KVCache, LayerKVCache, MultiHeadSelfAttention, scaled_dot_product_attention};
pub use embedding::{TokenEmbedding, PositionalEmbedding, BertEmbedding, GPT2Embedding};
pub use hub::{PretrainedLLM, llm_registry, download_weights as download_llm_weights};
pub use hf_loader::{HFLoader, load_llama_from_hf, load_mistral_from_hf};
pub use tokenizer::{HFTokenizer, SpecialTokens};
pub use state_dict::{LoadStateDict, LoadResult};
pub use transformer::{TransformerBlock, TransformerEncoder, TransformerDecoder};
pub use bert::{Bert, BertForSequenceClassification, BertForMaskedLM};
pub use gpt2::{GPT2, GPT2LMHead};
pub use llama::{LLaMA, LLaMAConfig, LLaMAForCausalLM};
pub use mistral::{Mistral, MistralConfig, MistralForCausalLM};
pub use phi::{Phi, PhiConfig, PhiForCausalLM};
pub use generation::{GenerationConfig, TextGenerator};
Modules§
- attention - Attention Mechanisms Module
- bert - BERT Model Implementation
- config - Model Configuration Module
- embedding - Embedding Module
- error - Error types for the LLM module.
- generation - Text Generation Utilities
- gpt2 - GPT-2 Model Implementation
- hf_loader - HuggingFace Model Loader
- hub - LLM Model Hub - Pretrained Language Model Weights
- llama - LLaMA - Large Language Model Meta AI
- mistral - Mistral - Efficient LLM Architecture
- phi - Phi - Microsoft’s Small Language Models
- state_dict - State Dictionary Loading
- tokenizer - HuggingFace Tokenizer Support
- transformer - Transformer Building Blocks
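As a closing sketch, the hf_loader and tokenizer modules can be combined to run a pretrained checkpoint. The directory paths, the load_llama_from_hf return value, and the HFTokenizer and TextGenerator method names shown here are assumptions for illustration, not a confirmed API.

use axonml_llm::{load_llama_from_hf, HFTokenizer, GenerationConfig, TextGenerator};

// Hypothetical: load weights and tokenizer from a local HuggingFace-format directory.
let model = load_llama_from_hf("models/llama").unwrap();
let tokenizer = HFTokenizer::from_file("models/llama/tokenizer.json").unwrap();

// Hypothetical: encode a prompt, generate a continuation, and decode back to text.
let prompt_ids = tokenizer.encode("The capital of France is");
let generator = TextGenerator::new(model, GenerationConfig::default());
let output_ids = generator.generate(&prompt_ids);
let text = tokenizer.decode(&output_ids);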