Crate axonml_llm

axonml-llm - Large Language Model Architectures

This crate provides implementations of popular transformer-based language models including BERT, GPT-2, LLaMA, Mistral, and Phi, along with building blocks for custom LLM architectures.
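
For custom architectures, the building blocks compose directly. The following is a minimal sketch only: `TransformerBlock::new(&TransformerConfig)` and a `Default` impl on the config are assumptions, so check the `transformer` and `config` modules for the real signatures.

use axonml_llm::{TransformerBlock, TransformerConfig};
use axonml_tensor::Tensor;

// Apply one transformer layer to a batch of hidden states.
fn custom_layer(hidden: &Tensor) -> Tensor {
    let config = TransformerConfig::default();  // assumes a Default impl
    let block = TransformerBlock::new(&config); // hypothetical constructor
    block.forward(hidden)                       // hypothetical forward method
}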

§Key Features

  • BERT (Bidirectional Encoder Representations from Transformers)
  • GPT-2 (Generative Pre-trained Transformer 2)
  • LLaMA (Large Language Model Meta AI) with RoPE and SwiGLU
  • Mistral with sliding window attention
  • Phi with partial rotary embeddings
  • KV-cache for efficient autoregressive generation (see the decoding sketch after this list)
  • Transformer building blocks (attention, feed-forward, positional encoding)
  • Text generation utilities
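
The KV-cache avoids recomputing attention keys and values for tokens already processed: each decode step feeds only the newest token and reads the prefix from the cache. The sketch below illustrates the idea only; `KVCache::new`, `forward_with_cache`, the `num_layers` field, and the `argmax_last` helper are all assumptions, not confirmed API.

use axonml_llm::{GPT2, GPT2Config, KVCache};
use axonml_tensor::Tensor;

fn decode_greedy(mut ids: Vec<u32>, steps: usize) -> Vec<u32> {
    let config = GPT2Config::small();
    let model = GPT2::new(&config);
    // One key/value slot per layer, reused across decode steps.
    let mut cache = KVCache::new(config.num_layers); // hypothetical constructor
    for _ in 0..steps {
        // Feed only the newest token; the cache supplies the history.
        let input = Tensor::from_vec(vec![*ids.last().unwrap()], &[1, 1]).unwrap();
        let logits = model.forward_with_cache(&input, &mut cache); // hypothetical method
        ids.push(argmax_last(&logits)); // argmax_last: hypothetical helper
    }
    ids
}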

§Example

use axonml_llm::{GPT2, GPT2Config};
use axonml_tensor::Tensor;

// Create a GPT-2 model
let config = GPT2Config::small();
let model = GPT2::new(&config);

// Run a forward pass over a single <|endoftext|> token (id 50256 in GPT-2's vocabulary)
let input_ids = Tensor::from_vec(vec![50256u32], &[1, 1]).unwrap();
let output = model.forward(&input_ids);
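
The `generation` module packages the sampling loop behind `TextGenerator` and `GenerationConfig`. Continuing the example above, a hedged sketch; the constructor, `Default` impl, and method names shown are assumptions, so consult the module docs for the real API.

use axonml_llm::{GenerationConfig, TextGenerator};

let gen_config = GenerationConfig::default();          // assumes a Default impl
let generator = TextGenerator::new(model, gen_config); // hypothetical constructor
let tokens = generator.generate(&input_ids);           // hypothetical method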

Version 0.2.0. Author: AutomataNexus Development Team.

Re-exports§

pub use error::LLMError;
pub use error::LLMResult;
pub use config::BertConfig;
pub use config::GPT2Config;
pub use config::TransformerConfig;
pub use attention::CausalSelfAttention;
pub use attention::FlashAttention;
pub use attention::FlashAttentionConfig;
pub use attention::KVCache;
pub use attention::LayerKVCache;
pub use attention::MultiHeadSelfAttention;
pub use attention::scaled_dot_product_attention;
pub use embedding::TokenEmbedding;
pub use embedding::PositionalEmbedding;
pub use embedding::BertEmbedding;
pub use embedding::GPT2Embedding;
pub use hub::PretrainedLLM;
pub use hub::llm_registry;
pub use hub::download_weights as download_llm_weights;
pub use hf_loader::HFLoader;
pub use hf_loader::load_llama_from_hf;
pub use hf_loader::load_mistral_from_hf;
pub use tokenizer::HFTokenizer;
pub use tokenizer::SpecialTokens;
pub use state_dict::LoadStateDict;
pub use state_dict::LoadResult;
pub use transformer::TransformerBlock;
pub use transformer::TransformerEncoder;
pub use transformer::TransformerDecoder;
pub use bert::Bert;
pub use bert::BertForSequenceClassification;
pub use bert::BertForMaskedLM;
pub use gpt2::GPT2;
pub use gpt2::GPT2LMHead;
pub use llama::LLaMA;
pub use llama::LLaMAConfig;
pub use llama::LLaMAForCausalLM;
pub use mistral::Mistral;
pub use mistral::MistralConfig;
pub use mistral::MistralForCausalLM;
pub use phi::Phi;
pub use phi::PhiConfig;
pub use phi::PhiForCausalLM;
pub use generation::GenerationConfig;
pub use generation::TextGenerator;
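
The `hf_loader` and `tokenizer` re-exports cover the common path of pulling weights and a tokenizer straight from the HuggingFace Hub. A minimal sketch follows, assuming `LLMResult` is a `Result` alias and that `load_llama_from_hf`, `HFTokenizer::from_pretrained`, and `encode` have the shapes shown; the real signatures are in the respective module docs.

use axonml_llm::{load_llama_from_hf, HFTokenizer, LLMResult};

fn load_and_encode() -> LLMResult<()> {
    // Repo id is illustrative; both calls below use hypothetical signatures.
    let model = load_llama_from_hf("meta-llama/Llama-2-7b-hf")?;
    let tokenizer = HFTokenizer::from_pretrained("meta-llama/Llama-2-7b-hf")?;
    let ids = tokenizer.encode("Hello, world!")?;
    let _ = (model, ids);
    Ok(())
}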

Modules§

attention
Attention Mechanisms Module
bert
BERT Model Implementation
config
Model Configuration Module
embedding
Embedding Module
error
Error types for the LLM module.
generation
Text Generation Utilities
gpt2
GPT-2 Model Implementation
hf_loader
HuggingFace Model Loader
hub
LLM Model Hub - Pretrained Language Model Weights
llama
LLaMA - Large Language Model Meta AI
mistral
Mistral - Efficient LLM Architecture
phi
Phi - Microsoft’s Small Language Models
state_dict
State Dictionary Loading
tokenizer
HuggingFace Tokenizer Support
transformer
Transformer Building Blocks