Crate axonml_llm

Nine LLM architectures for the AxonML framework.

Complete pure-Rust implementations:

- GPT-2 — decoder-only transformer
- LLaMA — split-halves RoPE + GQA + SwiGLU
- Mistral — sliding-window attention
- Phi — partial RoPE + GELU
- BERT — bidirectional encoder with classification/MLM heads
- SSM/Mamba — selective S6 scan + depthwise conv + SSMForCausalLM
- Hydra — hybrid SSM + windowed attention
- Chimera — sparse MoE + differential attention
- Trident — 1.58-bit ternary TernaryLinear, RoPE + GQA + ReLU²-gated FFN + SubLN, graph-preserving RepeatKVBackward, configs for 1B/3B/smoke

Shared building blocks: attention, RMSNorm, RotaryEmbedding, embedding, text generation (top-k/top-p/temperature), a HuggingFace weight loader, and a pretrained model hub.
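Among the shared building blocks, RMSNorm rescales activations by their root mean square — unlike LayerNorm, there is no mean subtraction or bias. A minimal self-contained sketch of the math in plain Rust (the crate's actual `RMSNorm` API may differ; this only illustrates the computation):

```rust
/// RMSNorm: y = x * g / sqrt(mean(x^2) + eps).
/// No mean-centering, unlike LayerNorm — only a scale by the RMS and a learned gain.
fn rms_norm(x: &[f32], gain: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq: f32 = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(gain).map(|(v, g)| v * inv_rms * g).collect()
}

fn main() {
    let x = [1.0_f32, 2.0, 3.0, 4.0];
    let g = [1.0_f32; 4];
    // mean(x^2) = 7.5, so each element is divided by sqrt(7.5) ≈ 2.7386.
    let y = rms_norm(&x, &g, 1e-6);
    println!("{:?}", y);
}
```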

§File

crates/axonml-llm/src/lib.rs

§Author

Andrew Jewell Sr. — AutomataNexus LLC (ORCID: 0009-0005-2158-7060)

§Updated

April 14, 2026 11:15 PM EST

§Disclaimer

Use at own risk. This software is provided “as is”, without warranty of any kind, express or implied. The author and AutomataNexus shall not be held liable for any damages arising from the use of this software.

Re-exports§

pub use attention::CausalSelfAttention;
pub use attention::FlashAttention;
pub use attention::FlashAttentionConfig;
pub use attention::KVCache;
pub use attention::LayerKVCache;
pub use attention::MultiHeadSelfAttention;
pub use attention::scaled_dot_product_attention;
pub use bert::Bert;
pub use bert::BertForMaskedLM;
pub use bert::BertForSequenceClassification;
pub use chimera::ChimeraConfig;
pub use chimera::ChimeraModel;
pub use config::BertConfig;
pub use config::GPT2Config;
pub use config::TransformerConfig;
pub use embedding::BertEmbedding;
pub use embedding::GPT2Embedding;
pub use embedding::PositionalEmbedding;
pub use embedding::TokenEmbedding;
pub use error::LLMError;
pub use error::LLMResult;
pub use generation::GenerationConfig;
pub use generation::TextGenerator;
pub use gpt2::GPT2;
pub use gpt2::GPT2LMHead;
pub use hf_loader::HFLoader;
pub use hf_loader::load_llama_from_hf;
pub use hf_loader::load_mistral_from_hf;
pub use hub::PretrainedLLM;
pub use hub::download_weights as download_llm_weights;
pub use hub::llm_registry;
pub use hydra::HydraConfig;
pub use hydra::HydraModel;
pub use llama::LLaMA;
pub use llama::LLaMAConfig;
pub use llama::LLaMAForCausalLM;
pub use mistral::Mistral;
pub use mistral::MistralConfig;
pub use mistral::MistralForCausalLM;
pub use phi::Phi;
pub use phi::PhiConfig;
pub use phi::PhiForCausalLM;
pub use ssm::SSMBlock;
pub use ssm::SSMConfig;
pub use ssm::SSMForCausalLM;
pub use state_dict::LoadResult;
pub use state_dict::LoadStateDict;
pub use tokenizer::HFTokenizer;
pub use tokenizer::SpecialTokens;
pub use transformer::TransformerBlock;
pub use transformer::TransformerDecoder;
pub use transformer::TransformerEncoder;
pub use trident::TridentConfig;
pub use trident::TridentModel;
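The re-exported `scaled_dot_product_attention` implements the standard softmax(QKᵀ/√d)·V primitive. As a self-contained sketch of that math for a single query and one head (plain Rust; the crate's real function signature is not assumed here):

```rust
/// softmax(q·kᵀ / √d) · V for one query vector over `n` key/value pairs (one head).
fn attend(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    let d = q.len() as f32;
    // Scaled dot-product scores against every key.
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() / d.sqrt())
        .collect();
    // Numerically stable softmax: subtract the max before exponentiating.
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    // Output is the attention-weighted average of the value vectors.
    let mut out = vec![0.0; values[0].len()];
    for (w, v) in exps.iter().zip(values) {
        for (o, x) in out.iter_mut().zip(v) {
            *o += (w / sum) * x;
        }
    }
    out
}

fn main() {
    // Two identical keys → uniform weights → output is the mean of the values.
    let out = attend(
        &[1.0, 0.0],
        &[vec![1.0, 0.0], vec![1.0, 0.0]],
        &[vec![2.0], vec![4.0]],
    );
    println!("{:?}", out);
}
```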

Modules§

attention
Attention Mechanisms Module
bert
BERT Model — Encoder-Only Transformer with Task Heads
chimera
Chimera — Mixture of Experts + Differential Attention Small Language Model
config
Model Configuration — Transformer, BERT, and GPT-2 Hyperparameters
embedding
Embedding Module — Token, Position, Segment, and Sinusoidal Embeddings
error
LLM Error Types — Failure Modes for Transformer Operations
generation
Text Generation — Decoding Strategies for Autoregressive LMs
gpt2
GPT-2 Model — Decoder-Only Transformer with LM Head
hf_loader
HuggingFace Model Loader
hub
LLM Model Hub — Pretrained Language Model Weights
hydra
Hydra — Hybrid SSM + Sparse Attention Small Language Model
llama
LLaMA — Large Language Model Meta AI
mistral
Mistral — Efficient LLM Architecture
phi
Phi — Microsoft’s Small Language Models
ssm
State Space Model (SSM) — Mamba-style Selective Scan
state_dict
State Dictionary Loading
tokenizer
HuggingFace Tokenizer Support
transformer
Transformer Building Blocks — Layer Norm, FFN, Encoder / Decoder Blocks
trident
Trident — 1.58-bit Ternary Weight Small Language Model
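Trident's `TernaryLinear` constrains weights to {-1, 0, +1}, i.e. log₂(3) ≈ 1.58 bits per weight. A self-contained sketch of absmean ternary quantization in the style of BitNet b1.58 — an assumption about the scheme; the crate's actual `TernaryLinear` may quantize differently:

```rust
/// Quantize weights to {-1, 0, +1} with an absmean scale (BitNet-b1.58 style):
///   scale = mean(|w|);  q_i = clamp(round(w_i / scale), -1, +1).
/// The dequantized weight is approximately q_i * scale. NOTE: this scheme is an
/// illustrative assumption, not necessarily what Trident's TernaryLinear does.
fn ternary_quantize(w: &[f32]) -> (Vec<i8>, f32) {
    let scale = w.iter().map(|x| x.abs()).sum::<f32>() / w.len() as f32;
    let q = w
        .iter()
        .map(|x| (x / (scale + 1e-8)).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (q, scale)
}

fn main() {
    // Large weights snap to ±1, small weights to 0.
    let (q, scale) = ternary_quantize(&[0.8, -0.05, -0.9, 0.1]);
    println!("q = {:?}, scale = {}", q, scale);
}
```

Ternary weights turn the matrix multiply inside a linear layer into additions and subtractions (plus one scale per tensor), which is the main efficiency argument for 1.58-bit models.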