§MiniLLM - A Mini Transformer Inference Engine
A lightweight, efficient transformer inference engine written in Rust. Supports GPT-2 style models with multi-head attention, feed-forward networks, and layer normalization.
§Features
- Dynamic tensor operations with ndarray
- SafeTensors weight loading from HuggingFace
- Complete GPT-2 architecture implementation
- Text generation with autoregressive sampling
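The autoregressive sampling mentioned above can be sketched in a few lines of plain Rust. This is a minimal illustration, not crate code: the toy `next_logits` scorer stands in for a real transformer forward pass, and greedy argmax decoding is used for simplicity.

```rust
// Toy stand-in for a transformer forward pass: scores each candidate
// token from the last token of the sequence. NOT part of minillm.
fn next_logits(tokens: &[u32], vocab: usize) -> Vec<f32> {
    let last = *tokens.last().unwrap() as usize;
    (0..vocab).map(|t| ((last + t) % vocab) as f32).collect()
}

// Index of the highest-scoring logit (greedy decoding).
fn argmax(logits: &[f32]) -> u32 {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i as u32)
        .unwrap()
}

// Autoregressive loop: repeatedly score, pick a token, and append it,
// feeding the growing sequence back in as context each step.
fn generate(prompt: &[u32], max_new_tokens: usize, vocab: usize) -> Vec<u32> {
    let mut tokens = prompt.to_vec();
    for _ in 0..max_new_tokens {
        let logits = next_logits(&tokens, vocab);
        tokens.push(argmax(&logits));
    }
    tokens
}

fn main() {
    // Prompt [1, 2], 3 new tokens, vocabulary of 10 token ids.
    println!("{:?}", generate(&[1, 2], 3, 10));
}
```

A real engine would replace `next_logits` with the model forward pass and could swap `argmax` for temperature or top-k sampling without changing the loop structure.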
§Example
use minillm::inference::InferenceEngine;
use minillm::Result;

fn main() -> Result<()> {
    let engine = InferenceEngine::new("openai-community/gpt2")?;
    let result = engine.generate("Hello world", 10)?;
    println!("Generated: {}", result);
    Ok(())
}
Re-exports§
pub use config::ModelConfig;
pub use gpt::GPTModel;
pub use inference::InferenceEngine;
pub use tensor::Tensor;
pub use weights::ModelWeights;
Modules§
- attention: Multi-head attention implementation
- config: Model configuration types (ModelConfig)
- gpt: GPT-style transformer model implementation
- inference: High-level inference engine for text generation
- mlp: Multi-Layer Perceptron (Feed-Forward Network) implementation
- tensor: Tensor operations for neural network computations
- transformer: Transformer block combining attention, MLP, and layer normalization
- weights: Model weight loading from SafeTensors files
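As a sketch of the kind of operation these modules implement, layer normalization (applied before attention and MLP in each GPT-2 block) can be written over a plain slice. This is an illustrative standalone version, assuming the usual formulation; the crate's `tensor` module API may differ.

```rust
// Layer normalization: normalize to zero mean and unit variance,
// then apply a learned per-element scale (gamma) and shift (beta).
fn layer_norm(x: &[f32], gamma: &[f32], beta: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    // eps keeps the division stable when the variance is near zero.
    let inv_std = 1.0 / (var + eps).sqrt();
    x.iter()
        .zip(gamma.iter().zip(beta.iter()))
        .map(|(v, (g, b))| (v - mean) * inv_std * g + b)
        .collect()
}

fn main() {
    // gamma = 1, beta = 0 leaves the pure normalization visible.
    let y = layer_norm(&[1.0, 2.0, 3.0, 4.0], &[1.0; 4], &[0.0; 4], 1e-5);
    println!("{:?}", y);
}
```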
Type Aliases§
- Result: Result type used throughout the library
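A crate-wide `Result` alias typically fixes the error type so signatures stay short. The sketch below shows the pattern with a hypothetical `Error` enum; the actual error type used by minillm is not shown on this page and may differ.

```rust
use std::fmt;

// Hypothetical error type, for illustration only.
#[derive(Debug)]
enum Error {
    Shape(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Shape(msg) => write!(f, "shape error: {}", msg),
        }
    }
}

// Crate-wide alias: callers write `Result<T>` instead of
// `std::result::Result<T, Error>`.
type Result<T> = std::result::Result<T, Error>;

fn check_dims(rows: usize, cols: usize) -> Result<usize> {
    if rows == 0 || cols == 0 {
        return Err(Error::Shape("empty dimension".into()));
    }
    Ok(rows * cols)
}

fn main() {
    println!("{:?}", check_dims(2, 3));
}
```

With this alias in place, the `?` operator propagates `Error` through any function returning `Result<T>`, which is what makes the short example at the top of this page compile.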