Crate minillm


§MiniLLM - A Mini Transformer Inference Engine

A lightweight, efficient transformer inference engine written in Rust. Supports GPT-2 style models with multi-head attention, feed-forward networks, and layer normalization.
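Of the building blocks named above, layer normalization is the simplest to show in isolation. The sketch below is a minimal standalone version over plain slices, not the crate's actual `Tensor`-based implementation; the function name and signature are illustrative assumptions:

```rust
/// Normalize a slice to zero mean and unit variance, then apply a
/// learned scale (gamma) and shift (beta), as in GPT-2 LayerNorm.
/// Illustrative sketch only; minillm's real version operates on tensors.
fn layer_norm(x: &[f32], gamma: &[f32], beta: &[f32], eps: f32) -> Vec<f32> {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + eps).sqrt();
    x.iter()
        .zip(gamma.iter().zip(beta.iter()))
        .map(|(v, (g, b))| (v - mean) * inv_std * g + b)
        .collect()
}

fn main() {
    // Identity scale/shift: output is just the normalized input.
    let out = layer_norm(&[1.0, 2.0, 3.0, 4.0], &[1.0; 4], &[0.0; 4], 1e-5);
    println!("{:?}", out);
}
```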

§Features

  • Dynamic tensor operations with ndarray
  • SafeTensors weight loading from HuggingFace
  • Complete GPT-2 architecture implementation
  • Text generation with autoregressive sampling
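The last feature, autoregressive sampling, amounts to repeatedly running the model on the tokens produced so far and appending one new token each step. A self-contained sketch with greedy (argmax) decoding over a toy stand-in for the model (all names here are hypothetical, not minillm's API):

```rust
/// Index of the largest logit (greedy decoding).
fn argmax(logits: &[f32]) -> usize {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap()
}

/// Autoregressive loop: feed the running context back into the model,
/// append the argmax token, repeat for `steps` iterations.
fn generate(mut tokens: Vec<usize>, steps: usize, model: impl Fn(&[usize]) -> Vec<f32>) -> Vec<usize> {
    for _ in 0..steps {
        let logits = model(&tokens);
        tokens.push(argmax(&logits));
    }
    tokens
}

fn main() {
    // Toy "model" over a 4-token vocabulary: always favors (last + 1) mod 4.
    let model = |ctx: &[usize]| {
        let mut logits = vec![0.0f32; 4];
        logits[(ctx.last().unwrap() + 1) % 4] = 1.0;
        logits
    };
    let out = generate(vec![0], 5, model);
    println!("{:?}", out); // [0, 1, 2, 3, 0, 1]
}
```

A real engine would add temperature or top-k sampling in place of the plain argmax, but the loop structure is the same.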

§Example

use minillm::inference::InferenceEngine;

fn main() -> minillm::Result<()> {
    let engine = InferenceEngine::new("openai-community/gpt2")?;
    let result = engine.generate("Hello world", 10)?;
    println!("Generated: {}", result);
    Ok(())
}

Re-exports§

pub use config::ModelConfig;
pub use gpt::GPTModel;
pub use inference::InferenceEngine;
pub use tensor::Tensor;
pub use weights::ModelWeights;

Modules§

attention
Multi-head self-attention implementation
config
Model configuration (ModelConfig)
gpt
GPT-style transformer model implementation
inference
High-level inference engine for text generation
mlp
Multi-Layer Perceptron (Feed-Forward Network) implementation
tensor
Tensor operations for neural network computations
transformer
Transformer block combining attention, feed-forward, and layer normalization
weights
SafeTensors model weight loading (ModelWeights)

Type Aliases§

Result
Result type used throughout the library
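A library-wide `Result` alias typically fixes the crate's error type so signatures can write `Result<T>` instead of `Result<T, Error>`. A sketch of that pattern; the error variants shown are assumptions, not minillm's actual error type:

```rust
// Hypothetical error type; minillm's actual variants may differ.
#[derive(Debug)]
pub enum Error {
    WeightLoad(String),
    Shape(String),
}

// The common alias pattern: fix the error type once, library-wide.
pub type Result<T> = std::result::Result<T, Error>;

fn check_dims(a: usize, b: usize) -> Result<()> {
    if a == b {
        Ok(())
    } else {
        Err(Error::Shape(format!("{a} != {b}")))
    }
}

fn main() {
    assert!(check_dims(3, 3).is_ok());
    assert!(check_dims(3, 4).is_err());
}
```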