metal-candle: Production-quality Rust ML for Apple Silicon
metal-candle is a machine learning library built on Candle
with a Metal backend, providing LoRA training, model loading, and text generation
for transformer models on Apple Silicon.
§Features
- LoRA Training: Fine-tune transformer models efficiently using Low-Rank Adaptation
- Model Loading: Support for the safetensors format, with extensibility for others
- Text Generation: High-level Generator API with streaming, repetition penalty, and stop conditions
- Sampling Strategies: Greedy, Top-k, Top-p (nucleus), and Temperature sampling
- Metal Acceleration: Native Metal backend for optimal Apple Silicon performance
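The Top-p (nucleus) strategy listed above restricts sampling to the smallest set of tokens whose cumulative probability reaches p. A minimal sketch of that selection step in plain Rust (illustrative only, not this crate's sampling implementation; `top_p_candidates` is a hypothetical helper):

```rust
// Illustrative top-p (nucleus) candidate selection over a toy
// probability distribution. Not metal-candle's implementation;
// it only shows the idea behind SamplingStrategy::TopP.
fn top_p_candidates(probs: &[f32], p: f32) -> Vec<usize> {
    // Sort token indices by descending probability.
    let mut indices: Vec<usize> = (0..probs.len()).collect();
    indices.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    // Keep the smallest prefix whose cumulative mass reaches p.
    let mut kept = Vec::new();
    let mut cumulative = 0.0;
    for &i in &indices {
        kept.push(i);
        cumulative += probs[i];
        if cumulative >= p {
            break;
        }
    }
    kept
}

fn main() {
    let probs = [0.5, 0.3, 0.15, 0.05];
    // Tokens 0, 1, 2 reach cumulative mass 0.95 >= 0.9.
    println!("{:?}", top_p_candidates(&probs, 0.9)); // [0, 1, 2]
}
```

A real sampler would then renormalize over the kept candidates and draw from that truncated distribution.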
§Examples
§Text Generation
use metal_candle::inference::{GeneratorConfig, SamplingStrategy};

// Configure generation
let gen_config = GeneratorConfig {
    max_tokens: 128,
    sampling: SamplingStrategy::TopP { p: 0.95 },
    repetition_penalty: 1.1,
    ..Default::default()
};
// With a loaded model, you would use:
// let mut generator = Generator::new(Box::new(model), gen_config)?;
// let output = generator.generate(&input_ids)?;
§Project Status
v1.1.0: Production-ready text generation API with comprehensive testing and documentation.
Re-exports§
pub use backend::Device;
pub use backend::DeviceInfo;
pub use backend::DeviceType;
pub use backend::TensorExt;
pub use error::Error;
pub use error::Result;
pub use inference::apply_repetition_penalty;
pub use inference::sample_token;
pub use inference::Generator;
pub use inference::GeneratorConfig;
pub use inference::KVCache;
pub use inference::KVCacheConfig;
pub use inference::SamplingStrategy;
pub use training::cross_entropy_loss;
pub use training::cross_entropy_loss_with_smoothing;
pub use training::AdamW;
pub use training::AdamWConfig;
pub use training::LRScheduler;
pub use training::LoRAAdapter;
pub use training::LoRAAdapterConfig;
pub use training::LoRAConfig;
pub use training::LoRALayer;
pub use training::StepMetrics;
pub use training::TargetModule;
pub use training::Trainer;
pub use training::TrainingConfig;
pub use training::TrainingStep;
Modules§
- backend
- Backend abstraction layer for Metal device operations.
- embeddings
- Sentence-transformer embeddings for semantic search and RAG.
- error
- Error types for metal-candle.
- graph
- Computation graph for lazy evaluation.
- inference
- Inference and text generation for transformer models.
- models
- Model loading and architecture implementations.
- training
- Training utilities for LoRA fine-tuning.
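Low-Rank Adaptation freezes the base weight W and learns a low-rank update, so the adapted forward pass computes y = W·x + (alpha/r)·B(A·x), where A is r×d_in and B is d_out×r. A toy sketch of that computation in plain Rust (illustrative only, not this crate's LoRALayer; names and the Vec-based matrix layout are assumptions for the example):

```rust
// Toy LoRA forward pass on row-major Vec<Vec<f32>> matrices.
// Illustrative only: shows y = W*x + (alpha/r) * B*(A*x).
fn matvec(m: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(w, xi)| w * xi).sum())
        .collect()
}

fn lora_forward(
    w: &[Vec<f32>], // frozen base weight, d_out x d_in
    a: &[Vec<f32>], // trainable down-projection, r x d_in
    b: &[Vec<f32>], // trainable up-projection, d_out x r
    alpha: f32,
    x: &[f32],
) -> Vec<f32> {
    let r = a.len() as f32; // rank = number of rows in A
    let base = matvec(w, x);
    let update = matvec(b, &matvec(a, x));
    base.iter()
        .zip(&update)
        .map(|(y, u)| y + (alpha / r) * u)
        .collect()
}

fn main() {
    let w = vec![vec![1.0, 0.0], vec![0.0, 1.0]]; // identity base weight
    let a = vec![vec![1.0, 1.0]]; // rank r = 1
    let b = vec![vec![0.5], vec![0.5]];
    // base = [2, 4]; A*x = [6]; B*(A*x) = [3, 3]; scale alpha/r = 2.
    println!("{:?}", lora_forward(&w, &a, &b, 2.0, &[2.0, 4.0])); // [8.0, 10.0]
}
```

Because only A and B are trained, the number of trainable parameters is r·(d_in + d_out) per adapted layer rather than d_in·d_out.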
Constants§
- VERSION
- Current version of the crate.