docs.rs failed to build pmetal-0.3.7
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
# pmetal
Powdered Metal — High-performance LLM fine-tuning framework for Apple Silicon, written in Rust.
This is the umbrella crate that re-exports all PMetal sub-crates behind feature flags. Add a single dependency to access the full framework:
```toml
[dependencies]
pmetal = "0.3"                                    # default features
pmetal = { version = "0.3", features = ["full"] } # everything
```
## Quick Start
### Fine-tune a model
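The original code example did not survive on this page (only a stray `async` keyword remains). As an illustrative stand-in, here is a minimal sketch of what a builder-style fine-tuning call could look like. All type and method names (`FineTuner`, `dataset`, `lora_rank`, `train`) are hypothetical stubs, not the real `pmetal` API; they are defined locally so the snippet compiles on its own.

```rust
// Hypothetical sketch of the high-level `easy` builder API shape.
// Every name here is an illustrative stub, not the real pmetal API.
struct FineTuner {
    model: String,
    dataset: String,
    lora_rank: usize,
}

impl FineTuner {
    fn new(model: &str) -> Self {
        Self { model: model.into(), dataset: String::new(), lora_rank: 16 }
    }
    fn dataset(mut self, path: &str) -> Self { self.dataset = path.into(); self }
    fn lora_rank(mut self, r: usize) -> Self { self.lora_rank = r; self }
    // In the real crate this would presumably be `async` (matching the
    // residue above) and drive the actual training loop.
    fn train(self) -> String {
        format!("fine-tuning {} on {} (LoRA rank {})",
                self.model, self.dataset, self.lora_rank)
    }
}

fn main() {
    let status = FineTuner::new("meta-llama/Llama-3.2-1B")
        .dataset("data/train.jsonl")
        .lora_rank(16)
        .train();
    println!("{status}");
}
```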
### Run inference
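The inference example was likewise lost in extraction. The sketch below shows one plausible shape for a high-level generation call; `Generator`, `load`, `max_tokens`, and `generate` are hypothetical stubs defined locally so the snippet compiles standalone, not the real `pmetal` API.

```rust
// Hypothetical sketch of a high-level generation call; all names are
// illustrative stubs, not the real pmetal API.
struct Generator {
    model: String,
    max_tokens: usize,
}

impl Generator {
    fn load(model: &str) -> Self {
        Self { model: model.into(), max_tokens: 128 }
    }
    fn max_tokens(mut self, n: usize) -> Self { self.max_tokens = n; self }
    // The real crate's equivalent would presumably be async and run on the
    // GPU/ANE; this stub just echoes its configuration.
    fn generate(&self, prompt: &str) -> String {
        format!("[{} | {} tokens] {}", self.model, self.max_tokens, prompt)
    }
}

fn main() {
    let out = Generator::load("qwen2.5-0.5b")
        .max_tokens(64)
        .generate("Explain LoRA in one sentence.");
    println!("{out}");
}
```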
### Query device info
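No code survived for this example either. As a stand-in, here is an illustrative stub of what a device-info query could return, mirroring the capabilities the Hardware Support section says PMetal auto-detects. The struct, fields, and detection logic are all hypothetical; a real implementation would probe the hardware.

```rust
// Illustrative stub of a device-info query; fields and names are
// hypothetical, not the real pmetal API.
#[derive(Debug)]
struct DeviceInfo {
    chip: String,
    gpu_cores: u32,
    ane: bool,         // Apple Neural Engine present
    ultrafusion: bool, // multi-die (Ultra) package
}

fn detect() -> DeviceInfo {
    // A real implementation would probe the hardware; this returns fixed
    // sample data for illustration.
    DeviceInfo {
        chip: "Apple M3 Max".into(),
        gpu_cores: 40,
        ane: true,
        ultrafusion: false,
    }
}

fn main() {
    let d = detect();
    println!("{} ({} GPU cores, ANE: {}, UltraFusion: {})",
             d.chip, d.gpu_cores, d.ane, d.ultrafusion);
}
```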
## Feature Flags
| Feature | Crate | Default | Description |
|---|---|---|---|
| `core` | `pmetal-core` | yes | Foundation types, configs, traits |
| `gguf` | `pmetal-gguf` | yes | GGUF format with imatrix quantization |
| `metal` | `pmetal-metal` | yes | Custom Metal GPU kernels + ANE runtime |
| `hub` | `pmetal-hub` | yes | HuggingFace Hub integration |
| `mlx` | `pmetal-mlx` | yes | MLX backend (KV cache, RoPE, ops) |
| `models` | `pmetal-models` | yes | LLM architectures (Llama, Qwen, DeepSeek, ...) |
| `lora` | `pmetal-lora` | yes | LoRA/QLoRA training |
| `trainer` | `pmetal-trainer` | yes | Training loops (SFT, DPO, GRPO, DAPO) |
| `easy` | all training/inference | yes | High-level builder API |
| `ane` | ANE integration | yes | Apple Neural Engine direct programming |
| `data` | `pmetal-data` | no | Dataset loading and preprocessing |
| `distill` | `pmetal-distill` | no | Knowledge distillation (cross-vocab) |
| `merge` | `pmetal-merge` | no | Model merging (SLERP, TIES, DARE, ModelStock) |
| `vocoder` | `pmetal-vocoder` | no | BigVGAN neural vocoder |
| `distributed` | `pmetal-distributed` | no | Distributed training (mDNS, Ring All-Reduce) |
| `mhc` | `pmetal-mhc` | no | Manifold-Constrained Hyper-Connections |
| `lora-metal-fused` | — | no | Fused Metal kernels for ~2x LoRA speedup |
| `full` | all of the above | no | Everything |
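Since the defaults pull in the full training stack, a leaner build can disable them and opt into specific sub-crates. A hypothetical `Cargo.toml` sketch using standard Cargo feature syntax and feature names from the table above:

```toml
[dependencies]
pmetal = { version = "0.3", default-features = false, features = ["core", "gguf", "merge"] }
```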
## Hardware Support
PMetal auto-detects Apple Silicon capabilities and tunes kernel parameters per device:
- M1–M5 families (Base, Pro, Max, Ultra)
- NAX (Neural Accelerators in GPU) on M5/Apple10
- ANE (Apple Neural Engine) with CPU RMSNorm workaround for fp16 stability
- UltraFusion multi-die topology detection
- Tier-based tuning: FlashAttention block sizes, GEMM tile sizes, threadgroup sizes, batch multipliers
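As a sketch of what tier-based tuning could mean in practice, the snippet below maps a detected chip tier to FlashAttention block sizes. The enum variants and every number are hypothetical, chosen only to illustrate the dispatch pattern, not taken from pmetal's actual tuning tables.

```rust
// Hypothetical tier table: map a detected Apple Silicon tier to
// FlashAttention (block_m, block_n) tile sizes. All numbers are
// illustrative, not pmetal's real tuning values.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Base,
    Pro,
    Max,
    Ultra,
}

fn flash_attention_blocks(tier: Tier) -> (usize, usize) {
    match tier {
        Tier::Base => (32, 32),
        Tier::Pro => (64, 32),
        Tier::Max => (64, 64),
        Tier::Ultra => (128, 64),
    }
}

fn main() {
    let (bm, bn) = flash_attention_blocks(Tier::Max);
    println!("block_m={bm}, block_n={bn}");
}
```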
## Examples
- Device info
- Fine-tuning (easy API)
- Inference (easy API)
- Manual fine-tuning (lower-level control)
## Re-exports
All sub-crates are available as modules:
```rust
use pmetal::core;    // pmetal-core
use pmetal::metal;   // pmetal-metal
use pmetal::mlx;     // pmetal-mlx
use pmetal::models;  // pmetal-models
use pmetal::lora;    // pmetal-lora
use pmetal::trainer; // pmetal-trainer
use pmetal::hub;     // pmetal-hub
use pmetal::gguf;    // pmetal-gguf
use pmetal::*;       // commonly used types from all crates
```
## License
Licensed under either the MIT or Apache-2.0 license, at your option.