docs.rs failed to build pmetal-mlx-0.1.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
pmetal-mlx
MLX backend integration with advanced training utilities.
Overview
This crate provides the bridge between PMetal and Apple's MLX framework, along with custom implementations of training utilities that the base MLX library does not provide.
Features
- Quantization: NF4, FP4, Int8 implementations
- Gradient Checkpointing: Memory-efficient training for large models
- KV Cache: Efficient key-value caching for inference
- Mixture of Experts: MoE layer implementations
- NEFTune: Noise injection for improved fine-tuning
- Sequence Packing: Efficient batching for variable-length sequences
- Speculative Decoding: Faster inference with draft models
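To make the NEFTune item concrete, here is a minimal sketch of the noise injection as it is usually described: uniform noise in [-1, 1] is added to the token embeddings during training, scaled by alpha / sqrt(seq_len * dim). This is not the pmetal-mlx API. The function name `neftune_noise`, its parameters, and the small `XorShift64` generator are hypothetical and only keep the example free of external crates.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// Hypothetical helper: a real implementation would use the framework's RNG.
struct XorShift64(u64);

impl XorShift64 {
    /// Returns a pseudo-random value in [-1.0, 1.0).
    fn next_symmetric(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Use the top 24 bits to build a float in [0, 1), then map to [-1, 1).
        let unit = (self.0 >> 40) as f32 / (1u64 << 24) as f32;
        2.0 * unit - 1.0
    }
}

/// Adds NEFTune-style noise in place to a flattened [seq_len, dim] embedding
/// matrix and returns the noise scale that was applied.
/// Assumption: `alpha` is the user-chosen noise strength.
fn neftune_noise(
    embeddings: &mut [f32],
    seq_len: usize,
    dim: usize,
    alpha: f32,
    seed: u64,
) -> f32 {
    assert_eq!(embeddings.len(), seq_len * dim);
    // Scale shrinks as the embedding matrix grows: alpha / sqrt(L * d).
    let scale = alpha / ((seq_len * dim) as f32).sqrt();
    // xorshift must not start from a zero state.
    let mut rng = XorShift64(seed.max(1));
    for x in embeddings.iter_mut() {
        *x += scale * rng.next_symmetric();
    }
    scale
}
```

Calling `neftune_noise(&mut emb, seq_len, dim, 5.0, 42)` on a training batch perturbs every element by at most the returned scale. The noise is meant for training only, so inference would skip this step.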
Usage
use *;
// Create a KV cache for inference
let cache = new;
// Use sequence packing for training
let packed = pack?;
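As an illustration of what sequence packing involves, the sketch below groups variable-length sequences into fixed-capacity bins with a greedy first-fit-decreasing strategy. It is an assumption about the general technique, not the crate's actual algorithm or API. The name `pack_first_fit`, the `max_len` parameter, and the policy of skipping over-long sequences are hypothetical.

```rust
/// Greedy first-fit-decreasing packing (illustrative, not the crate's API).
/// Returns bins of sequence indices whose total length fits in `max_len`.
fn pack_first_fit(lengths: &[usize], max_len: usize) -> Vec<Vec<usize>> {
    // Visit sequences longest first; the sort is stable, so ties keep input order.
    let mut order: Vec<usize> = (0..lengths.len()).collect();
    order.sort_by(|&a, &b| lengths[b].cmp(&lengths[a]));

    let mut bins: Vec<Vec<usize>> = Vec::new();
    let mut used: Vec<usize> = Vec::new(); // tokens already placed in each bin

    for idx in order {
        let len = lengths[idx];
        if len > max_len {
            // Hypothetical policy: drop sequences that can never fit.
            continue;
        }
        // Place the sequence in the first bin with enough room, else open a new bin.
        match used.iter().position(|&u| u + len <= max_len) {
            Some(b) => {
                bins[b].push(idx);
                used[b] += len;
            }
            None => {
                bins.push(vec![idx]);
                used.push(len);
            }
        }
    }
    bins
}
```

For lengths `[5, 3, 8, 2, 6]` and `max_len = 8`, this yields three bins instead of five padded rows. A real training pipeline would typically also reset position ids and mask attention across sequence boundaries inside each bin.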
Modules
| Module | Description |
|---|---|
| `kernels` | Custom MLX kernels (cross entropy, RMS norm, etc.) |
| `quantization` | Weight quantization implementations |
| `gradient_checkpoint` | Memory-efficient gradient computation |
| `kv_cache` | Key-value cache for efficient inference |
| `moe` | Mixture of Experts support |
| `neftune` | NEFTune noise injection |
| `sequence_packing` | Efficient sequence batching |
| `speculative` | Speculative decoding utilities |
Quantization Formats
| Format | Bits | Memory Savings (vs. 16-bit) | Quality |
|---|---|---|---|
| NF4 | 4 | 75% | High |
| FP4 | 4 | 75% | Medium |
| Int8 | 8 | 50% | Very High |
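The savings column is consistent with a 16-bit baseline, since 1 - 4/16 = 75% and 1 - 8/16 = 50%. The sketch below works through that arithmetic. The 16-bit baseline is an inference from the table rather than something the crate states, and the helper names `memory_savings` and `weight_bytes` are hypothetical.

```rust
/// Fraction of weight memory saved by storing `bits` per parameter
/// instead of `baseline_bits` (assumed to be 16 for the table above).
fn memory_savings(bits: u32, baseline_bits: u32) -> f64 {
    1.0 - bits as f64 / baseline_bits as f64
}

/// Approximate weight storage in bytes for `params` parameters.
/// Ignores per-block scales and other quantization metadata.
fn weight_bytes(params: u64, bits: u32) -> u64 {
    params * bits as u64 / 8
}
```

Under these assumptions, a 7-billion-parameter model takes about 14 GB at 16 bits, 7 GB at Int8, and 3.5 GB at NF4 or FP4. Actual usage is somewhat higher because block-wise formats also store scale factors.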
License
MIT OR Apache-2.0