End-to-end knowledge distillation CLI.
This crate provides a complete pipeline for knowledge distillation:
- Fetch teacher models from HuggingFace
- Configure distillation parameters via YAML
- Train student models with progressive/attention distillation
- Export to SafeTensors, GGUF, or APR formats
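The training step above typically minimizes a soft-target objective. A minimal sketch of the classic temperature-scaled distillation loss (KL divergence between teacher and student distributions, scaled by T²) is shown below; the function names are illustrative, not this crate's actual API.

```rust
// Softmax with temperature T: softening the logits (T > 1) exposes the
// teacher's "dark knowledge" about relative class similarities.
fn softmax(logits: &[f64], temperature: f64) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits
        .iter()
        .map(|&z| ((z - max) / temperature).exp())
        .collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// KL(teacher || student) on temperature-softened distributions,
/// multiplied by T^2 to keep gradient magnitudes comparable across T.
fn distillation_loss(teacher_logits: &[f64], student_logits: &[f64], t: f64) -> f64 {
    let p = softmax(teacher_logits, t);
    let q = softmax(student_logits, t);
    let kl: f64 = p.iter().zip(&q).map(|(&pi, &qi)| pi * (pi / qi).ln()).sum();
    kl * t * t
}

fn main() {
    let teacher = [2.0, 1.0, 0.1];
    let student = [1.5, 1.2, 0.3];
    println!("loss = {:.4}", distillation_loss(&teacher, &student, 2.0));
}
```

In practice this term is blended with the ordinary cross-entropy on hard labels via a mixing weight, one of the hyperparameters a YAML config would expose.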
Toyota Way Principles
- Jidoka: Pre-flight validation catches errors before expensive training
- Heijunka: Memory estimation enables level scheduling of GPU resources
- Kaizen: Configurable hyperparameters enable continuous improvement
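To illustrate the Heijunka point, a pre-flight memory estimate can be as simple as summing the persistent per-parameter buffers: weights, gradients, and Adam's two moment buffers. This is a hedged sketch of the idea, not the crate's actual formula; activation memory is workload-dependent and omitted here.

```rust
// Rough lower bound on training memory: weights + gradients + two Adam
// moment buffers, all assumed to be stored in the same dtype.
// (Assumption for illustration only; real estimators also account for
// activations, mixed-precision master weights, and framework overhead.)
fn estimate_training_bytes(param_count: u64, bytes_per_param: u64) -> u64 {
    let weights = param_count * bytes_per_param;
    let grads = weights; // one gradient per parameter
    let optimizer = 2 * weights; // Adam: first and second moments
    weights + grads + optimizer
}

fn main() {
    // e.g. a 1.1B-parameter student trained in fp16 (2 bytes/param)
    let bytes = estimate_training_bytes(1_100_000_000, 2);
    println!("~{:.1} GiB", bytes as f64 / (1024.0 * 1024.0 * 1024.0));
}
```

Computing such a bound before launching a run is what lets a scheduler pack jobs onto GPUs evenly instead of discovering out-of-memory failures mid-training.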