## Features
- Quantization
  - bitsandbytes format (fp4, nf4, and int8)
  - GGUF (2-8 bit quantization)
- Easy: Strong support for running 🤗 DDUF models.
- Strong Apple Silicon support via the Metal, Accelerate, and ARM NEON frameworks
- Support for NVIDIA GPUs with CUDA
- AVX support for x86 CPUs
- Acceleration of models larger than the total VRAM size via offloading (a sketch follows this list)
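
As a minimal sketch of offloading, assuming the `Offloading::Full` mode and the `Pipeline::load` argument order used in the Rust example further below (both are assumptions and may differ across versions of the crate):

```rust
use diffusion_rs_core::{ModelDType, ModelSource, Offloading, Pipeline, TokenSource};

// Assumption: `Offloading::Full` offloads weights that do not fit in VRAM;
// passing `None` instead disables offloading.
let pipeline = Pipeline::load(
    ModelSource::dduf("FLUX.1-dev-Q4-bnb.dduf")?,
    false,
    TokenSource::CacheToken,
    None,
    Some(Offloading::Full),
    &ModelDType::Auto,
)?;
```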
Please do not hesitate to contact us with feature requests via GitHub issues!
## Upcoming features
- 🚧 LoRA support
- 🚧 CPU + GPU inference with automatic offloading to allow partial acceleration of models larger than the total VRAM
## Installation
Check out the installation guide for details.
## Examples
After installing, you can try out these examples!
Download the DDUF file here:
```bash
wget https://huggingface.co/DDUF/FLUX.1-dev-DDUF/resolve/main/FLUX.1-dev-Q4-bnb.dduf
```
CLI:
More CLI examples here.
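
For illustration, a hypothetical invocation — the `diffusion_rs_cli` binary name and its flags are assumptions here, so defer to the linked CLI examples:

```bash
# Assumed binary name and flags; see the CLI examples for the real interface.
diffusion_rs_cli --scale 3.5 --num-steps 50 dduf -f FLUX.1-dev-Q4-bnb.dduf
```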
Python:
More Python examples here.
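
As a hypothetical sketch mirroring the Rust API — the `diffusion_rs` module and every class and parameter name below are assumptions, so defer to the linked Python examples for the real binding:

```python
# All names here are assumptions mirroring the Rust API; consult the Python
# examples for the actual binding.
from diffusion_rs import DiffusionGenerationParams, ModelSource, Pipeline

pipeline = Pipeline(ModelSource.DdufFile("FLUX.1-dev-Q4-bnb.dduf"))
images = pipeline.forward(
    prompts=["Draw a picture of a sunrise."],
    params=DiffusionGenerationParams(
        height=720, width=1280, num_steps=50, guidance_scale=3.5
    ),
)
images[0].save("image.png")
```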
Rust crate:
More Rust crate examples here.
```rust
use std::time::Instant;

use diffusion_rs_core::{DiffusionGenerationParams, ModelDType, ModelSource, Pipeline, TokenSource};
use tracing::level_filters::LevelFilter;
use tracing_subscriber::EnvFilter;

// Set up logging, honoring `RUST_LOG` with INFO as the default level.
let filter = EnvFilter::builder()
    .with_default_directive(LevelFilter::INFO.into())
    .from_env_lossy();
tracing_subscriber::fmt().with_env_filter(filter).init();

// Load the quantized DDUF file downloaded above.
let pipeline = Pipeline::load(
    ModelSource::dduf("FLUX.1-dev-Q4-bnb.dduf")?,
    false,
    TokenSource::CacheToken,
    None,
    None,
    &ModelDType::Auto,
)?;

// Generate one image from a text prompt and time it (settings are illustrative).
let start = Instant::now();
let images = pipeline.forward(
    vec!["Draw a picture of a sunrise.".to_string()],
    DiffusionGenerationParams { height: 720, width: 1280, num_steps: 50, guidance_scale: 3.5 },
)?;
let end = Instant::now();
println!("Took: {:.2}s", end.duration_since(start).as_secs_f32());

images[0].save("image.png")?;
```
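
To run on a GPU, the matching backend feature must be enabled at build time. The crate and feature names below are assumptions inferred from the hardware support listed above; check the installation guide for the authoritative ones.

```bash
# Assumed crate and feature names; see the installation guide for the real ones.
cargo add diffusion_rs_core --features cuda   # or: --features metal
```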
## Support matrix

| Model | Supports DDUF | Supports quantized DDUF |
| --- | --- | --- |
| FLUX.1 Dev/Schnell | ✅ | ✅ |
## Contributing
- Anyone is welcome to contribute by opening PRs
- See good first issues for a starting point!
- Collaborators will be invited based on past contributions