# mold

Generate images from text on your own GPU. No cloud, no Python, no fuss.

```shell
mold run "a cat"
```

That's it. Mold auto-downloads the model on first run and saves the image to your current directory.
## Install

### Nix (recommended)

Run it directly with `nix run` (no install needed), or add it to your system configuration.

### From source

Clone the repository and build with `cargo build --release`.
## Usage

```shell
# Generate an image
mold run "a cat"

# Pick a model
mold run flux-schnell "a cat"

# Reproducible results (the logo above was generated this way;
# flag name is illustrative)
mold run "a cat" --seed 42

# Custom size and steps (flag names are illustrative)
mold run "a cat" --width 1024 --height 1024 --steps 8
```
## Piping

Mold is pipe-friendly in both directions. When stdout is not a terminal, raw image bytes go to stdout and status/progress goes to stderr.

```shell
# (viewer/converter commands and the `-` stdin convention are illustrative)

# Pipe output to an image viewer
mold run "a cat" | display -

# Pipe prompt from stdin
echo "a cat" | mold run -

# Chain with other tools
fortune | mold run - | display -

# Pipe in and out
echo "a cat" | mold run - | magick - -resize 50% small.png
```
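The terminal check behind this rule can be sketched in a few lines (illustrative Python; mold itself is Rust, and `emit` is a hypothetical helper, not mold's actual code):

```python
import sys

def emit(image_bytes: bytes, to_tty: bool, path: str = "out.png") -> str:
    """Mirror the piping rule above: save to a file when stdout is a
    terminal, otherwise stream raw bytes to stdout; status always goes
    to stderr. Hypothetical sketch, not mold's actual code."""
    if to_tty:
        with open(path, "wb") as f:
            f.write(image_bytes)
        print(f"saved {path}", file=sys.stderr)
        return path
    sys.stdout.buffer.write(image_bytes)
    print("image written to stdout", file=sys.stderr)
    return "-"

# Real callers would pass sys.stdout.isatty() as `to_tty`.
```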
## Manage models
## Remote rendering

Run mold on a beefy GPU server, generate from anywhere:

```shell
# On your GPU server
mold serve

# From your laptop
MOLD_HOST=http://gpu-server:7680 mold run "a cat"
```
## Models

### FLUX (best quality)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `flux-schnell:q8` | 4 | 12GB | Fast, general purpose |
| `flux-schnell:q4` | 4 | 7.5GB | Same but lighter |
| `flux-dev:q8` | 25 | 12GB | Full quality |
| `flux-dev:q4` | 25 | 7GB | Full quality, less VRAM |
| `flux-krea:q8` | 25 | 12.7GB | Aesthetic photography |
### SDXL (fast + flexible)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sdxl-turbo:fp16` | 4 | 5.1GB | Ultra-fast, 1-4 steps |
| `dreamshaper-xl:fp16` | 8 | 5.1GB | Fantasy, concept art |
| `juggernaut-xl:fp16` | 30 | 5.1GB | Photorealism, cinematic |
| `realvis-xl:fp16` | 25 | 5.1GB | Photorealism, versatile |
| `playground-v2.5:fp16` | 25 | 5.1GB | Artistic, aesthetic |
| `sdxl-base:fp16` | 25 | 5.1GB | Official base model |
### SD 1.5 (lightweight)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sd15:fp16` | 25 | 1.7GB | Base model, huge ecosystem |
| `dreamshaper-v8:fp16` | 25 | 1.7GB | Best all-around SD1.5 |
| `realistic-vision-v5:fp16` | 25 | 1.7GB | Photorealistic |
### SD 3.5

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sd3.5-large:q8` | 28 | 8.5GB | 8.1B params, high quality |
| `sd3.5-large:q4` | 28 | 5.0GB | Same, smaller footprint |
| `sd3.5-large-turbo:q8` | 4 | 8.5GB | Fast 4-step |
| `sd3.5-medium:q8` | 28 | 2.7GB | 2.5B params, efficient |
### Z-Image

| Model | Steps | Size | Good for |
|---|---|---|---|
| `z-image-turbo:q8` | 9 | 6.6GB | Fast 9-step generation |
| `z-image-turbo:q4` | 9 | 3.8GB | Lighter, still good |
| `z-image-turbo:bf16` | 9 | 12.2GB | Full precision |
Bare names default to `:q8` for FLUX/Z-Image or `:fp16` for SD1.5/SDXL. So `mold run flux-schnell "a cat"` just works.
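That default-tag rule can be sketched as (illustrative Python; `resolve_model` is a hypothetical name, not mold's actual resolver):

```python
def resolve_model(name: str) -> str:
    """Apply the documented defaults: bare FLUX/Z-Image names get :q8,
    other bare names get :fp16 (the README states the rule for
    SD1.5/SDXL). Hypothetical sketch of the rule above."""
    if ":" in name:                        # an explicit tag wins
        return name
    if name.startswith(("flux", "z-image")):
        return name + ":q8"
    return name + ":fp16"
```

So `resolve_model("flux-schnell")` gives `flux-schnell:q8`, matching the example above.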
## Server API

When running `mold serve`, you get a REST API:

```shell
# (endpoint paths below are illustrative; see the interactive docs for the real ones)

# Generate an image
curl -X POST http://localhost:7680/generate -d '{"prompt": "a cat"}' -o cat.png

# Check status
curl http://localhost:7680/status

# List models
curl http://localhost:7680/models

# Interactive docs
# visit http://localhost:7680/docs in a browser
```
## Shell completions

```shell
# fish (subcommand name is illustrative; check `mold --help`)
mold completions fish | source
```
## Requirements

- NVIDIA GPU with CUDA or Apple Silicon with Metal
- Models auto-download on first use (~1.7-12.7GB depending on the model)
## How it works

Mold is a single Rust binary built on candle — a pure Rust ML framework. No Python runtime, no libtorch, no ONNX. Just your GPU doing math.

```
mold run "a cat"
│
├─ Server running? → send request over HTTP
│
└─ No server? → load model locally on GPU
   ├─ Encode prompt (T5/CLIP text encoders)
   ├─ Denoise latent (transformer/UNet)
   ├─ Decode pixels (VAE)
   └─ Save PNG
```
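The server-or-local dispatch at the top of the tree can be sketched as (illustrative Python; `choose_backend` is hypothetical, though `MOLD_HOST` comes from the Remote rendering section above):

```python
import os

def choose_backend(env: dict) -> str:
    """If MOLD_HOST names a server, send the request over HTTP;
    otherwise load the model locally on this machine's GPU.
    Hypothetical sketch; mold may also probe for a local server."""
    host = env.get("MOLD_HOST")
    if host:
        return f"remote:{host}"
    return "local"

# Real callers would pass dict(os.environ).
```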