# mold
Generate images from text on your own GPU. No cloud, no Python, no fuss.

```sh
mold run "a cat on a skateboard"
```

That's it. Mold auto-downloads the model on first run and saves the image to your current directory.
## Install

### Nix (recommended)

```sh
# Run directly — no install needed
nix run github:<owner>/mold -- run "a cat"   # replace <owner> with the repo owner

# Or add to your system
```

### From source

```sh
cargo install --path .   # from a checkout of the repo
```
## Usage

```sh
# Generate an image
mold run "a cat"

# Pick a model
mold run flux-schnell "a cat"

# Flag names below are illustrative; see `mold run --help`

# Reproducible results (the logo above was generated this way)
mold run --seed 42 "a cat"

# Custom size and steps
mold run --width 768 --height 768 --steps 30 "a cat"
```
## Piping

Mold is pipe-friendly in both directions. When stdout is not a terminal, raw image bytes go to stdout and status/progress goes to stderr.

```sh
# (viewer and ImageMagick commands are examples; any stdin-capable tool works)

# Pipe output to an image viewer
mold run "a cat" | display -

# Pipe prompt from stdin
echo "a cat in a hat" | mold run

# Chain with other tools
mold run "a cat" | convert - -resize 512x512 png:- | display -

# Pipe in and out
echo "a cat" | mold run | display -
```
## Output metadata

PNG output embeds generation metadata by default, including prompt, model, seed, size, steps, and a JSON `mold:parameters` chunk for downstream tools.

```sh
# Disable metadata for one run
mold run --no-metadata "a cat"   # flag name illustrative; see `mold run --help`

# Disable metadata globally via environment
MOLD_EMBED_METADATA=0
```

In `~/.mold/config.toml`:

```toml
embed_metadata = false   # key name inferred from MOLD_EMBED_METADATA
```
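Downstream tools can read that chunk with any PNG text-chunk reader. A minimal sketch using Pillow, assuming `mold:parameters` is stored as a standard PNG text chunk (the helper name is ours, not part of mold):

```python
import json
from typing import Optional

from PIL import Image

def read_mold_parameters(path: str) -> Optional[dict]:
    """Parse the mold:parameters JSON chunk from a PNG, if present."""
    img = Image.open(path)
    raw = img.text.get("mold:parameters")  # tEXt/iTXt entries are exposed via .text
    return json.loads(raw) if raw is not None else None
```

For example, `read_mold_parameters("out.png")` would return a dict with fields like the seed and step count, or `None` for a PNG without metadata.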
## Image-to-image

Transform existing images with a text prompt:

```sh
# Flag names illustrative; see `mold run --help`

# Stylize a photo
mold run --init photo.jpg "oil painting of a cat"

# Control how much changes (0.0 = no change, 1.0 = full denoise)
mold run --init photo.jpg --strength 0.6 "oil painting of a cat"

# Pipe an image through
cat photo.jpg | mold run --init - "oil painting" | display -
```
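The 0.0-to-1.0 knob above is the usual img2img *strength*: it decides how far the input image is re-noised, which in step terms means skipping the early, high-noise part of the denoising schedule. A sketch of that common convention (not mold's verified internals):

```python
def img2img_schedule(num_steps: int, strength: float) -> range:
    """Denoising steps actually run for a given strength.

    strength=0.0 -> no steps (image unchanged)
    strength=1.0 -> all steps (full denoise, input mostly ignored)
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    start = num_steps - int(num_steps * strength)  # skip the early steps
    return range(start, num_steps)
```

So at 25 steps, `--strength 0.6` would run only the final 15 denoising steps over the re-noised input.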
## Inpainting

Selectively edit parts of an image with a mask (white = repaint, black = keep):

```sh
mold run --init photo.jpg --mask mask.png "a red hat"   # flag names illustrative; see `mold run --help`
```
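A mask is just a grayscale image, so any editor or script can produce one. For instance, a Pillow sketch (our helper, not part of mold) that marks one rectangle for repainting:

```python
from PIL import Image, ImageDraw

def make_rect_mask(size: tuple[int, int], box: tuple[int, int, int, int]) -> Image.Image:
    """Black canvas (keep) with one white rectangle (repaint)."""
    mask = Image.new("L", size, 0)                 # "L" = 8-bit grayscale, 0 = black = keep
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # 255 = white = repaint
    return mask

# make_rect_mask((512, 512), (100, 100, 300, 300)).save("mask.png")
```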
## ControlNet (SD1.5)

Guide generation with a control image (edge map, depth map, etc.):

```sh
mold run --control canny-edges.png "a cat"   # flag name illustrative; see `mold run --help`
```
## Scheduler selection

Choose the noise scheduler for SD1.5/SDXL models:

```sh
mold run --scheduler euler-a "a cat"   # flag and scheduler names illustrative; see `mold run --help`
```
## Batch generation

Generate multiple images with incrementing seeds:

```sh
mold run --seed 42 --batch 4 "a cat"   # flag names illustrative; see `mold run --help`
```
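"Incrementing seeds" means image *i* of the batch uses `base_seed + i`, so any single image stays individually reproducible. As a sketch (helper name hypothetical):

```python
def batch_seeds(base_seed: int, count: int) -> list[int]:
    """Seeds for a batch: base, base+1, ... so any one image can be rerun alone."""
    return [base_seed + i for i in range(count)]

# batch_seeds(42, 4) -> [42, 43, 44, 45]
```

To regenerate only the third image of that batch, you would run a single generation with seed 44.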
## Manage models

```sh
# Subcommand names illustrative; see `mold --help`
mold list            # show downloaded models
mold pull flux-dev   # download a model ahead of time
mold rm flux-dev     # free disk space
```
## Remote rendering

Run mold on a beefy GPU server, generate from anywhere:

```sh
# On your GPU server
mold serve

# From your laptop
MOLD_HOST=http://gpu-server:7680 mold run "a cat"
```
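On the client side this presumably reduces to an environment-variable lookup with a local default; a sketch of that convention (port 7680 taken from the example above, function name hypothetical):

```python
import os

def resolve_host(env=os.environ) -> str:
    """Prefer MOLD_HOST; otherwise assume a local mold server."""
    return env.get("MOLD_HOST", "http://localhost:7680")
```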
## Models
### FLUX (best quality)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `flux-schnell:q8` | 4 | 12GB | Fast, general purpose |
| `flux-schnell:q4` | 4 | 7.5GB | Same but lighter |
| `flux-dev:q8` | 25 | 12GB | Full quality |
| `flux-dev:q4` | 25 | 7GB | Full quality, less VRAM |
| `flux-krea:q8` | 25 | 12.7GB | Aesthetic photography |
| `flux-krea:fp8` | 25 | 11.9GB | Aesthetic photography, FP8 |
| `jibmix-flux:q4` | 25 | 6.9GB | Photorealistic fine-tune |
| `jibmix-flux:q5` | 25 | 8.4GB | Photorealistic fine-tune |
| `ultrareal-v4:q8` | 25 | 12.6GB | Photorealistic (latest) |
| `ultrareal-v4:q4` | 25 | 6.7GB | Photorealistic, lighter |
| `ultrareal-v3:q8` | 25 | 12.7GB | Photorealistic |
| `ultrareal-v2:bf16` | 25 | 23.8GB | Photorealistic, full precision |
| `iniverse-mix:fp8` | 25 | 11.9GB | Realistic SFW/NSFW mix |
### SDXL (fast + flexible)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sdxl-turbo:fp16` | 4 | 5.1GB | Ultra-fast, 1-4 steps |
| `dreamshaper-xl:fp16` | 8 | 5.1GB | Fantasy, concept art |
| `juggernaut-xl:fp16` | 30 | 5.1GB | Photorealism, cinematic |
| `realvis-xl:fp16` | 25 | 5.1GB | Photorealism, versatile |
| `playground-v2.5:fp16` | 25 | 5.1GB | Artistic, aesthetic |
| `sdxl-base:fp16` | 25 | 5.1GB | Official base model |
| `pony-v6:fp16` | 25 | 5.1GB | Anime, art, stylized |
| `cyberrealistic-pony:fp16` | 25 | 5.1GB | Photorealistic Pony fine-tune |
### SD 1.5 (lightweight)

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sd15:fp16` | 25 | 1.7GB | Base model, huge ecosystem |
| `dreamshaper-v8:fp16` | 25 | 1.7GB | Best all-around SD1.5 |
| `realistic-vision-v5:fp16` | 25 | 1.7GB | Photorealistic |
### SD 3.5

| Model | Steps | Size | Good for |
|---|---|---|---|
| `sd3.5-large:q8` | 28 | 8.5GB | 8.1B params, high quality |
| `sd3.5-large:q4` | 28 | 5.0GB | Same, smaller footprint |
| `sd3.5-large-turbo:q8` | 4 | 8.5GB | Fast 4-step |
| `sd3.5-medium:q8` | 28 | 2.7GB | 2.5B params, efficient |
### Z-Image

| Model | Steps | Size | Good for |
|---|---|---|---|
| `z-image-turbo:q8` | 9 | 6.6GB | Fast 9-step generation |
| `z-image-turbo:q4` | 9 | 3.8GB | Lighter, still good |
| `z-image-turbo:bf16` | 9 | 12.2GB | Full precision |
### Wuerstchen v2 / Flux.2 / Qwen-Image (alpha, improving on CUDA/MPS)

**Warning:** These model families are still in active alpha development. Results vary by backend and may be better on CUDA than Apple Silicon (MPS/Metal). Use FLUX, SDXL, SD1.5, SD3.5, or Z-Image for production use.
| Model | Steps | Size | Notes |
|---|---|---|---|
| `wuerstchen-v2:fp16` | 30 | 5.6GB | Alpha 3-stage cascade, backend-dependent output quality |
| `flux2-klein:q8` | 4 | 4.3GB | Alpha Flux.2 Klein 4B Q8, actively being improved |
| `flux2-klein:q4` | 4 | 2.6GB | Alpha Flux.2 Klein 4B Q4, smaller footprint |
| `flux2-klein:bf16` | 4 | 7.8GB | Alpha Flux.2 Klein 4B BF16, backend-dependent output quality |
| `qwen-image:q8` | 28 | 21.8GB | Alpha Qwen-Image-2512, actively being improved |
| `qwen-image:q4` | 28 | 12.3GB | Alpha Qwen-Image, smallest footprint |
Bare names resolve by trying `:q8` → `:fp16` → `:bf16` → `:fp8` in order, so `mold run flux-schnell "a cat"` just works.
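The fallback above can be sketched as a first-match search over the quantization tags (function and inventory are hypothetical; only the tag order comes from the README):

```python
QUANT_ORDER = [":q8", ":fp16", ":bf16", ":fp8"]

def resolve_model(name: str, available: set[str]) -> str:
    """Resolve a bare model name to the first available quantization."""
    if ":" in name:                  # already tagged, e.g. "flux-dev:q4"
        return name
    for tag in QUANT_ORDER:
        candidate = name + tag
        if candidate in available:
            return candidate
    raise KeyError(f"no quantization of {name!r} found")
```

With this order, `flux-schnell` picks `flux-schnell:q8` when both `q8` and `q4` are present, while `sdxl-turbo` (which only ships `fp16`) still resolves.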
## Server API

When running `mold serve`, you get a REST API. The routes below are illustrative; the interactive docs are authoritative:

```sh
# Generate an image
curl -X POST http://localhost:7680/generate -d '{"prompt": "a cat"}'

# Check status
curl http://localhost:7680/status

# List models
curl http://localhost:7680/models

# Interactive docs
open http://localhost:7680/docs
```
## Shell completions

```sh
mold completions fish | source   # fish (subcommand name illustrative)
```
## Requirements
- NVIDIA GPU with CUDA or Apple Silicon with Metal
- Models auto-download on first use (~2-30GB depending on model)
## AI Agent Skill
Mold ships with an AI agent skill that teaches AI assistants how to use the CLI for image generation. This lets agents generate images on your behalf using natural language.
### Claude Code

The skill is automatically available when working in the mold repo. To use it in other projects, copy the skill directory:

```sh
# Copy to your project (project-scoped); source path depends on your checkout
cp -r <mold-repo>/skills/mold .claude/skills/

# Or install globally (available in all projects)
cp -r <mold-repo>/skills/mold ~/.claude/skills/
```

Then use it via `/mold a cat on a skateboard` or let Claude invoke it automatically when you ask to generate images.
### OpenClaw

Copy the skill to your OpenClaw workspace, or install it directly from the repo.
The skill format is compatible with both Claude Code and OpenClaw (both use SKILL.md with YAML frontmatter).
## How it works
Mold is a single Rust binary built on candle — a pure Rust ML framework. No Python runtime, no libtorch, no ONNX. Just your GPU doing math.
```
mold run "a cat"
 │
 ├─ Server running? → send request over HTTP
 │
 └─ No server? → load model locally on GPU
     ├─ Encode prompt (T5/CLIP text encoders)
     ├─ Denoise latent (transformer/UNet)
     ├─ Decode pixels (VAE)
     └─ Save PNG
```
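The server-or-local decision above is essentially a reachability probe followed by a fallback. A Python sketch of the idea (mold itself is Rust; these names are hypothetical):

```python
import socket
from urllib.parse import urlparse

def server_reachable(host_url: str, timeout: float = 0.5) -> bool:
    """True if something is listening at the server address."""
    parsed = urlparse(host_url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port or 80), timeout=timeout):
            return True
    except OSError:
        return False

def dispatch(host_url: str) -> str:
    """Decide where generation runs, mirroring the diagram above."""
    return "http" if server_reachable(host_url) else "local"
```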