Stable Diffusion
Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.
- 💻 Original Repository
- 🤗 Hugging Face
- The default scheduler for the v1.5, v2.1 and XL 1.0 versions is the Denoising Diffusion Implicit Model (DDIM) scheduler. The original paper and some code can be found in the associated repo. The default scheduler for the XL Turbo version is the Euler Ancestral scheduler.
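To make the role of the DDIM scheduler more concrete, here is a minimal sketch of a single deterministic DDIM update (the eta = 0 case) in plain Rust. It is an illustration of the update rule from the DDIM paper and not the code of the `ddim` module: the function name `ddim_step`, the use of flat `f64` slices instead of tensors, and the parameter names `alpha_bar_t` and `alpha_bar_prev` are all assumptions made for this example.

```rust
/// One deterministic DDIM update (eta = 0), applied element-wise.
///
/// Hypothetical helper for illustration only, not part of candle.
/// `alpha_bar_t` and `alpha_bar_prev` are the cumulative products of the
/// noise schedule's alphas at the current and the previous (less noisy)
/// timestep. `eps` is the noise the model predicted for `x_t`.
pub fn ddim_step(x_t: &[f64], eps: &[f64], alpha_bar_t: f64, alpha_bar_prev: f64) -> Vec<f64> {
    assert_eq!(x_t.len(), eps.len(), "latent and noise must have the same length");
    x_t.iter()
        .zip(eps.iter())
        .map(|(&x, &e)| {
            // Estimate the clean sample x_0 implied by the noise prediction.
            let x0_pred = (x - (1.0 - alpha_bar_t).sqrt() * e) / alpha_bar_t.sqrt();
            // Move that estimate to the previous timestep's noise level,
            // reusing the same predicted noise (no fresh randomness).
            alpha_bar_prev.sqrt() * x0_pred + (1.0 - alpha_bar_prev).sqrt() * e
        })
        .collect()
}
```

With `alpha_bar_prev` set to 1.0 the step returns the predicted clean sample directly, which is a convenient sanity check when experimenting with the formula.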
§Example
“A rusty robot holding a fire torch in its hand.” Generated by Stable Diffusion XL using Rust and candle.
# example running with cuda
# see the candle-examples/examples/stable-diffusion for all options
cargo run --example stable-diffusion --release --features=cuda,cudnn \
-- --prompt "a cosmonaut on a horse (hd, realistic, high-def)"
# with sd-turbo
cargo run --example stable-diffusion --release --features=cuda,cudnn \
-- --prompt "a cosmonaut on a horse (hd, realistic, high-def)" \
--sd-version turbo
# with flash attention.
# feature flag: `--features flash-attn`
# cli flag: `--use-flash-attn`.
# flash-attention-v2 is only compatible with Ampere, Ada,
# or Hopper GPUs (e.g., A100/H100, RTX 3090/4090).
cargo run --example stable-diffusion --release --features=cuda,cudnn,flash-attn \
-- --prompt "a cosmonaut on a horse (hd, realistic, high-def)" \
--use-flash-attn
§Modules
- attention
- Attention Based Building Blocks
- clip
- Contrastive Language-Image Pre-Training
- ddim
- Denoising Diffusion Implicit Models
- ddpm
- embeddings
- euler_ancestral_discrete
- Ancestral sampling with Euler method steps.
- resnet
- ResNet Building Blocks
- schedulers
- Diffusion pipelines and models
- unet_2d
- 2D UNet Denoising Models
- unet_2d_blocks
- 2D UNet Building Blocks
- uni_pc
- UniPC Scheduler
- utils
- vae
- Variational Auto-Encoder (VAE) Models.
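As a rough illustration of what "ancestral sampling with Euler method steps" means for the `euler_ancestral_discrete` entry above, the sketch below shows one such step in plain Rust. It follows the formulation commonly used in k-diffusion-style samplers and is not taken from candle's implementation: the function names `ancestral_sigmas` and `euler_ancestral_step`, the flat `f64` slices, and passing the random `noise` in as an argument are all assumptions made for this example. It assumes `sigma_from > sigma_to >= 0`.

```rust
/// Splits the target noise level into a deterministic part (`sigma_down`)
/// and a freshly injected part (`sigma_up`), chosen so that
/// sigma_down^2 + sigma_up^2 == sigma_to^2.
/// Hypothetical helper for illustration only, not part of candle.
pub fn ancestral_sigmas(sigma_from: f64, sigma_to: f64) -> (f64, f64) {
    let from2 = sigma_from * sigma_from;
    let to2 = sigma_to * sigma_to;
    let sigma_up = (to2 * (from2 - to2) / from2).max(0.0).sqrt();
    let sigma_down = (to2 - sigma_up * sigma_up).max(0.0).sqrt();
    (sigma_down, sigma_up)
}

/// One Euler ancestral step from noise level `sigma_from` to `sigma_to`.
/// `denoised` is the model's estimate of the clean sample for `x`, and
/// `noise` holds standard normal samples supplied by the caller.
pub fn euler_ancestral_step(
    x: &[f64],
    denoised: &[f64],
    noise: &[f64],
    sigma_from: f64,
    sigma_to: f64,
) -> Vec<f64> {
    assert!(x.len() == denoised.len() && x.len() == noise.len());
    let (sigma_down, sigma_up) = ancestral_sigmas(sigma_from, sigma_to);
    x.iter()
        .zip(denoised.iter())
        .zip(noise.iter())
        .map(|((&x, &den), &n)| {
            // Derivative estimate used by the Euler method.
            let d = (x - den) / sigma_from;
            // Deterministic Euler step down to sigma_down ...
            let x_det = x + d * (sigma_down - sigma_from);
            // ... followed by re-injecting noise up to sigma_to.
            x_det + n * sigma_up
        })
        .collect()
}
```

Taking the noise as a parameter keeps the sketch deterministic and easy to test. A real sampler would draw it from a random number generator at every step.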