Stable-Diffusion VAE encoder composition.
Mirrors diffusers.AutoencoderKL.encode(image).latent_dist for
runwayml/stable-diffusion-v1-5:
image (pixel-space, [B, 3, H, W])
-> Encoder.conv_in
-> Encoder.down_blocks[0..N] (last block has no Downsample2D)
-> Encoder.mid_block
-> Encoder.conv_norm_out -> SiLU -> Encoder.conv_out
(output: [B, 2 * latent_channels, H/8, W/8] — mean/logvar concat)
-> quant_conv ([2*L -> 2*L], 1x1)
-> DiagonalGaussianDistribution::from_parameters

The encoder-side mirror of crate::vae::VaeDecoder. The
encode_with_scaling helper composes
latent_dist.sample(seed) * scaling_factor, matching
AutoencoderKL.encode(x).latent_dist.sample() * vae.config.scaling_factor.
The bare Module::forward returns the raw [B, 2*L, h, w] parameters
tensor (no split, no sample, no scaling) so callers can swap in their
own sampling strategy (e.g. .mode() for deterministic encoding).
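
The split, sample, and scale steps described above can be sketched without any tensor library. This is only an illustration under stated assumptions: `DiagGaussian`, `SplitMix64`, and `sample_scaled` are hypothetical names, the buffers are flat row-major `Vec<f32>` rather than the crate's tensors, and the seeded generator is a stand-in for whatever RNG the crate actually uses. The logvar clamp to [-30, 20] and the scaling factor 0.18215 follow diffusers and the SD v1.5 VAE config.

```rust
/// Hypothetical stand-in for the crate's distribution type, holding flat
/// row-major buffers of shape [B, L, h, w] instead of real tensors.
pub struct DiagGaussian {
    pub mean: Vec<f32>,
    pub logvar: Vec<f32>,
}

/// Scaling factor from the SD v1.5 VAE config.
pub const SD15_SCALING_FACTOR: f32 = 0.18215;

/// Small seeded generator used only for this sketch (an assumption, not the
/// crate's RNG).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in (0, 1], built from 24 random bits.
    fn uniform(&mut self) -> f32 {
        ((self.next_u64() >> 40) as f32 + 1.0) / 16_777_216.0
    }

    /// Standard normal value via the Box-Muller transform.
    fn normal(&mut self) -> f32 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (u1.ln() * -2.0).sqrt() * (u2 * 2.0 * std::f32::consts::PI).cos()
    }
}

impl DiagGaussian {
    /// Split a [B, 2*L, h, w] parameters buffer along the channel axis:
    /// the first L channels are the mean, the last L are the logvar.
    pub fn from_parameters(params: &[f32], b: usize, l: usize, hw: usize) -> Self {
        assert_eq!(params.len(), b * 2 * l * hw);
        let mut mean = Vec::with_capacity(b * l * hw);
        let mut logvar = Vec::with_capacity(b * l * hw);
        for batch in params.chunks_exact(2 * l * hw) {
            let (m, lv) = batch.split_at(l * hw);
            mean.extend_from_slice(m);
            // diffusers clamps logvar to [-30, 20] before exponentiating.
            logvar.extend(lv.iter().map(|v| v.clamp(-30.0, 20.0)));
        }
        Self { mean, logvar }
    }

    /// Deterministic choice: the mode of a Gaussian is its mean.
    pub fn mode(&self) -> Vec<f32> {
        self.mean.clone()
    }

    /// Reparameterized sample: mean + exp(0.5 * logvar) * eps.
    pub fn sample(&self, seed: u64) -> Vec<f32> {
        let mut rng = SplitMix64(seed);
        self.mean
            .iter()
            .zip(&self.logvar)
            .map(|(&m, &lv)| m + (lv * 0.5).exp() * rng.normal())
            .collect()
    }
}

/// Sketch of the composition attributed to encode_with_scaling:
/// sample(seed) * scaling_factor.
pub fn sample_scaled(dist: &DiagGaussian, seed: u64, scaling_factor: f32) -> Vec<f32> {
    dist.sample(seed).into_iter().map(|z| z * scaling_factor).collect()
}
```

A caller that wants deterministic latents would, in this sketch, multiply `dist.mode()` by the scaling factor instead of calling `sample_scaled`.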
Structs
- DiagonalGaussianDistribution
  Diagonal Gaussian over latent space, using the same parameterization as diffusers.models.autoencoders.vae.DiagonalGaussianDistribution. Holds mean and logvar tensors (both [B, L, h, w]) split from the encoder’s concatenated parameters output.
- Encoder
  The bare Encoder half; matches diffusers.models.autoencoders.vae.Encoder.
- VaeEncoder
  AutoencoderKL-style VAE encoder = Encoder + quant_conv.
Type Aliases
- VaeEncoderConfig
  Type alias: the SD VAE encoder and decoder share their config shape (mirrors diffusers.AutoencoderKL.config, which spans both halves).
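
As a rough illustration of one config type serving both halves, the sketch below defines a single struct and aliases it under encoder and decoder names. Everything here is hypothetical: `SketchVaeConfig`, its aliases, and its methods are invented for the example and are not the crate's actual types. The field names and the SD v1.5 values follow the diffusers `AutoencoderKL` config. It also shows where the H/8 in the encoder output shape comes from: four down blocks, of which the last has no downsampler.

```rust
/// Hypothetical config shared by both VAE halves. Field names follow the
/// diffusers AutoencoderKL config; the crate's real struct may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchVaeConfig {
    pub in_channels: usize,
    pub latent_channels: usize,
    pub block_out_channels: Vec<usize>,
    pub scaling_factor: f32,
}

/// One config type under two names, so either half can be built from it.
pub type SketchVaeEncoderConfig = SketchVaeConfig;
pub type SketchVaeDecoderConfig = SketchVaeConfig;

impl SketchVaeConfig {
    /// Values from the SD v1.5 VAE config.
    pub fn sd_v1_5() -> Self {
        Self {
            in_channels: 3,
            latent_channels: 4,
            block_out_channels: vec![128, 256, 512, 512],
            scaling_factor: 0.18215,
        }
    }

    /// Every down block except the last halves H and W.
    pub fn downsample_factor(&self) -> usize {
        1 << self.block_out_channels.len().saturating_sub(1)
    }

    /// Shape of the raw parameters tensor: [B, 2 * latent_channels, H/f, W/f].
    pub fn encoder_output_shape(&self, b: usize, h: usize, w: usize) -> [usize; 4] {
        let f = self.downsample_factor();
        [b, 2 * self.latent_channels, h / f, w / f]
    }
}
```

With these values, a 512x512 input yields a parameters tensor of shape [1, 8, 64, 64], which splits into a mean and a logvar of 4 channels each.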