Module vae_encoder

Stable-Diffusion VAE encoder composition.

Mirrors diffusers.AutoencoderKL.encode(image).latent_dist for runwayml/stable-diffusion-v1-5:

image (pixel-space, [B, 3, H, W])
  -> Encoder.conv_in
  -> Encoder.down_blocks[0..N]   (last block has no Downsample2D)
  -> Encoder.mid_block
  -> Encoder.conv_norm_out -> SiLU -> Encoder.conv_out
    (output: [B, 2 * latent_channels, H/8, W/8] — mean/logvar concat)
  -> quant_conv                  ([2*L -> 2*L], 1x1)
  -> DiagonalGaussianDistribution::from_parameters
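
The shape bookkeeping in the pipeline above can be sketched as a small standalone function. This is an illustration only: `encoder_params_shape` is a hypothetical name, not part of this crate, and the choice to reject sizes that are not multiples of 8 is an assumption of the sketch rather than documented behavior.

```rust
/// Shape of the encoder's parameters tensor for an input of shape [B, 3, H, W].
/// Assumes a spatial reduction factor of 8, as in the SD v1.5 VAE;
/// `latent_channels` is 4 for that checkpoint.
/// Hypothetical helper for illustration, not this crate's API.
fn encoder_params_shape(
    batch: usize,
    height: usize,
    width: usize,
    latent_channels: usize,
) -> Option<[usize; 4]> {
    const SPATIAL_FACTOR: usize = 8;
    // Sketch-level choice: reject sizes that do not divide evenly
    // instead of rounding silently.
    if height % SPATIAL_FACTOR != 0 || width % SPATIAL_FACTOR != 0 {
        return None;
    }
    Some([
        batch,
        2 * latent_channels, // mean and logvar concatenated on the channel axis
        height / SPATIAL_FACTOR,
        width / SPATIAL_FACTOR,
    ])
}
```

For a single 512x512 image with 4 latent channels, this gives a `[1, 8, 64, 64]` parameters tensor.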

This is the encoder-side mirror of crate::vae::VaeDecoder. The encode_with_scaling helper computes latent_dist.sample(seed) * scaling_factor, matching AutoencoderKL.encode(x).latent_dist.sample() * vae.config.scaling_factor. The bare Module::forward returns the raw [B, 2*L, h, w] parameters tensor (no split, no sample, no scaling), so callers can apply their own sampling strategy (e.g. .mode() for deterministic encoding).
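
As a rough sketch of what sampling and scaling amount to numerically, the following operates on flat `f32` slices rather than this crate's tensor type. All function names are hypothetical. The noise is passed in by the caller instead of being derived from a seed, which is an assumption made to keep the example free of RNG dependencies. The constant 0.18215 is the `scaling_factor` in the SD v1.5 VAE config, and the logvar clamp to [-30, 20] follows the diffusers implementation.

```rust
/// `vae.config.scaling_factor` for the SD v1.5 VAE.
const SCALING_FACTOR: f32 = 0.18215;

/// Reparameterized sample: mean + exp(0.5 * logvar) * noise, element-wise.
/// `noise` is standard-normal noise supplied by the caller (assumption:
/// the real helper derives it from a seed instead).
fn sample_latent(mean: &[f32], logvar: &[f32], noise: &[f32]) -> Vec<f32> {
    assert_eq!(mean.len(), logvar.len());
    assert_eq!(mean.len(), noise.len());
    mean.iter()
        .zip(logvar)
        .zip(noise)
        .map(|((&m, &lv), &eps)| {
            // diffusers clamps logvar to [-30, 20] before exponentiating.
            let std = (0.5 * lv.clamp(-30.0, 20.0)).exp();
            m + std * eps
        })
        .collect()
}

/// Deterministic alternative: the mode of a Gaussian is its mean.
fn mode_latent(mean: &[f32]) -> Vec<f32> {
    mean.to_vec()
}

/// Multiplies a latent by the scaling factor, as done before diffusion.
fn scale_latent(latent: &[f32]) -> Vec<f32> {
    latent.iter().map(|&z| z * SCALING_FACTOR).collect()
}
```

A caller starting from the raw `Module::forward` output would split it into mean and logvar first, then pick either `sample_latent` or `mode_latent` before scaling.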

Structs

DiagonalGaussianDistribution: Diagonal Gaussian over latent space — the same parameterization diffusers.models.autoencoders.vae.DiagonalGaussianDistribution uses. Holds mean and logvar tensors (both [B, L, h, w]) split from the encoder’s concatenated parameters output.
Encoder: The bare Encoder half — matches diffusers.models.autoencoders.vae.Encoder.
VaeEncoder: AutoencoderKL-style VAE encoder = Encoder + quant_conv.
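
The mean/logvar split that DiagonalGaussianDistribution performs can be illustrated on a flat buffer. This sketch assumes a contiguous row-major `[B, 2*L, h, w]` layout and uses a hypothetical `split_params` function; the crate's own implementation works on its tensor type and may differ.

```rust
/// Splits a row-major [B, 2*L, h, w] buffer into (mean, logvar),
/// each laid out as [B, L, h, w].
/// Hypothetical helper for illustration, not this crate's API.
fn split_params(
    params: &[f32],
    batch: usize,
    latent_channels: usize,
    h: usize,
    w: usize,
) -> (Vec<f32>, Vec<f32>) {
    let half = latent_channels * h * w; // elements in one [L, h, w] block
    assert_eq!(params.len(), batch * 2 * half);
    if half == 0 {
        // Nothing to split; also avoids a zero chunk size below.
        return (Vec::new(), Vec::new());
    }
    let mut mean = Vec::with_capacity(batch * half);
    let mut logvar = Vec::with_capacity(batch * half);
    // Within each batch item, the first L channels hold the mean
    // and the next L channels hold the logvar.
    for item in params.chunks_exact(2 * half) {
        let (m, lv) = item.split_at(half);
        mean.extend_from_slice(m);
        logvar.extend_from_slice(lv);
    }
    (mean, logvar)
}
```

The split has to be done per batch item: slicing the whole buffer in half would only be correct when `B == 1`.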

Type Aliases

VaeEncoderConfig: Type alias — the SD VAE encoder and decoder share their config shape (mirrors diffusers.AutoencoderKL.config, which spans both halves).