Module vae_encoder


Stable-Diffusion VAE encoder composition.

Mirrors diffusers.AutoencoderKL.encode(image).latent_dist for runwayml/stable-diffusion-v1-5:

image (pixel-space, [B, 3, H, W])
  -> Encoder.conv_in
  -> Encoder.down_blocks[0..N]   (last block has no Downsample2D)
  -> Encoder.mid_block
  -> Encoder.conv_norm_out -> SiLU -> Encoder.conv_out
    (output: [B, 2 * latent_channels, H/8, W/8] — mean/logvar concat)
  -> quant_conv                  ([2*L -> 2*L], 1x1)
  -> DiagonalGaussianDistribution::from_parameters
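
The shape bookkeeping in the pipeline above can be sketched as a small helper. This is only an illustration, not part of the crate: the function name `encoder_output_shape` and its parameters are hypothetical, and it assumes each down block except the last halves H and W exactly once (so four down blocks give the H/8, W/8 factor shown above) and that the input sizes divide evenly.

```rust
/// Hypothetical helper: shape of the `[B, 2 * latent_channels, H/f, W/f]`
/// parameters tensor for an NCHW image of shape `[B, 3, H, W]`.
///
/// Assumption: every down block except the last downsamples by 2, so
/// `num_down_blocks` blocks give a spatial factor of 2^(num_down_blocks - 1).
pub fn encoder_output_shape(
    input: [usize; 4],
    latent_channels: usize,
    num_down_blocks: usize,
) -> Option<[usize; 4]> {
    let [b, c, h, w] = input;
    // The encoder expects 3-channel pixel-space input.
    if c != 3 || num_down_blocks == 0 {
        return None;
    }
    let factor = 1usize << (num_down_blocks - 1);
    // This sketch only handles sizes that divide evenly by the factor.
    if h % factor != 0 || w % factor != 0 {
        return None;
    }
    // Mean and logvar are concatenated on the channel axis, hence 2 * L.
    Some([b, 2 * latent_channels, h / factor, w / factor])
}
```

With `latent_channels = 4` and four down blocks, a `[1, 3, 512, 512]` image maps to `[1, 8, 64, 64]`. The 1x1 `quant_conv` leaves that shape unchanged.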

This is the encoder-side mirror of crate::vae::VaeDecoder. The encode_with_scaling helper computes latent_dist.sample(seed) * scaling_factor, matching AutoencoderKL.encode(x).latent_dist.sample() * vae.config.scaling_factor. The bare Module::forward returns the raw [B, 2*L, h, w] parameters tensor (no split, no sample, no scaling) so callers can swap in their own sampling strategy (e.g. .mode() for a deterministic latent).
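
The arithmetic behind sample-then-scale can be sketched on flat `f32` slices. This is a minimal illustration under stated assumptions, not the crate's implementation: `sample_scaled` and `mode_scaled` are hypothetical names, the noise `eps` is supplied by the caller (how randn_with_seed derives it from a seed is not shown here), and the clamp of logvar to [-30, 20] follows what diffusers does and is assumed rather than confirmed for this crate.

```rust
/// Hypothetical sketch of the reparameterized sample followed by scaling:
/// `(mean + exp(0.5 * logvar) * eps) * scaling_factor`, element-wise.
pub fn sample_scaled(
    mean: &[f32],
    logvar: &[f32],
    eps: &[f32],
    scaling_factor: f32,
) -> Vec<f32> {
    assert_eq!(mean.len(), logvar.len(), "mean/logvar length mismatch");
    assert_eq!(mean.len(), eps.len(), "mean/eps length mismatch");
    mean.iter()
        .zip(logvar)
        .zip(eps)
        .map(|((&m, &lv), &e)| {
            // Assumption: clamp logvar as diffusers does before exponentiating.
            let std = (0.5 * lv.clamp(-30.0, 20.0)).exp();
            (m + std * e) * scaling_factor
        })
        .collect()
}

/// Deterministic alternative: the mode of a Gaussian is its mean,
/// so no noise is involved.
pub fn mode_scaled(mean: &[f32], scaling_factor: f32) -> Vec<f32> {
    mean.iter().map(|&m| m * scaling_factor).collect()
}
```

For runwayml/stable-diffusion-v1-5 the configured `scaling_factor` is 0.18215. Passing all-zero `eps` to `sample_scaled` gives the same result as `mode_scaled`.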

§REQ status (per .design/ferrotorch-diffusion/vae_encoder.md)

| REQ | Status | Evidence |
|---|---|---|
| REQ-1 | SHIPPED | Encoder<T> at vae_encoder.rs:51..79 and Encoder::new at vae_encoder.rs:81..153; consumer: VaeEncoder::new at vae_encoder.rs:307 builds it; itself consumed by safetensors_loader.rs:425 load_vae_encoder |
| REQ-2 | SHIPPED | VaeEncoder<T> at vae_encoder.rs:288..297 and VaeEncoder::new at vae_encoder.rs:299..316; consumer: safetensors_loader.rs:425 load_vae_encoder; gpu/vae_encoder.rs:317 GpuVaeEncoder::from_module consumes its state_dict() |
| REQ-3 | SHIPPED | VaeEncoder::encode at vae_encoder.rs:325..328 and DiagonalGaussianDistribution::from_parameters at vae_encoder.rs:471..501; consumer: vae_encoder.rs:349 encode_with_scaling invokes it |
| REQ-4 | SHIPPED | DiagonalGaussianDistribution::sample_with_seed at vae_encoder.rs:527..539, mode at vae_encoder.rs:506..508, randn_with_seed at vae_encoder.rs:548..587; consumer: vae_encoder.rs:350 encode_with_scaling calls dist.sample_with_seed(seed) |
| REQ-5 | SHIPPED | encode_with_scaling at vae_encoder.rs:348..361; consumer: re-exported via lib.rs:148 pub use vae_encoder::VaeEncoder (boundary method IS the public API per goal.md S5 grandfathering) |
| REQ-6 | SHIPPED | Module<T>::forward at vae_encoder.rs:369..382; consumer: vae_encoder.rs:326 encode calls self.forward(image)? to produce the [B, 2L, h, w] parameters |
| REQ-7 | SHIPPED | Module<T>::load_state_dict at vae_encoder.rs:421..444; consumer: safetensors_loader.rs:394 VaeEncoder::load_hf_state_dict calls self.load_state_dict(&remapped, strict) after stripping the vae. prefix |
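
The prefix stripping mentioned under REQ-7 can be sketched as a key remap over a generic map. This is an illustration only: `strip_prefix_keys` is a hypothetical name, the real loader's value type and remapping rules are not shown in this page, and dropping keys that lack the prefix is an assumption of the sketch.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: keep only entries whose key starts with `prefix`
/// and re-key them with the prefix removed.
///
/// Assumption: keys without the prefix (e.g. weights that belong to other
/// models in the same checkpoint) are dropped rather than reported.
pub fn strip_prefix_keys<V>(state: HashMap<String, V>, prefix: &str) -> HashMap<String, V> {
    state
        .into_iter()
        .filter_map(|(key, value)| {
            // `strip_prefix` returns None when the key does not start with `prefix`.
            key.strip_prefix(prefix)
                .map(|rest| (rest.to_string(), value))
        })
        .collect()
}
```

The remapped map is what a call such as `load_state_dict(&remapped, strict)` would then receive.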

Structs§

DiagonalGaussianDistribution
Diagonal Gaussian over latent space — the same parameterization diffusers.models.autoencoders.vae.DiagonalGaussianDistribution uses. Holds mean and logvar tensors (both [B, L, h, w]) split from the encoder’s concatenated parameters output.
Encoder
The bare Encoder half — matches diffusers.models.autoencoders.vae.Encoder.
VaeEncoder
AutoencoderKL-style VAE encoder = Encoder + quant_conv.
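
The channel-axis split that DiagonalGaussianDistribution performs on the concatenated parameters can be sketched on a flat, contiguous NCHW buffer. This is an illustration under assumptions, not the crate's code: `split_parameters` is a hypothetical name, and the buffer layout (row-major NCHW, mean channels first, then logvar, as in diffusers) is assumed.

```rust
/// Hypothetical sketch: split a contiguous NCHW buffer of shape
/// `[b, 2 * latent_channels, h, w]` into `mean` and `logvar`,
/// each of shape `[b, latent_channels, h, w]`.
pub fn split_parameters(
    params: &[f32],
    b: usize,
    latent_channels: usize,
    h: usize,
    w: usize,
) -> Option<(Vec<f32>, Vec<f32>)> {
    // Elements in one half (mean or logvar) of a single batch item.
    let per_half = latent_channels * h * w;
    if per_half == 0 || params.len() != b * 2 * per_half {
        return None;
    }
    let mut mean = Vec::with_capacity(b * per_half);
    let mut logvar = Vec::with_capacity(b * per_half);
    // Each batch item stores its first L channels (mean), then the next L (logvar).
    for item in params.chunks_exact(2 * per_half) {
        let (m, lv) = item.split_at(per_half);
        mean.extend_from_slice(m);
        logvar.extend_from_slice(lv);
    }
    Some((mean, logvar))
}
```

The split has to be done per batch item: in NCHW layout the mean and logvar halves interleave across the batch axis, so slicing the whole buffer in half would be wrong for `b > 1`.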

Type Aliases§

VaeEncoderConfig
Type alias — the SD VAE encoder and decoder share their config shape (mirrors diffusers.AutoencoderKL.config, which spans both halves).