Module vae_encoder

Stable-Diffusion VAE encoder composition.

Mirrors diffusers.AutoencoderKL.encode(image).latent_dist for runwayml/stable-diffusion-v1-5:

image (pixel-space, [B, 3, H, W])
  -> Encoder.conv_in
  -> Encoder.down_blocks[0..N]   (last block has no Downsample2D)
  -> Encoder.mid_block
  -> Encoder.conv_norm_out -> SiLU -> Encoder.conv_out
    (output: [B, 2 * latent_channels, H/8, W/8] — mean/logvar concat)
  -> quant_conv                  ([2*L -> 2*L], 1x1)
  -> DiagonalGaussianDistribution::from_parameters
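
The shape bookkeeping in the pipeline above can be sketched as a small standalone function. This is an illustration only: `encoder_output_shape` is a hypothetical helper, not part of this crate, and it assumes that every down block except the last halves H and W exactly (so H and W must be divisible by the total downsampling factor). With the 4 down blocks and 4 latent channels of the SD v1.5 VAE, this gives the H/8, W/8 output described above.

```rust
/// Hypothetical helper (not part of the crate): computes the shape of the
/// parameters tensor produced for an input image of shape [B, 3, H, W].
///
/// Assumption: each down block except the last halves H and W exactly.
pub fn encoder_output_shape(
    image: [usize; 4],
    num_down_blocks: usize,
    latent_channels: usize,
) -> Result<[usize; 4], String> {
    let [b, c, h, w] = image;
    if c != 3 {
        return Err(format!("expected 3 input channels, got {c}"));
    }
    if num_down_blocks == 0 {
        return Err("need at least one down block".to_string());
    }
    // The last down block has no Downsample2D, so there are N - 1 halvings.
    let factor = 1usize << (num_down_blocks - 1);
    if h % factor != 0 || w % factor != 0 {
        return Err(format!("H and W must be divisible by {factor}"));
    }
    // conv_out emits mean and logvar concatenated on the channel axis;
    // the 1x1 quant_conv keeps that shape unchanged.
    Ok([b, 2 * latent_channels, h / factor, w / factor])
}
```

For a 512x512 image this yields `[B, 8, 64, 64]`: 8 channels (4 mean + 4 logvar) at one eighth of the input resolution.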

This module is the encoder-side mirror of crate::vae::VaeDecoder. The encode_with_scaling helper computes dist.sample_with_seed(seed) * scaling_factor, matching AutoencoderKL.encode(x).latent_dist.sample() * vae.config.scaling_factor. The bare Module::forward returns the raw [B, 2*L, h, w] parameters tensor (no split, no sample, no scaling), so callers can apply their own sampling strategy (e.g. .mode() for a deterministic latent).
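
The split, sample, and scale steps described above can be sketched on flat `f32` buffers. Everything here is a hypothetical stand-in rather than the crate's API: `DiagGaussian`, `sample_scaled`, and the SplitMix64 + Box-Muller noise source are assumptions made for illustration, and the noise will not match the crate's `randn_with_seed` or PyTorch's generator bit for bit. The logvar clamp to [-30, 20] follows the diffusers implementation.

```rust
/// Illustrative stand-in for the distribution type, over row-major buffers.
pub struct DiagGaussian {
    pub mean: Vec<f32>,
    pub logvar: Vec<f32>,
}

// SplitMix64: a small deterministic generator (an assumption of this sketch).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

// Uniform value in (0, 1]; excluding 0 keeps ln() finite.
fn unit_open(state: &mut u64) -> f64 {
    ((splitmix64(state) >> 11) as f64 + 1.0) / (1u64 << 53) as f64
}

// One standard-normal draw via the Box-Muller transform.
fn std_normal(state: &mut u64) -> f32 {
    let (u1, u2) = (unit_open(state), unit_open(state));
    ((-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()) as f32
}

impl DiagGaussian {
    /// Splits `[B, 2L, h, w]` parameters along the channel axis.
    /// `two_l` is the channel count 2L and `hw` is h * w.
    pub fn from_parameters(params: &[f32], b: usize, two_l: usize, hw: usize) -> Option<Self> {
        if two_l == 0 || two_l % 2 != 0 || hw == 0 || params.len() != b * two_l * hw {
            return None;
        }
        let half = two_l / 2 * hw; // elements per batch item in each half
        let mut mean = Vec::with_capacity(b * half);
        let mut logvar = Vec::with_capacity(b * half);
        for item in params.chunks_exact(2 * half) {
            mean.extend_from_slice(&item[..half]);
            // Clamp logvar to [-30, 20], as diffusers does.
            logvar.extend(item[half..].iter().map(|&v| v.clamp(-30.0, 20.0)));
        }
        Some(Self { mean, logvar })
    }

    /// Deterministic choice: the mode of a Gaussian is its mean.
    pub fn mode(&self) -> Vec<f32> {
        self.mean.clone()
    }

    /// Reparameterized sample: mean + exp(0.5 * logvar) * eps.
    pub fn sample_with_seed(&self, seed: u64) -> Vec<f32> {
        let mut state = seed;
        self.mean
            .iter()
            .zip(&self.logvar)
            .map(|(&m, &lv)| m + (0.5 * lv).exp() * std_normal(&mut state))
            .collect()
    }
}

/// Sample, then multiply by the scaling factor (0.18215 for SD v1.5).
pub fn sample_scaled(dist: &DiagGaussian, seed: u64, scaling_factor: f32) -> Vec<f32> {
    dist.sample_with_seed(seed).iter().map(|z| z * scaling_factor).collect()
}
```

A caller that wants reproducible latents without noise would use `mode()` and apply the scaling factor itself; a caller that wants stochastic latents passes a seed, and the same seed always gives the same sample.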

§REQ status (per `.design/ferrotorch-diffusion/vae_encoder.md`)

REQ	Status	Evidence
REQ-1	SHIPPED	`Encoder<T>` at `vae_encoder.rs:51..79` and `Encoder::new` at `vae_encoder.rs:81..153`; consumer: `VaeEncoder::new` at `vae_encoder.rs:307` builds it; itself consumed by `safetensors_loader.rs:425` `load_vae_encoder`
REQ-2	SHIPPED	`VaeEncoder<T>` at `vae_encoder.rs:288..297` and `VaeEncoder::new` at `vae_encoder.rs:299..316`; consumer: `safetensors_loader.rs:425` `load_vae_encoder`; `gpu/vae_encoder.rs:317` `GpuVaeEncoder::from_module` consumes its `state_dict()`
REQ-3	SHIPPED	`VaeEncoder::encode` at `vae_encoder.rs:325..328` and `DiagonalGaussianDistribution::from_parameters` at `vae_encoder.rs:471..501`; consumer: `vae_encoder.rs:349` `encode_with_scaling` invokes it
REQ-4	SHIPPED	`DiagonalGaussianDistribution::sample_with_seed` at `vae_encoder.rs:527..539`, `mode` at `vae_encoder.rs:506..508`, `randn_with_seed` at `vae_encoder.rs:548..587`; consumer: `vae_encoder.rs:350` `encode_with_scaling` calls `dist.sample_with_seed(seed)`
REQ-5	SHIPPED	`encode_with_scaling` at `vae_encoder.rs:348..361`; consumer: re-exported via `lib.rs:148` `pub use vae_encoder::VaeEncoder` (boundary method IS the public API per goal.md S5 grandfathering)
REQ-6	SHIPPED	`Module<T>::forward` at `vae_encoder.rs:369..382`; consumer: `vae_encoder.rs:326` `encode` calls `self.forward(image)?` to produce the `[B, 2L, h, w]` parameters
REQ-7	SHIPPED	`Module<T>::load_state_dict` at `vae_encoder.rs:421..444`; consumer: `safetensors_loader.rs:394` `VaeEncoder::load_hf_state_dict` calls `self.load_state_dict(&remapped, strict)` after stripping the `vae.` prefix
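
The prefix stripping mentioned for REQ-7 can be sketched with a plain `HashMap`. `strip_prefix_keys` is a hypothetical helper, not the crate's `load_hf_state_dict`; in particular, dropping entries that lack the prefix is an assumption of this sketch, and the real loader may treat them differently (for example by reporting them under `strict`).

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns a new map holding only the entries whose key
/// starts with `prefix`, with that prefix removed from each key.
///
/// Assumption: keys without the prefix are silently dropped.
pub fn strip_prefix_keys<V>(state: HashMap<String, V>, prefix: &str) -> HashMap<String, V> {
    state
        .into_iter()
        .filter_map(|(key, value)| {
            // `strip_prefix` returns None when the key does not start with `prefix`.
            key.strip_prefix(prefix).map(|rest| (rest.to_string(), value))
        })
        .collect()
}
```

Applied with the prefix `"vae."`, a checkpoint key such as `vae.encoder.conv_in.weight` becomes `encoder.conv_in.weight`, which is the form a module-level `load_state_dict` would expect.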

Structs§

DiagonalGaussianDistribution: Diagonal Gaussian over latent space — the same parameterization diffusers.models.autoencoders.vae.DiagonalGaussianDistribution uses. Holds mean and logvar tensors (both [B, L, h, w]) split from the encoder’s concatenated parameters output.
Encoder: The bare Encoder half — matches diffusers.models.autoencoders.vae.Encoder.
VaeEncoder: AutoencoderKL-style VAE encoder = Encoder + quant_conv.

Type Aliases§

VaeEncoderConfig: Type alias — the SD VAE encoder and decoder share their config shape (mirrors diffusers.AutoencoderKL.config, which spans both halves).