//! GPU forward paths for SD-1.5 sub-models (gated on `feature = "cuda"`).
//!
//! This module is the GPU twin of the CPU [`crate::vae`] /
//! [`crate::unet`] / [`crate::clip_text_encoder`] families. Sub-models
//! are added one at a time as their kernels land in
//! `ferrotorch-gpu`. The current surface:
//!
//! - [`vae::GpuVaeDecoder`]: VAE decoder forward path, mirroring
//!   [`crate::vae::VaeDecoder`] op-for-op on CUDA.
//! - [`vae_encoder::GpuVaeEncoder`]: VAE encoder forward path,
//!   mirroring [`crate::vae_encoder::VaeEncoder`] op-for-op on CUDA
//!   (#1177). Composes the existing `ferrotorch-gpu` element kernels
//!   plus `gpu_philox_normal` for the diagonal-Gaussian sample step.
//! - [`clip::GpuClipTextEncoder`]: SD-1.5 CLIP text-encoder forward
//!   path, mirroring [`crate::clip_text_encoder::ClipTextEncoder`]
//!   op-for-op on CUDA.
//! - [`unet::GpuUNet2DConditional`]: SD-1.5 UNet2DConditionModel
//!   forward path, mirroring [`crate::unet::UNet2DConditionModel`]
//!   op-for-op on CUDA.
//! - [`pipeline::GpuStableDiffusionPipeline`]: end-to-end SD-1.5
//!   text-to-image generation pipeline composing the GPU CLIP text
//!   encoder, UNet, and VAE decoder above with the host-side
//!   [`crate::scheduler::DDIMScheduler`]. Mirrors
//!   [`crate::pipeline::StableDiffusionPipeline`] op-for-op on CUDA.
pub use clip::GpuClipTextEncoder;
pub use pipeline::GpuStableDiffusionPipeline;
pub use unet::GpuUNet2DConditional;
pub use vae::GpuVaeDecoder;
pub use vae_encoder::GpuVaeEncoder;