pub struct VitPatchEmbed { /* private fields */ }
ViT patch-embedding layer with [CLS] token and learnable positions.
Implementations
impl VitPatchEmbed
pub fn new(config: VitPatchConfig, rng: &mut VisionRng) -> VisionResult<Self>
Construct a new patch embedder with random parameters.
The projection weight is initialised N(0, 1/√patch_dim) (the fan-in
scaling used by the reference implementations); biases are near-zero,
the class token is N(0, 0.02) (ViT’s truncated-normal scale), and the
positional embedding is N(0, 0.02).
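The initialisation scheme above can be sketched without the crate. This is a minimal illustration, not the crate's code: `SketchRng`, `fan_in_std`, and `init_normal` are hypothetical names, `SketchRng` (a SplitMix64 generator with Box-Muller sampling) merely stands in for `VisionRng`, whose interface is not shown here, and the second parameter of N(0, ·) is assumed to be a standard deviation.

```rust
/// Hypothetical stand-in for `VisionRng`: a SplitMix64 generator.
struct SketchRng(u64);

impl SketchRng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform sample in (0, 1], built from the top 53 bits.
    fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 11) + 1) as f64 / (1u64 << 53) as f64
    }

    /// Normal sample with mean 0 and standard deviation `std` (Box-Muller).
    fn next_normal(&mut self, std: f32) -> f32 {
        let u1 = self.next_unit();
        let u2 = self.next_unit();
        let z = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos();
        (z as f32) * std
    }
}

/// Fan-in scale 1/sqrt(patch_dim), assumed here to be a standard deviation.
fn fan_in_std(patch_dim: usize) -> f32 {
    1.0 / (patch_dim as f32).sqrt()
}

/// Fill a parameter buffer of `len` values with N(0, std) samples.
fn init_normal(len: usize, std: f32, rng: &mut SketchRng) -> Vec<f32> {
    (0..len).map(|_| rng.next_normal(std)).collect()
}
```

Assuming patch_dim = n_channels · patch_size², a 3-channel model with 16×16 patches has patch_dim = 768, so `fan_in_std(768)` is roughly 0.036; the class token and positional embedding would use `init_normal(len, 0.02, &mut rng)`.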
Errors
Propagates VitPatchConfig::validate failures.
pub fn config(&self) -> &VitPatchConfig
Read-only access to the configuration.
pub fn forward(&self, image: &[f32]) -> VisionResult<Vec<f32>>
Embed an image into a token sequence.
image must be [n_channels × image_size × image_size] row-major (CHW).
The returned sequence is [(n_patches + 1) × d_model]: a prepended
[CLS] token followed by the n_patches projected patch tokens, with
the learnable positional embedding added element-wise.
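The shapes involved can be checked with a small standalone sketch. Both functions are hypothetical helpers, not part of the crate; `patch_size` is an assumed configuration field (the documentation above names only `n_channels`, `image_size`, `n_patches`, `d_model`, and `patch_dim`), and the channel-major ordering inside a flattened patch is likewise an assumption.

```rust
/// Hypothetical helper: returns (n_tokens, output_len) for a square image,
/// where n_tokens = n_patches + 1 accounts for the prepended [CLS] token.
/// Returns None when the image does not divide evenly into patches.
fn token_sequence_len(
    image_size: usize,
    patch_size: usize,
    d_model: usize,
) -> Option<(usize, usize)> {
    if patch_size == 0 || image_size % patch_size != 0 {
        return None;
    }
    let per_side = image_size / patch_size;
    let n_tokens = per_side * per_side + 1;
    Some((n_tokens, n_tokens * d_model))
}

/// Hypothetical helper: copies one patch out of a row-major CHW image
/// and flattens it channel by channel, row by row.
fn extract_patch(
    image: &[f32],
    n_channels: usize,
    image_size: usize,
    patch_size: usize,
    patch_row: usize,
    patch_col: usize,
) -> Vec<f32> {
    let mut patch = Vec::with_capacity(n_channels * patch_size * patch_size);
    for c in 0..n_channels {
        for dy in 0..patch_size {
            let y = patch_row * patch_size + dy;
            // Offset of the first pixel of this patch row within channel c.
            let start = c * image_size * image_size + y * image_size + patch_col * patch_size;
            patch.extend_from_slice(&image[start..start + patch_size]);
        }
    }
    patch
}
```

For a 224×224 image with 16×16 patches and d_model = 768, `token_sequence_len(224, 16, 768)` gives 197 tokens (196 patches plus [CLS]) and an output of 151296 values.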
Errors
VisionError::DimensionMismatch if image.len() does not equal n_channels · image_size².
VisionError::NonFinite if a non-finite value is produced.
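The two error conditions can be mirrored in a few lines. This is only a sketch of the checks as described: `SketchError` and both functions are hypothetical, and the real `VisionError` variants may carry different fields or none at all.

```rust
/// Hypothetical error type that mirrors the two documented conditions.
#[derive(Debug, PartialEq)]
enum SketchError {
    DimensionMismatch { expected: usize, actual: usize },
    NonFinite { index: usize },
}

/// Check that the input buffer holds exactly n_channels * image_size^2 values.
fn check_image_len(
    image: &[f32],
    n_channels: usize,
    image_size: usize,
) -> Result<(), SketchError> {
    let expected = n_channels * image_size * image_size;
    if image.len() != expected {
        return Err(SketchError::DimensionMismatch {
            expected,
            actual: image.len(),
        });
    }
    Ok(())
}

/// Report the first NaN or infinite value in a produced token buffer.
fn check_finite(tokens: &[f32]) -> Result<(), SketchError> {
    match tokens.iter().position(|v| !v.is_finite()) {
        Some(index) => Err(SketchError::NonFinite { index }),
        None => Ok(()),
    }
}
```

A caller would typically run the length check before doing any work and the finiteness check on the output, so that a malformed input is rejected early and a numerical blow-up is never returned silently.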
Auto Trait Implementations
impl Freeze for VitPatchEmbed
impl RefUnwindSafe for VitPatchEmbed
impl Send for VitPatchEmbed
impl Sync for VitPatchEmbed
impl Unpin for VitPatchEmbed
impl UnsafeUnpin for VitPatchEmbed
impl UnwindSafe for VitPatchEmbed
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.