oxicuda-vision 0.2.0

Vision Transformer & CLIP primitives for OxiCUDA: ViT patch embedding, multi-head self-attention, CLIP contrastive learning, FPN, RoI align, DETR decoder — pure Rust, zero CUDA SDK dependency.

//! ConvNeXt modern-CNN components.
//!
//! Provides:
//! - **`ConvNextBlock`**: depthwise 7×7 same-padded convolution → channel
//!   LayerNorm → 1×1 inverted-bottleneck expansion → GELU → 1×1 projection →
//!   per-channel layer scale, plus a residual connection (Liu et al., CVPR 2022).

pub mod block;

pub use block::{ConvNextBlock, ConvNextConfig};
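
The data flow listed in the module docs can be illustrated with a small CPU reference. This is a minimal sketch, not the crate's implementation: `SketchBlock`, its fields, and `constant` are hypothetical names, and it does not use the real `ConvNextBlock` or `ConvNextConfig` API, which this module does not show. It assumes a CHW `f32` layout, zero padding of 3, a 4× expansion ratio (as in the ConvNeXt paper), a LayerNorm epsilon of 1e-6, and the tanh approximation of GELU.

```rust
/// Hypothetical stand-in for the block's parameters (not the crate's type).
struct SketchBlock {
    c: usize,        // channel count
    dw_w: Vec<f32>,  // depthwise kernels, [c][7 * 7]
    dw_b: Vec<f32>,  // depthwise bias, [c]
    ln_g: Vec<f32>,  // LayerNorm gain, [c]
    ln_b: Vec<f32>,  // LayerNorm bias, [c]
    w1: Vec<f32>,    // 1x1 expansion weights, [4c][c]
    b1: Vec<f32>,    // expansion bias, [4c]
    w2: Vec<f32>,    // 1x1 projection weights, [c][4c]
    b2: Vec<f32>,    // projection bias, [c]
    scale: Vec<f32>, // per-channel layer scale, [c]
}

/// GELU, tanh approximation (an assumption; the crate may use the erf form).
fn gelu(x: f32) -> f32 {
    let k = (2.0 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (k * (x + 0.044715 * x * x * x)).tanh())
}

impl SketchBlock {
    /// Fills every weight with a fixed constant; only useful for demos.
    fn constant(c: usize, scale: f32) -> Self {
        let e = 4 * c;
        SketchBlock {
            c,
            dw_w: vec![0.1; c * 49],
            dw_b: vec![0.0; c],
            ln_g: vec![1.0; c],
            ln_b: vec![0.0; c],
            w1: vec![0.5; e * c],
            b1: vec![0.1; e],
            w2: vec![0.25; c * e],
            b2: vec![0.0; c],
            scale: vec![scale; c],
        }
    }

    /// `x` is a CHW tensor of length `c * h * w`; the output has the same shape.
    fn forward(&self, x: &[f32], h: usize, w: usize) -> Vec<f32> {
        let (c, e) = (self.c, 4 * self.c);
        assert_eq!(x.len(), c * h * w);

        // 1. Depthwise 7x7 convolution with zero "same" padding (pad = 3).
        let mut dw = vec![0.0f32; c * h * w];
        for ch in 0..c {
            for y in 0..h {
                for xx in 0..w {
                    let mut acc = self.dw_b[ch];
                    for ky in 0..7 {
                        for kx in 0..7 {
                            // Input coordinates, offset by +3 to stay unsigned.
                            let (iy, ix) = (y + ky, xx + kx);
                            if iy < 3 || ix < 3 || iy - 3 >= h || ix - 3 >= w {
                                continue; // falls in the zero padding
                            }
                            acc += self.dw_w[ch * 49 + ky * 7 + kx]
                                * x[(ch * h + iy - 3) * w + ix - 3];
                        }
                    }
                    dw[(ch * h + y) * w + xx] = acc;
                }
            }
        }

        // 2-5. Per pixel: LayerNorm over channels, expand, GELU, project,
        // then layer scale and the residual add onto a copy of the input.
        let mut out = x.to_vec();
        for p in 0..h * w {
            let v: Vec<f32> = (0..c).map(|ch| dw[ch * h * w + p]).collect();
            let mean = v.iter().sum::<f32>() / c as f32;
            let var = v.iter().map(|a| (a - mean).powi(2)).sum::<f32>() / c as f32;
            let inv = 1.0 / (var + 1e-6).sqrt();
            let n: Vec<f32> = (0..c)
                .map(|i| (v[i] - mean) * inv * self.ln_g[i] + self.ln_b[i])
                .collect();
            let hid: Vec<f32> = (0..e)
                .map(|j| gelu(self.b1[j] + (0..c).map(|i| self.w1[j * c + i] * n[i]).sum::<f32>()))
                .collect();
            for i in 0..c {
                let y = self.b2[i] + (0..e).map(|j| self.w2[i * e + j] * hid[j]).sum::<f32>();
                out[i * h * w + p] += self.scale[i] * y;
            }
        }
        out
    }
}
```

Calling `SketchBlock::constant(2, 0.0).forward(&x, 3, 3)` on an 18-element input returns the input unchanged, because a zero layer scale leaves only the residual path. That is why ConvNeXt-style blocks initialised with a small layer scale start out close to the identity.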