oxicuda-vision 0.1.7

Vision Transformer & CLIP primitives for OxiCUDA: ViT patch embedding, multi-head self-attention, CLIP contrastive learning, FPN, RoI align, DETR decoder — pure Rust, zero CUDA SDK dependency.
//! Object detection components.
//!
//! Provides:
//! - **`roi_align`**: CPU reference RoI Align matching the `roi_align_ptx` kernel.
//! - **`DetrDecoder`**: multi-layer DETR decoder (pre-norm self-attn + cross-attn + FFN).
//! - **`bipartite_match`**: greedy+2-opt bipartite matching for DETR set-prediction loss.

pub mod detr_decoder;
pub mod roi_align;
pub mod set_match;

pub use detr_decoder::{DetrConfig, DetrDecoder, DetrDecoderLayer};
pub use roi_align::roi_align;
pub use set_match::{MatchCost, bipartite_match};
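
The module docs describe `bipartite_match` as greedy matching followed by a 2-opt refinement. As a rough illustration of the greedy stage only, here is a minimal sketch. It assumes a row-major `n_pred x n_gt` cost matrix. The function name `greedy_match_sketch` and its signature are hypothetical and are not taken from the crate. The actual `bipartite_match` and `MatchCost` API may differ.

```rust
/// Hypothetical helper (not the crate's `bipartite_match`): greedy assignment
/// over a row-major `n_pred x n_gt` cost matrix. Returns `(pred, gt)` pairs in
/// the order they were selected. The 2-opt refinement is intentionally omitted.
fn greedy_match_sketch(cost: &[f32], n_pred: usize, n_gt: usize) -> Vec<(usize, usize)> {
    assert_eq!(cost.len(), n_pred * n_gt, "cost must have n_pred * n_gt entries");

    let mut pairs: Vec<(usize, usize)> = Vec::new();
    let mut pred_used = vec![false; n_pred];
    let mut gt_used = vec![false; n_gt];

    // At most min(n_pred, n_gt) pairs can be formed.
    for _ in 0..n_pred.min(n_gt) {
        // Find the cheapest pair among still-unassigned predictions and targets.
        let mut best: Option<(usize, usize, f32)> = None;
        for p in 0..n_pred {
            if pred_used[p] {
                continue;
            }
            for g in 0..n_gt {
                if gt_used[g] {
                    continue;
                }
                let c = cost[p * n_gt + g];
                if best.map_or(true, |(_, _, b)| c < b) {
                    best = Some((p, g, c));
                }
            }
        }
        // Commit the cheapest pair and mark both sides as used.
        if let Some((p, g, _)) = best {
            pred_used[p] = true;
            gt_used[g] = true;
            pairs.push((p, g));
        }
    }
    pairs
}
```

Greedy selection is not guaranteed to minimize the total cost, which is presumably why the module pairs it with a 2-opt pass that tries swapping assignments between matched pairs.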