oxicuda-dnn 0.1.8

OxiCUDA DNN - GPU-accelerated deep learning primitives (cuDNN equivalent)
//! Fused linear (fully-connected) layer operations.
//!
//! Provides GPU-accelerated fused GEMM + bias + activation kernels for
//! dense layers in neural networks. By fusing these operations into a
//! single kernel pass, we eliminate intermediate memory round-trips for
//! the bias addition and activation function.
//!
//! | Sub-module           | Description                                   |
//! |----------------------|-----------------------------------------------|
//! | [`mod@fused_linear`] | Fused `Y = activation(X @ W^T + bias)` kernel |

pub mod fused_linear;

pub use fused_linear::{FusedLinearConfig, fused_linear};
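
// Illustrative CPU reference for the computation named in the module docs,
// `Y = activation(X @ W^T + bias)`. This is only a sketch for reasoning about
// (or checking) results; it is not the crate's GPU kernel and does not use
// `FusedLinearConfig`. The name `fused_linear_reference`, the row-major
// layout (X: m x k, W: n x k, bias: n, Y: m x n), the `f32` element type, and
// the closure-based activation are all assumptions made for this example.
#[allow(dead_code)]
fn fused_linear_reference(
    x: &[f32],
    w: &[f32],
    bias: &[f32],
    m: usize,
    k: usize,
    n: usize,
    activation: impl Fn(f32) -> f32,
) -> Vec<f32> {
    assert_eq!(x.len(), m * k, "X must hold m * k elements");
    assert_eq!(w.len(), n * k, "W must hold n * k elements");
    assert_eq!(bias.len(), n, "bias must hold n elements");

    let mut y = vec![0.0f32; m * n];
    for row in 0..m {
        for col in 0..n {
            // Dot product of X[row, :] with W[col, :], i.e. (X @ W^T)[row, col].
            let mut acc = 0.0f32;
            for i in 0..k {
                acc += x[row * k + i] * w[col * k + i];
            }
            // Bias and activation are applied while the accumulator is still
            // live, so the pre-activation value is never written out. This is
            // the round-trip that fusion is meant to avoid.
            y[row * n + col] = activation(acc + bias[col]);
        }
    }
    y
}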