oxicuda-quant 0.1.1

GPU-accelerated quantization and model compression engine for OxiCUDA
//! # Pruning
//!
//! Weight-pruning strategies for model compression.
//!
//! | Module       | Strategy                                               |
//! |--------------|--------------------------------------------------------|
//! | `mask`       | [`SparseMask`] — boolean weight mask primitives        |
//! | `magnitude`  | [`MagnitudePruner`] — unstructured L1/L2 pruning       |
//! | `structured` | [`StructuredPruner`] — channel / filter / head pruning |

pub mod magnitude;
pub mod mask;
pub mod structured;

pub use magnitude::{MagnitudeNorm, MagnitudePruner};
pub use mask::SparseMask;
pub use structured::{PruneGranularity, StructuredPruner};