Module structured


§Structured Pruning

Removes entire structural units (channels, filters, or attention heads) rather than individual weights. Each pruned unit corresponds to whole rows or columns of a weight matrix; once zeroed, those rows or columns can be physically removed, leaving a smaller dense matrix. This yields real speedups on standard hardware, unlike unstructured sparsity, which requires special sparse kernels.
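As a rough illustration of what "physically removed" means, the sketch below compacts a row-major `[n_out, n_in]` buffer by dropping the rows flagged for removal. The function name `compact_rows` and the `keep` mask are hypothetical and are not part of this module's API.

```rust
/// Drop the rows marked `false` in `keep` from a row-major `[n_out, n_in]`
/// matrix stored as a flat slice. Returns the compacted buffer and the new
/// row count. Hypothetical helper, not this module's API.
fn compact_rows(weights: &[f32], n_in: usize, keep: &[bool]) -> (Vec<f32>, usize) {
    assert!(n_in > 0, "n_in must be non-zero");
    assert_eq!(weights.len(), keep.len() * n_in, "shape mismatch");

    let mut out = Vec::with_capacity(weights.len());
    let mut kept = 0;
    // Each chunk of `n_in` values is one output row.
    for (row, &k) in weights.chunks_exact(n_in).zip(keep) {
        if k {
            out.extend_from_slice(row);
            kept += 1;
        }
    }
    (out, kept)
}
```

The result is an ordinary dense matrix with fewer rows, so downstream code needs no sparse kernels. Any layer that consumes these outputs would have to drop the matching input columns as well.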

§Granularities

| Granularity | Unit removed | Layout assumption |
|---|---|---|
| `Channel` | Output channel | `[n_out, n_in]` row-major |
| `Filter` | Convolutional filter | `[n_filters, filter_size]` flat |
| `Head` | Attention head | `[n_heads × head_dim, ...]` |
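One way to read these layouts is that every unit occupies a contiguous span of the flat weight buffer. The sketch below computes that span for a given unit index. The `Granularity` enum here is a hypothetical stand-in rather than the crate's `PruneGranularity`, and `row_len` is an assumed name for the trailing dimensions that the `Head` layout leaves unspecified.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the granularities in the table above;
/// not the crate's own enum.
#[derive(Clone, Copy, Debug)]
enum Granularity {
    /// One row of a `[n_out, n_in]` matrix.
    Channel { n_in: usize },
    /// One row of a `[n_filters, filter_size]` matrix.
    Filter { filter_size: usize },
    /// `head_dim` consecutive rows; `row_len` is an assumed trailing size.
    Head { head_dim: usize, row_len: usize },
}

/// Range of flat-buffer indices covered by unit `idx`.
fn unit_span(g: Granularity, idx: usize) -> Range<usize> {
    let len = match g {
        Granularity::Channel { n_in } => n_in,
        Granularity::Filter { filter_size } => filter_size,
        Granularity::Head { head_dim, row_len } => head_dim * row_len,
    };
    idx * len..(idx + 1) * len
}
```

Under these assumptions, `unit_span(Granularity::Head { head_dim: 2, row_len: 3 }, 1)` covers indices `6..12`, that is, the two rows belonging to the second head.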

Structs§

StructuredPruner: Removes structural units based on L2 norm importance.
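For intuition, L2 norm importance can be sketched as follows: score each contiguous unit by the square root of its sum of squared weights, then pick the lowest-scoring units for removal. The function names and the "remove the `k` smallest" selection rule are assumptions made for illustration; the actual `StructuredPruner` interface may differ.

```rust
/// L2 norm of each contiguous unit of `unit_len` values.
/// Hypothetical helper, not the `StructuredPruner` API.
fn unit_l2_norms(weights: &[f32], unit_len: usize) -> Vec<f32> {
    assert!(
        unit_len > 0 && weights.len() % unit_len == 0,
        "shape mismatch"
    );
    weights
        .chunks_exact(unit_len)
        .map(|unit| unit.iter().map(|w| w * w).sum::<f32>().sqrt())
        .collect()
}

/// Indices of the `k` units with the smallest norm, returned in
/// ascending index order. Ties go to the lower index.
fn least_important(norms: &[f32], k: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..norms.len()).collect();
    // Stable sort by norm; `total_cmp` gives a total order on f32.
    order.sort_by(|&a, &b| norms[a].total_cmp(&norms[b]));
    order.truncate(k);
    order.sort_unstable();
    order
}
```

With three units of two weights each, `[3, 4]`, `[0.1, 0]`, and `[1, 0]`, the norms are approximately 5, 0.1, and 1, so asking for the two least important units selects indices 1 and 2.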

Enums§

PruneGranularity: Structural unit to remove during pruning.
