Module compression

Model compression and quantization for efficient embedding deployment

This module provides compression techniques including quantization, pruning, knowledge distillation, and neural architecture search (NAS).
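To make the quantization idea concrete, here is a minimal, self-contained sketch of 8-bit affine quantization of an embedding vector. It is an illustration under stated assumptions, not this module's implementation: `AffineParams`, `fit_params`, `quantize`, and `dequantize` are hypothetical names, and the actual `QuantizationConfig`, `QuantizationParams`, and `QuantizationProcessor` types may work differently.

```rust
/// Hypothetical parameters for 8-bit affine quantization.
/// This is NOT the module's `QuantizationParams`; it is defined here for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct AffineParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Derives a scale and zero point from the observed value range.
/// The range is widened to include 0.0 so that zero maps to an exact code.
pub fn fit_params(values: &[f32]) -> AffineParams {
    let min = values.iter().copied().fold(0.0_f32, f32::min);
    let max = values.iter().copied().fold(0.0_f32, f32::max);
    let range = max - min;
    // Fall back to a scale of 1.0 when every value is zero.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = (-min / scale).round() as i32;
    AffineParams { scale, zero_point }
}

/// Maps each f32 to a u8 code: round(v / scale) + zero_point, clamped to 0..=255.
pub fn quantize(values: &[f32], p: AffineParams) -> Vec<u8> {
    values
        .iter()
        .map(|&v| {
            let q = (v / p.scale).round() as i32 + p.zero_point;
            q.clamp(0, 255) as u8
        })
        .collect()
}

/// Recovers approximate f32 values: (code - zero_point) * scale.
pub fn dequantize(codes: &[u8], p: AffineParams) -> Vec<f32> {
    codes
        .iter()
        .map(|&q| (q as i32 - p.zero_point) as f32 * p.scale)
        .collect()
}
```

Calling `fit_params` on a vector, then `quantize` and `dequantize`, returns values that differ from the originals by at most about one quantization step (`scale`), while storage drops from four bytes per element to one.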

Structs

Architecture
ArchitectureCandidate
CompressedModel
CompressionStats
CompressionTarget
Results and data structures
DistillationConfig
Knowledge distillation configuration
DistillationProcessor
Knowledge distillation processor
DistillationResult
HardwareConstraints
Hardware constraints for NAS
LayerConfig
ModelCompressionManager
Model compression manager
NASConfig
Neural Architecture Search configuration
NASProcessor
Neural Architecture Search processor
OptimalArchitecture
PruningConfig
Pruning configuration
PruningProcessor
Pruning processor
PruningResult
QuantizationConfig
Quantization configuration
QuantizationParams
Quantization parameters
QuantizationProcessor
Quantization processor
QuantizationResult

Enums

ActivationType
DistillationType
Types of knowledge distillation
HardwarePlatform
Target hardware platforms
LayerType
OptimizationTarget
Optimization targets
PruningMethod
Pruning methods
PruningSchedule
Pruning schedules
QuantizationMethod
Quantization methods
SearchSpace
Architecture search spaces
SearchStrategy
Neural architecture search strategies
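
As an illustration of what a pruning method can look like, the sketch below implements unstructured magnitude pruning: it zeroes the smallest-magnitude weights until a target sparsity is reached. This is an assumption-laden example, not this module's API. `magnitude_prune` is a hypothetical function, and the variants of `PruningMethod` and the behavior of `PruningProcessor` are not documented on this page.

```rust
/// Hypothetical helper, for illustration only (not part of this module).
/// Zeroes the `floor(len * sparsity)` weights with the smallest absolute value
/// and returns how many weights were zeroed.
pub fn magnitude_prune(weights: &mut [f32], sparsity: f32) -> usize {
    // Clamp the requested sparsity to [0, 1] before computing the target count.
    let target = ((weights.len() as f32) * sparsity.clamp(0.0, 1.0)).floor() as usize;
    if target == 0 {
        return 0;
    }

    // Sort indices by ascending magnitude so the weakest weights come first.
    let mut order: Vec<usize> = (0..weights.len()).collect();
    order.sort_by(|&a, &b| weights[a].abs().total_cmp(&weights[b].abs()));

    // Zero out the `target` smallest-magnitude weights in place.
    for &i in order.iter().take(target) {
        weights[i] = 0.0;
    }
    target
}
```

With a sparsity of 0.5, half of the weights (those closest to zero) are set to zero and the larger-magnitude weights are left unchanged.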