Tensor Ring Compression for LLM weights
This module implements tensor ring (TR) compression for LLM weight matrices, reducing the parameter count while aiming to preserve model accuracy.
§Mathematical Foundation
Tensor Ring decomposition represents a high-dimensional tensor as a ring of 3D core tensors:
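W(i₁,…,iₙ) = Tr[G₁(i₁) × G₂(i₂) × … × Gₙ(iₙ)]
where each Gₖ(iₖ) ∈ R^(rₖ₋₁×rₖ) is the iₖ-th slice of the k-th core and rₖ are the TR ranks controlling compression. The trace already sums over all rank indices, and it requires rₙ = r₀ so that the matrix product is square and the ring closes.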
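To make the trace formula concrete, here is a minimal sketch in plain Rust that evaluates a single tensor entry from its cores. The `Core` struct, its row-major slice layout, and the `matmul` and `tr_element` helpers are assumptions made for illustration; they are not taken from this crate's API.

```rust
/// One TR core, stored as a list of slices G_k(i).
/// Hypothetical layout for illustration: each slice is an
/// `r_prev x r_next` matrix in row-major order.
struct Core {
    r_prev: usize,
    r_next: usize,
    slices: Vec<Vec<f64>>,
}

/// Multiply an (a x b) matrix by a (b x c) matrix, both row-major.
fn matmul(x: &[f64], y: &[f64], a: usize, b: usize, c: usize) -> Vec<f64> {
    let mut out = vec![0.0; a * c];
    for i in 0..a {
        for k in 0..b {
            let xik = x[i * b + k];
            for j in 0..c {
                out[i * c + j] += xik * y[k * c + j];
            }
        }
    }
    out
}

/// Evaluate W(i_1, ..., i_n) = Tr[G_1(i_1) G_2(i_2) ... G_n(i_n)].
fn tr_element(cores: &[Core], idx: &[usize]) -> f64 {
    assert!(!cores.is_empty());
    assert_eq!(cores.len(), idx.len());
    let r0 = cores[0].r_prev;
    // The ring must close: the last core's trailing rank equals r0.
    assert_eq!(cores[cores.len() - 1].r_next, r0);

    // Running product of the selected slices; always has r0 rows.
    let mut acc = cores[0].slices[idx[0]].clone();
    let mut cols = cores[0].r_next;
    for (core, &i) in cores.iter().zip(idx).skip(1) {
        assert_eq!(core.r_prev, cols, "adjacent TR ranks must match");
        acc = matmul(&acc, &core.slices[i], r0, core.r_prev, core.r_next);
        cols = core.r_next;
    }
    // `acc` is now r0 x r0; its trace is the tensor entry.
    (0..r0).map(|d| acc[d * r0 + d]).sum()
}
```

For two cores, `tr_element(&[g1, g2], &[i, j])` returns the matrix entry W(i, j). Reconstructing a full matrix this way costs one small matrix product per entry, so a practical implementation would more likely contract the cores in bulk.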
§Compression Ratio
For a weight matrix W ∈ R^(m×n) with TR ranks [r₀, r₁, r₂]:
- Original parameters: m × n
- TR parameters: r₀×m×r₁ + r₁×n×r₂
- Compression ratio: (m × n) / (r₀×m×r₁ + r₁×n×r₂)
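The parameter counts above can be checked with a few lines of arithmetic. The sketch below is a standalone illustration; `tr_param_count` and `compression_ratio` are hypothetical helper names and are not the crate's `compression_ratio()` method.

```rust
/// Number of parameters in a two-core TR factorization of an m x n matrix
/// with ranks [r0, r1, r2]: r0*m*r1 + r1*n*r2.
/// (Hypothetical helper for illustration.)
fn tr_param_count(m: usize, n: usize, ranks: [usize; 3]) -> usize {
    let [r0, r1, r2] = ranks;
    r0 * m * r1 + r1 * n * r2
}

/// Ratio of dense parameters (m*n) to TR parameters.
/// Values above 1.0 mean the TR form is smaller.
fn compression_ratio(m: usize, n: usize, ranks: [usize; 3]) -> f64 {
    (m * n) as f64 / tr_param_count(m, n, ranks) as f64
}
```

As a worked case, a 4096 × 4096 matrix has 16,777,216 parameters. With ranks [4, 16, 4] the TR form needs 4×4096×16 + 16×4096×4 = 524,288 parameters, so `compression_ratio(4096, 4096, [4, 16, 4])` gives 32. The ratio drops below 1 once the ranks grow large enough, so rank choice decides whether the decomposition saves anything at all.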
§Example
use god_gragh::transformer::optimization::{TensorRingCompressor, CompressionConfig};

let config = CompressionConfig::new()
    .with_target_ranks(vec![32, 64])
    .with_layers(vec!["qkv".to_string(), "mlp".to_string()]);
let compressor = TensorRingCompressor::new(config);

// Compress a weight matrix
// let compressed_graph = compressor.compress_graph(&graph)?;

// Query compression ratio
// println!("Compression ratio: {:.2}x", compressor.compression_ratio());
Structs§
- CompressionConfig - Configuration for tensor ring compression
- CompressionReport - Overall compression report
- LayerCompressionReport - Compression report for a single layer
- TensorRingCompressor - Tensor Ring compressor for LLM weights
Functions§
- adaptive_rank_selection - Adaptive rank selection based on singular value decay
- mixed_precision_compress - Mixed precision compression strategy