haagenti-python
Python bindings for the Haagenti tensor compression library.
Features
- HCT Format: Block-compressed tensor storage with LZ4/Zstd compression
- 50-70% compression typical for neural network weights (fp16/bf16)
- 2-5x faster loading than safetensors (planned, pending GPU decompression)
- Progressive loading via HoloTensor (coming in Phase 4)
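
To make the idea of block-compressed storage concrete, here is a minimal sketch of the general technique. It is not the HCT format and does not use the haagenti API: it uses the standard-library `zlib` as a stand-in for LZ4/Zstd, and `BLOCK_SIZE`, `compress_blocks`, `decompress_blocks`, and `read_block` are hypothetical names chosen for illustration.

```python
import zlib

# Hypothetical block size for illustration; not taken from the HCT format.
BLOCK_SIZE = 64 * 1024


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks and compress each one independently."""
    return [
        zlib.compress(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def decompress_blocks(blocks: list[bytes]) -> bytes:
    """Decompress every block and concatenate the results in order."""
    return b"".join(zlib.decompress(block) for block in blocks)


def read_block(blocks: list[bytes], index: int) -> bytes:
    """Decompress a single block without touching the others."""
    return zlib.decompress(blocks[index])
```

Because each block is compressed on its own, a reader can decompress only the blocks it needs, or decompress several blocks in parallel, at the cost of a slightly lower ratio than compressing the whole buffer at once.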
Installation
# Build from source (requires Rust 1.85+)
# Or build a wheel
Quick Start
# Compress a tensor
=
=
# Load a tensor
=
assert
# Convert safetensors to HCT
Low-level API
# Read HCT files
=
=
# Decompress all data
=
# Write HCT files
=
Compression Algorithms
| Algorithm | Speed | Ratio | Best For |
|---|---|---|---|
| LZ4 | Fast | 1.5-2x | Real-time loading |
| Zstd | Medium | 2-3x | Storage efficiency |
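
The Ratio column and the percentage quoted under Features describe the same quantity in two ways. The sketch below converts between them, assuming that "ratio" means original size divided by compressed size and that the percentage means space saved; both readings are assumptions, and the function names are hypothetical.

```python
def savings_from_ratio(ratio: float) -> float:
    """Fraction of space saved for a given ratio (original / compressed)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


def ratio_from_savings(savings: float) -> float:
    """Compression ratio (original / compressed) for a fraction of space saved."""
    if not 0 <= savings < 1:
        raise ValueError("savings must be in [0, 1)")
    return 1.0 / (1.0 - savings)
```

Under these definitions a 2x ratio corresponds to 50% saved and a 3x ratio to about 67% saved, so `savings_from_ratio(2.0)` returns `0.5`.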
Data Types
- F32 - 32-bit float (default)
- F16 - 16-bit float
- BF16 - BFloat16
- I8 - 8-bit integer (quantized)
- I4 - 4-bit integer (quantized)
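
As a rough guide to what each data type costs before compression, the following sketch computes the raw size of a dense tensor from its shape. It does not call the haagenti API; `BITS_PER_ELEMENT` and `raw_nbytes` are hypothetical names, and packing two I4 values per byte is an assumption about storage layout.

```python
import math

# Bit widths of the listed data types.
BITS_PER_ELEMENT = {"F32": 32, "F16": 16, "BF16": 16, "I8": 8, "I4": 4}


def raw_nbytes(shape: tuple[int, ...], dtype: str) -> int:
    """Uncompressed size in bytes of a dense tensor.

    Assumes I4 values are packed two per byte, rounding up to a whole byte.
    """
    count = math.prod(shape)
    total_bits = count * BITS_PER_ELEMENT[dtype]
    return (total_bits + 7) // 8
```

For example, a 4096 x 4096 weight matrix takes 64 MiB as F32, 32 MiB as F16 or BF16, 16 MiB as I8, and 8 MiB as I4 under the packing assumption above.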
Roadmap
- Phase 1: INT8/INT4 quantization support
- Phase 2: Smart component offloading
- Phase 3: Python bindings (this crate)
- Phase 4: HoloTensor progressive loading
- Phase 5: GPU decompression kernels
License
MIT