tritter-accel
Rust acceleration library for Tritter - providing Python bindings for high-performance BitNet, ternary, and VSA operations.
Overview
tritter-accel provides Python bindings via PyO3 for the following capabilities:
- BitNet Ternary Operations: Efficient ternary weight packing and matmul
- VSA Gradient Compression: Vector Symbolic Architecture-based gradient compression
- Packed Weight Storage: 2-bit per trit storage with SIMD acceleration
This crate is designed to be used as a Python extension module, built with maturin.
Installation
From Source (Development)
Assuming a Rust toolchain and maturin are installed, running `maturin develop --release` from the crate root builds the extension and installs it into the active Python environment.
Building a Wheel
Running `maturin build --release` builds a release wheel, written to `target/wheels/` by default, which can then be installed with pip.
Usage
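from tritter_accel import (
    quantize_weights_absmean,
    pack_ternary_weights,
    ternary_matmul,
    compress_gradients_vsa,
    decompress_gradients_vsa,
)
# Inputs (weights, input, shape, gradients, ratio, seed, grad_shape) are supplied by the caller.
# Result names and return arities are illustrative, inferred from the API Reference.
# Quantize float weights to ternary {-1, 0, +1}
ternary_weights, scales = quantize_weights_absmean(weights)
# Pack for efficient storage (2 bits per trit)
packed, scales = pack_ternary_weights(ternary_weights, scales)
# Efficient matmul with packed weights
output = ternary_matmul(input, packed, scales, shape)
# VSA gradient compression for distributed training
compressed = compress_gradients_vsa(gradients, ratio, seed)
decompressed = decompress_gradients_vsa(compressed, grad_shape, seed)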
API Reference
Weight Quantization
| Function | Description |
|---|---|
| `quantize_weights_absmean(weights)` | Quantize float weights to ternary using AbsMean scaling |
| `pack_ternary_weights(weights, scales)` | Pack ternary weights into 2-bit representation |
| `unpack_ternary_weights(packed, scales, shape)` | Unpack ternary weights to float |
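To make the AbsMean step concrete, here is a minimal pure-Python sketch of the scheme described for BitNet b1.58: divide each weight by the mean absolute value, round, and clip to {-1, 0, +1}. The function name `quantize_absmean`, the per-row scale, and the `eps` guard are assumptions for illustration; this is not the crate's implementation and may differ from `quantize_weights_absmean` in layout and return types.

```python
def quantize_absmean(rows, eps=1e-8):
    """Quantize each row to ternary values with one AbsMean scale per row.

    Hypothetical reference helper, not part of tritter_accel.
    """
    ternary, scales = [], []
    for row in rows:
        # AbsMean scale: mean of the absolute weight values in the row.
        scale = sum(abs(w) for w in row) / len(row)
        scales.append(scale)
        # Scale, round to the nearest integer, then clip into {-1, 0, +1}.
        ternary.append([max(-1, min(1, round(w / (scale + eps)))) for w in row])
    return ternary, scales


def dequantize(ternary, scales):
    """Approximate the original floats as trit * row scale."""
    return [[t * s for t in row] for row, s in zip(ternary, scales)]


weights = [[0.9, -0.05, -1.2, 0.3], [0.0, 0.0, 0.0, 0.0]]
ternary, scales = quantize_absmean(weights)
print(ternary)
print(scales)
```

Small weights collapse to 0 and large ones saturate at ±1, so the scale carries all magnitude information for the row.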
Ternary Operations
| Function | Description |
|---|---|
| `ternary_matmul(input, packed, scales, shape)` | Matrix multiply with packed ternary weights |
| `ternary_matmul_simple(input, weights)` | Simple matmul with float ternary weights |
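Because every weight is -1, 0, or +1, the inner loop of a ternary matmul needs only additions and subtractions. The sketch below illustrates that idea in plain Python. The name `ternary_matmul_reference`, the `(out_features, in_features)` weight layout, and the per-row scale are assumptions; the crate's packed, SIMD-accelerated kernels will look different.

```python
def ternary_matmul_reference(inputs, ternary, scales):
    """Compute inputs @ (ternary * scales).T without inner-loop multiplies.

    Hypothetical reference helper: `ternary` has one row per output feature,
    and `scales` holds one float per row.
    """
    out = []
    for x in inputs:
        row_out = []
        for t_row, scale in zip(ternary, scales):
            acc = 0.0
            for xi, t in zip(x, t_row):
                if t == 1:
                    acc += xi      # +1 weight: add the activation
                elif t == -1:
                    acc -= xi      # -1 weight: subtract the activation
                # 0 weight: skip entirely
            row_out.append(acc * scale)  # one multiply per output element
        out.append(row_out)
    return out


result = ternary_matmul_reference(
    [[1.0, 2.0, 3.0]],
    [[1, 0, -1], [-1, -1, 1]],
    [0.5, 2.0],
)
print(result)
```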
VSA Gradient Compression
| Function | Description |
|---|---|
| `compress_gradients_vsa(gradients, ratio, seed)` | Compress gradients using VSA |
| `decompress_gradients_vsa(compressed, shape, seed)` | Decompress gradients from VSA |
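The shared `seed` argument suggests that both sides regenerate the same random structure rather than transmitting it. As a rough illustration of that idea only, the sketch below uses a count-sketch style scheme: each gradient entry is bound to a seeded random sign and bundled (summed) into one of a smaller number of buckets. The function names, the interpretation of `ratio` as the fraction of entries kept, and the algorithm itself are assumptions; the crate's actual VSA method is not specified here and may differ substantially.

```python
import random


def _assignments(n, k, seed):
    """Regenerate the per-entry (bucket, sign) pairs from the seed."""
    rng = random.Random(seed)
    return [(rng.randrange(k), rng.choice((-1.0, 1.0))) for _ in range(n)]


def compress_sketch(gradients, ratio, seed):
    """Bundle n gradient entries into about n * ratio buckets (hypothetical)."""
    n = len(gradients)
    k = max(1, int(n * ratio))
    compressed = [0.0] * k
    for g, (bucket, sign) in zip(gradients, _assignments(n, k, seed)):
        compressed[bucket] += sign * g  # bind with a sign, bundle by summing
    return compressed


def decompress_sketch(compressed, n, seed):
    """Unbind each entry from its bucket; the result is a noisy estimate."""
    k = len(compressed)
    return [sign * compressed[bucket] for bucket, sign in _assignments(n, k, seed)]


gradients = [2.0 ** i for i in range(16)]
compressed = compress_sketch(gradients, 0.25, seed=7)
recovered = decompress_sketch(compressed, len(gradients), seed=7)
print(len(compressed), len(recovered))
```

Reconstruction is lossy: entries that share a bucket interfere with each other, which is the trade-off behind any fixed compression ratio of this kind.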
Dependencies
This crate uses the following sister crates:
- `bitnet-quantize` - BitNet b1.58 quantization
- `trit-vsa` - Balanced ternary VSA operations
- `vsa-optim-rs` - Gradient optimization
Architecture
┌─────────────────────────────────────────────────────────┐
│ Python Interface │
│ (tritter_accel module) │
├─────────────────────────────────────────────────────────┤
│ Exposed Functions │
│ - pack_ternary_weights │
│ - unpack_ternary_weights │
│ - ternary_matmul │
│ - quantize_weights_absmean │
│ - compress_gradients_vsa │
│ - decompress_gradients_vsa │
├─────────────────────────────────────────────────────────┤
│ Rust Implementation │
│ Core Crates: bitnet-quantize, trit-vsa, vsa-optim-rs │
│ - AbsMean quant | PackedTritVec + VSA ops │
│ - INT8 activs | Compression + Prediction │
└─────────────────────────────────────────────────────────┘
Performance
- Packing: 16x memory reduction (32-bit float → 2-bit trit), not counting the stored scales
- Matmul: 2-4x speedup via addition-only arithmetic
- VSA Compression: 10-100x gradient compression with <5% accuracy loss
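The packing figure follows from simple arithmetic: a 32-bit float occupies 4 bytes, while four 2-bit trits fit in a single byte. The sketch below shows one possible 2-bit encoding and checks the size ratio. The helper names and the code assignment (-1, 0, +1 mapped to 0, 1, 2, least-significant bits first) are assumptions for illustration and may not match the crate's on-disk format.

```python
def pack_trits(trits):
    """Pack trits in {-1, 0, +1} into bytes, four 2-bit codes per byte."""
    packed = bytearray((len(trits) + 3) // 4)
    for i, t in enumerate(trits):
        code = t + 1  # map -1, 0, +1 to 0, 1, 2
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_trits(packed, count):
    """Recover the first `count` trits from the packed bytes."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(count)]


trits = [1, 0, -1, 0, -1, -1, 1, 1]
packed = pack_trits(trits)
float32_bytes = 4 * len(trits)  # the same values stored as 32-bit floats
print(len(packed), float32_bytes // len(packed))
```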
License
MIT License - see LICENSE-MIT
Sister Crates
| Crate | Description |
|---|---|
| trit-vsa | Balanced ternary arithmetic |
| bitnet-quantize | BitNet b1.58 quantization |
| vsa-optim-rs | VSA gradient optimization |
| peft-rs | PEFT adapters |
| qlora-rs | QLoRA implementation |
| axolotl-rs | LLM fine-tuning toolkit |