tritter-accel 0.1.2

Rust acceleration library for Tritter - providing Python bindings for high-performance BitNet, ternary, and VSA operations.

License: MIT

Overview

tritter-accel provides Python bindings via PyO3 for the following capabilities:

  • BitNet Ternary Operations: Efficient ternary weight packing and matmul
  • VSA Gradient Compression: Vector Symbolic Architecture-based gradient compression
  • Packed Weight Storage: 2-bit per trit storage with SIMD acceleration

This crate is designed to be used as a Python extension module, built with maturin.
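To make the 2-bit-per-trit layout concrete, here is a minimal pure-Python sketch packing four trits into each byte. The encoding chosen here (-1 → 0b10, 0 → 0b00, +1 → 0b01) is hypothetical; the crate's actual bit layout may differ.

```python
# Hypothetical 2-bit encoding; illustrates the idea, not the crate's real layout.
ENCODE = {-1: 0b10, 0: 0b00, 1: 0b01}
DECODE = {v: k for k, v in ENCODE.items()}

def pack_trits(trits):
    """Pack a list of trits (length a multiple of 4) into bytes, 4 trits per byte."""
    out = bytearray()
    for i in range(0, len(trits), 4):
        byte = 0
        for j, t in enumerate(trits[i:i + 4]):
            byte |= ENCODE[t] << (2 * j)
        out.append(byte)
    return bytes(out)

def unpack_trits(packed, n):
    """Inverse of pack_trits, recovering the first n trits."""
    trits = []
    for byte in packed:
        for j in range(4):
            trits.append(DECODE[(byte >> (2 * j)) & 0b11])
    return trits[:n]

weights = [1, 0, -1, 0, -1, 1, 1, 0]
packed = pack_trits(weights)
assert unpack_trits(packed, len(weights)) == weights
assert len(packed) == 2  # 8 trits fit in 2 bytes
```

Round-tripping through pack/unpack is lossless; the savings come purely from using 2 bits instead of 32 per weight.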

Installation

From Source (Development)

cd rust-ai/tritter-accel
pip install maturin
maturin develop --release

Building a Wheel

maturin build --release
pip install target/wheels/tritter_accel-*.whl

Usage

import numpy as np

from tritter_accel import (
    pack_ternary_weights,
    unpack_ternary_weights,
    ternary_matmul,
    quantize_weights_absmean,
    compress_gradients_vsa,
    decompress_gradients_vsa,
)

# Example inputs (shapes are illustrative)
float_weights = np.random.randn(256, 512).astype(np.float32)
original_shape = float_weights.shape
x = np.random.randn(1, 512).astype(np.float32)
gradients = np.random.randn(256, 512).astype(np.float32)

# Quantize float weights to ternary {-1, 0, +1}
ternary_weights, scales = quantize_weights_absmean(float_weights)

# Pack for efficient storage (2 bits per trit)
packed, scales = pack_ternary_weights(ternary_weights, scales)

# Efficient matmul with packed weights
output = ternary_matmul(x, packed, scales, original_shape)

# VSA gradient compression for distributed training
compressed = compress_gradients_vsa(gradients, compression_ratio=0.1)
recovered = decompress_gradients_vsa(compressed, original_shape)

API Reference

Weight Quantization

  • quantize_weights_absmean(weights) - Quantize float weights to ternary using AbsMean scaling
  • pack_ternary_weights(weights, scales) - Pack ternary weights into a 2-bit representation
  • unpack_ternary_weights(packed, scales, shape) - Unpack ternary weights back to float
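AbsMean quantization (as described for BitNet b1.58) scales weights by the mean absolute value, then rounds into {-1, 0, +1}. A numpy sketch of the idea; the crate may apply scales per row or per tensor, so treat this as illustrative:

```python
import numpy as np

def quantize_absmean(weights: np.ndarray):
    """AbsMean ternary quantization sketch:
    scale = mean(|W|); quantized = clip(round(W / scale), -1, +1)."""
    scale = np.abs(weights).mean()
    scale = max(scale, 1e-8)  # guard against an all-zero weight tensor
    ternary = np.clip(np.round(weights / scale), -1, 1).astype(np.int8)
    return ternary, scale

W = np.array([[0.4, -0.9, 0.05], [1.2, 0.0, -0.3]], dtype=np.float32)
ternary, scale = quantize_absmean(W)
# Dequantized approximation of W is simply ternary * scale
W_hat = ternary.astype(np.float32) * scale
```

Large weights saturate to ±1 and small ones snap to 0; the single scale factor keeps the dequantized magnitudes in the right range.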

Ternary Operations

  • ternary_matmul(input, packed, scales, shape) - Matrix multiply with packed ternary weights
  • ternary_matmul_simple(input, weights) - Simple matmul with float ternary weights
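With weights restricted to {-1, 0, +1}, each output element reduces to a sum of the inputs at +1 positions minus a sum at -1 positions, times the scale; no weight multiplications are needed. A numpy reference sketch (not the crate's SIMD kernel):

```python
import numpy as np

def ternary_matmul_ref(x: np.ndarray, ternary: np.ndarray, scale: float):
    """Compute y = x @ (ternary * scale).T using only adds/subtracts:
    sum inputs where the trit is +1, subtract where it is -1, then scale."""
    pos = (ternary == 1).astype(x.dtype)
    neg = (ternary == -1).astype(x.dtype)
    return (x @ pos.T - x @ neg.T) * scale

x = np.array([[1.0, 2.0, 3.0]])
W = np.array([[1, 0, -1], [-1, 1, 0]], dtype=np.int8)  # ternary weights
y = ternary_matmul_ref(x, W, scale=0.5)
# Matches the dense computation x @ (W * scale).T
assert np.allclose(y, x @ (W.astype(np.float64) * 0.5).T)
```

Replacing multiplications with sign-gated additions is what enables the speedups claimed for ternary kernels.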

VSA Gradient Compression

  • compress_gradients_vsa(gradients, ratio, seed) - Compress gradients using VSA
  • decompress_gradients_vsa(compressed, shape, seed) - Decompress gradients from VSA
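The crate's exact VSA encoding is not detailed here. As an illustration of the ratio/seed interface only, the following sketch uses a seeded random projection, where compressor and decompressor re-derive the same basis from a shared seed; the real VSA scheme will differ:

```python
import numpy as np

def compress_ref(grad: np.ndarray, ratio: float, seed: int = 0):
    """Project the flattened gradient onto a smaller seeded random basis.
    Illustrates the ratio/seed interface, not the crate's VSA encoding."""
    flat = grad.ravel()
    k = max(1, int(len(flat) * ratio))
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((k, len(flat))) / np.sqrt(k)
    return proj @ flat

def decompress_ref(compressed: np.ndarray, shape, seed: int = 0):
    """Approximate inverse: rebuild the same projection from the seed
    and apply its transpose."""
    n = int(np.prod(shape))
    k = len(compressed)
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((k, n)) / np.sqrt(k)
    return (proj.T @ compressed).reshape(shape)

g = np.random.default_rng(1).standard_normal((4, 8))
c = compress_ref(g, ratio=0.25, seed=3)
g_hat = decompress_ref(c, g.shape, seed=3)
assert c.shape == (8,) and g_hat.shape == g.shape
```

The key property shared with the real API: only the compressed vector travels over the wire, and both ends reconstruct the projection deterministically from the seed.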

Dependencies

This crate uses the following sister crates:

  • bitnet-quantize - BitNet b1.58 quantization
  • trit-vsa - Balanced ternary VSA operations
  • vsa-optim-rs - Gradient optimization

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Python Interface                      │
│              (tritter_accel module)                      │
├─────────────────────────────────────────────────────────┤
│  Exposed Functions                                      │
│  - pack_ternary_weights                                 │
│  - unpack_ternary_weights                               │
│  - ternary_matmul                                       │
│  - quantize_weights_absmean                             │
│  - compress_gradients_vsa                               │
│  - decompress_gradients_vsa                             │
├─────────────────────────────────────────────────────────┤
│                   Rust Implementation                    │
│  Core Crates: bitnet-quantize, trit-vsa, vsa-optim-rs    │
│  - AbsMean quant  |  PackedTritVec + VSA ops             │
│  - INT8 activs    |  Compression + Prediction            │
└─────────────────────────────────────────────────────────┘

Performance

  • Packing: 16x memory reduction (32-bit float → 2-bit trit)
  • Matmul: 2-4x speedup via addition-only arithmetic
  • VSA Compression: 10-100x gradient compression with <5% accuracy loss
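A back-of-envelope calculation of the packing savings for an illustrative 7-billion-parameter model (not a measured benchmark):

```python
# Illustrative arithmetic: 32-bit floats vs. 2 bits per trit
params = 7_000_000_000
fp32_gb = params * 4 / 2**30        # float32: 4 bytes/param -> ~26.1 GiB
packed_gb = params * 2 / 8 / 2**30  # packed:  2 bits/param  -> ~1.6 GiB
assert fp32_gb / packed_gb == 16    # 16x smaller than float32
```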

License

MIT License - see LICENSE-MIT

Sister Crates

  • trit-vsa - Balanced ternary arithmetic
  • bitnet-quantize - BitNet b1.58 quantization
  • vsa-optim-rs - VSA gradient optimization
  • peft-rs - PEFT adapters
  • qlora-rs - QLoRA implementation
  • axolotl-rs - LLM fine-tuning toolkit