ruvector-temporal-tensor

Shrink your vector data 4-10x without losing the signal.

ruvector-temporal-tensor compresses streams of floating-point tensors by exploiting two properties that most vector workloads share:

Values within a group are similar — so a single scale factor per group captures the range, and a small integer code captures the value. This is groupwise symmetric quantization.
Consecutive frames barely change — so the same scale factors can be reused across many frames until the data drifts. This is temporal segment reuse.

The crate automatically picks the right bit-width based on how "hot" (frequently accessed) the tensor is, giving you aggressive compression on cold data while preserving accuracy on hot data.

Zero external dependencies. Compiles to WASM. Ships with a C FFI.

How It Works

f32 frame ──► tier policy ──► quantizer ──► bitpack ──► segment blob
                  │
         "How hot is this tensor?"
          Hot  → 8-bit (near-lossless)
          Warm → 7 or 5-bit
          Cold → 3-bit (10x smaller)

Each frame of f32 values is divided into fixed-size groups (default 64). Per group, the compressor computes a single scale factor (max_abs / qmax) and maps every value to a signed integer code. Codes are packed into a tight bitstream with no byte-alignment waste.
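
The per-group step described above can be sketched in plain Rust. This is an illustration only, not the crate's `quantizer` module: the function names are hypothetical, and it assumes `qmax = 2^(bits-1) - 1` (127 for 8-bit), which is the usual choice for symmetric quantization but is not stated in this README.

```rust
/// Quantize one group symmetrically and return (scale, codes).
/// Assumption: qmax = 2^(bits-1) - 1, e.g. 127 for 8-bit codes.
fn quantize_group(values: &[f32], bits: u32) -> (f32, Vec<i32>) {
    let qmax = ((1i32 << (bits - 1)) - 1) as f32;
    let max_abs = values.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
    // An all-zero group gets scale 0 and all-zero codes (avoids dividing by 0).
    if max_abs == 0.0 {
        return (0.0, vec![0; values.len()]);
    }
    let scale = max_abs / qmax;
    let codes = values
        .iter()
        .map(|&v| (v / scale).round().clamp(-qmax, qmax) as i32)
        .collect();
    (scale, codes)
}

/// Reverse the mapping: each value is approximated by code * scale.
fn dequantize_group(scale: f32, codes: &[i32]) -> Vec<f32> {
    codes.iter().map(|&c| c as f32 * scale).collect()
}

fn main() {
    let group = [0.5f32, -1.0, 0.25, 0.0];
    let (scale, codes) = quantize_group(&group, 8);
    let restored = dequantize_group(scale, &codes);
    println!("scale = {scale}, codes = {codes:?}, restored = {restored:?}");
}
```

Under these assumptions the reconstruction error of any value is at most half a scale step, which is why fewer bits (a larger step) trade accuracy for size.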

When the next frame arrives, the compressor checks whether the existing scale factors still cover the new data (within a configurable drift tolerance). If they do, the frame is appended to the current segment — reusing the same scales. If they don't, the segment is finalized and a new one starts.
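
One plausible reading of that check, as a sketch: a frame fits if every group's peak magnitude stays within the range the stored scale covers, plus a Q8 headroom of `drift_q8 / 256`. The helper name, its signature, and the exact comparison are assumptions made for illustration; the crate's actual rule may differ.

```rust
/// Hypothetical helper (not the crate's API): returns true when every group
/// of `frame` still fits its stored scale, allowing `drift_q8 / 256` headroom.
fn scales_still_fit(
    frame: &[f32],
    scales: &[f32],
    group_len: usize,
    qmax: f32,
    drift_q8: u32,
) -> bool {
    let groups = (frame.len() + group_len - 1) / group_len;
    if groups != scales.len() {
        return false; // shape changed, so the scales cannot be reused
    }
    // 26 in Q8 is 26/256, roughly 10% headroom.
    let factor = (256 + drift_q8) as f32 / 256.0;
    frame.chunks(group_len).zip(scales).all(|(group, &scale)| {
        let max_abs = group.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
        max_abs <= scale * qmax * factor
    })
}

fn main() {
    // Scales computed from an earlier frame whose peak was 1.0 (8-bit, qmax = 127).
    let scales = [1.0f32 / 127.0, 1.0 / 127.0];
    let small_drift = [1.05f32, -0.2, 0.9, 0.1];
    let large_drift = [1.2f32, -0.2, 0.9, 0.1];
    println!("small drift reuses scales: {}", scales_still_fit(&small_drift, &scales, 2, 127.0, 26));
    println!("large drift reuses scales: {}", scales_still_fit(&large_drift, &scales, 2, 127.0, 26));
}
```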

Segments are self-contained binary blobs with a 22-byte header, the f16-encoded scales, and the packed data. They can be decoded independently, or you can random-access a single frame by index.

Compression Ratios

Tier	Bits	Ratio vs f32	Typical Error	When Used
Hot	8	~4x	< 0.5%	Frequently accessed tensors
Warm	7	~4.6x	< 1%	Moderate access patterns
Warm	5	~6.4x	< 3%	Aggressively compressed warm data
Cold	3	~10.7x	< 15%	Rarely accessed / archival

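The ratios above are payload-only figures (32 / bits). Each segment also stores a 22-byte header and one f16 scale per group; temporal reuse amortizes that overhead across all frames in a segment, so long segments approach the listed figures.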
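
The amortization can be checked with a little arithmetic based on the layout in the Segment Binary Format section. The sketch below assumes one set of scales per segment and codes packed contiguously across the whole segment, rounded up to a byte; the function name is hypothetical.

```rust
/// Estimated raw-to-compressed ratio for one segment.
/// Assumptions: 22-byte header, one f16 scale per group, a 4-byte data
/// length field, and codes packed contiguously across all frames.
fn segment_ratio(bits: usize, group_len: usize, tensor_len: usize, frames: usize) -> f64 {
    let raw_bytes = frames * tensor_len * 4; // f32 input
    let groups = (tensor_len + group_len - 1) / group_len;
    let code_bytes = (frames * tensor_len * bits + 7) / 8;
    let segment_bytes = 22 + 2 * groups + 4 + code_bytes;
    raw_bytes as f64 / segment_bytes as f64
}

fn main() {
    for frames in [1, 10, 100] {
        let r8 = segment_ratio(8, 64, 512, frames);
        let r3 = segment_ratio(3, 64, 512, frames);
        println!("{frames:>3} frames: 8-bit {r8:.3}x, 3-bit {r3:.3}x");
    }
}
```

With these assumptions, a single 512-element frame at 8 bits occupies 554 bytes against 2048 raw, and the ratio climbs toward 4x as more frames share the same header and scales.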

Quick Start

Add to your Cargo.toml:

[dependencies]
ruvector-temporal-tensor = "2.0"

Compress and decompress

use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

// 1. Create a compressor for 128-element tensors
let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 128, 0);
comp.set_access(100, 0); // mark as hot → 8-bit quantization

let frame = vec![1.0f32; 128];
let mut segment = Vec::new();

// 2. Push frames — segment stays empty until a boundary is crossed
comp.push_frame(&frame, 1, &mut segment);

// 3. Force-emit the current segment
comp.flush(&mut segment);

// 4. Decode back to f32
let mut decoded = Vec::new();
ruvector_temporal_tensor::segment::decode(&segment, &mut decoded);
assert_eq!(decoded.len(), 128);

Stream many frames

use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 512, 0);
comp.set_access(100, 0);

let mut segments: Vec<Vec<u8>> = Vec::new();
let mut seg = Vec::new();

for t in 0..1000 {
    let frame: Vec<f32> = (0..512).map(|i| ((i + t) as f32 * 0.01).sin()).collect();
    comp.push_frame(&frame, t as u32, &mut seg);
    if !seg.is_empty() {
        segments.push(seg.clone());
    }
}
comp.flush(&mut seg);
if !seg.is_empty() {
    segments.push(seg);
}

Random-access a single frame

use ruvector_temporal_tensor::segment;
use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

// Setup: build a small one-frame segment to read back
let mut comp = TemporalTensorCompressor::new(TierPolicy::default(), 64, 0);
let mut seg = Vec::new();
comp.push_frame(&vec![1.0f32; 64], 0, &mut seg);
comp.flush(&mut seg);

// Decode only frame 0 — skips all other frames in the segment
let values = segment::decode_single_frame(&seg, 0).unwrap();
assert_eq!(values.len(), 64);

// Check compression ratio
let ratio = segment::compression_ratio(&seg);
assert!(ratio > 1.0);

Custom tier policy

use ruvector_temporal_tensor::{TemporalTensorCompressor, TierPolicy};

let policy = TierPolicy {
    hot_min_score: 512,   // score threshold for 8-bit
    warm_min_score: 64,   // score threshold for warm tier
    warm_bits: 5,         // use 5-bit instead of default 7 for warm
    drift_pct_q8: 26,     // ~10% drift tolerance (Q8 fixed-point)
    group_len: 32,        // smaller groups = more scales, tighter fit
};

let mut comp = TemporalTensorCompressor::new(policy, 256, 0);
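
How the thresholds might translate into a bit-width can be sketched as follows, using the score formula given under Design Decisions (access_count * 1024 / age). The function is hypothetical and standalone. It assumes that age is the time since the last recorded access (clamped to at least 1), that thresholds are inclusive, and that hot and cold map to 8 and 3 bits as in the Compression Ratios table.

```rust
/// Hypothetical tier selection, not the crate's `tier_policy` module.
/// score = access_count * 1024 / age, with age clamped to >= 1 (assumption).
fn bits_for(
    access_count: u32,
    last_access_ts: u32,
    now_ts: u32,
    hot_min_score: u32,
    warm_min_score: u32,
    warm_bits: u8,
) -> u8 {
    let age = now_ts.saturating_sub(last_access_ts).max(1) as u64;
    let score = access_count as u64 * 1024 / age;
    if score >= hot_min_score as u64 {
        8 // hot
    } else if score >= warm_min_score as u64 {
        warm_bits // warm: 7 or 5
    } else {
        3 // cold
    }
}

fn main() {
    // Thresholds taken from the custom policy example: 512 / 64, warm_bits = 5.
    println!("busy tensor: {} bits", bits_for(100, 0, 1, 512, 64, 5));
    println!("aging tensor: {} bits", bits_for(1, 0, 8, 512, 64, 5));
    println!("idle tensor: {} bits", bits_for(1, 0, 100, 512, 64, 5));
}
```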

Feature Flags

[dependencies]
ruvector-temporal-tensor = { version = "2.0", features = ["ffi"] }

Feature	Default	Description
`ffi`	off	Enable `extern "C"` exports for WASM and C interop
`simd`	off	Reserved for future SIMD-accelerated quantization

API Reference

Core Types

Type	Description
`TemporalTensorCompressor`	Main entry point — push frames, get segments
`TierPolicy`	Controls bit-width selection and drift tolerance

Compressor Methods

Method	Description
`new(policy, len, now_ts)`	Create a compressor for tensors of `len` elements
`push_frame(frame, now_ts, out)`	Compress a frame; emits a segment on boundary crossings
`flush(out)`	Force-emit the current segment
`touch(now_ts)`	Record an access event (increments count + updates timestamp)
`set_access(count, ts)`	Set access stats directly (for restoring state)
`active_bits()`	Current quantization bit-width
`active_frame_count()`	Frames buffered in the current segment
`len()` / `is_empty()`	Tensor length in elements / whether that length is zero

Segment Functions

Function	Description
`segment::decode(bytes, out)`	Decode all frames from a segment
`segment::decode_single_frame(bytes, idx)`	Decode one frame by index
`segment::parse_header(bytes)`	Read segment metadata without decoding
`segment::compression_ratio(bytes)`	Compute raw-to-compressed ratio
`segment::encode(...)`	Low-level segment encoder (used internally)

Low-Level Modules

Module	Description
`quantizer`	Groupwise symmetric quantization and dequantization
`bitpack`	Arbitrary-width bitstream packer and unpacker
`f16`	Software IEEE 754 half-precision conversion
`tier_policy`	Access-pattern scoring and bit-width selection

Segment Binary Format

Segments are self-contained, portable, and version-tagged:

Offset  Size  Field
──────  ────  ─────────────────
0       4     Magic: 0x43545154 ("TQTC" as little-endian bytes)
4       1     Version (currently 1)
5       1     Bits per code (3, 5, 7, or 8)
6       4     Group length
10      4     Tensor length (elements per frame)
14      4     Frame count
18      4     Scale count (S)
22      2*S   Scales (f16, little-endian)
22+2S   4     Data length (D)
26+2S   D     Packed quantization codes
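
For readers who want to inspect a blob without the crate, the fixed 22-byte header can be read with a few lines of standard-library Rust. This is a sketch that assumes every multi-byte integer is little-endian (the table only states this for the scales); `parse_header_sketch` and `Header` are illustrative names, distinct from the crate's own `segment::parse_header`.

```rust
#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
    bits: u8,
    group_len: u32,
    tensor_len: u32,
    frame_count: u32,
    scale_count: u32,
}

/// Read a little-endian u32 at `offset` (caller guarantees bounds).
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Parse the fixed header; returns None on short input or a wrong magic.
/// Assumption: integer fields are little-endian.
fn parse_header_sketch(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < 22 || read_u32_le(bytes, 0) != 0x4354_5154 {
        return None;
    }
    Some(Header {
        version: bytes[4],
        bits: bytes[5],
        group_len: read_u32_le(bytes, 6),
        tensor_len: read_u32_le(bytes, 10),
        frame_count: read_u32_le(bytes, 14),
        scale_count: read_u32_le(bytes, 18),
    })
}

fn main() {
    // Hand-built header: 8-bit codes, groups of 64, 128 elements, 1 frame, 2 scales.
    let mut blob = Vec::new();
    blob.extend_from_slice(&0x4354_5154u32.to_le_bytes());
    blob.extend_from_slice(&[1, 8]);
    for field in [64u32, 128, 1, 2] {
        blob.extend_from_slice(&field.to_le_bytes());
    }
    println!("{:?}", parse_header_sketch(&blob));
}
```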

FFI / WASM Usage

Enable the ffi feature and compile with --target wasm32-unknown-unknown:

cargo build --release --target wasm32-unknown-unknown --features ffi

Exported C functions:

Function	Description
`ttc_create(len, now_ts, out_handle)`	Create compressor, get handle
`ttc_create_with_policy(...)`	Create with custom tier policy
`ttc_free(handle)`	Free a compressor
`ttc_touch(handle, now_ts)`	Record access
`ttc_set_access(handle, count, ts)`	Set access stats
`ttc_push_frame(handle, ts, in, len, out, cap, written)`	Compress a frame
`ttc_flush(handle, out, cap, written)`	Flush current segment
`ttc_decode_segment(seg, len, out, cap, written)`	Decode a segment
`ttc_alloc(size, out_ptr)`	Allocate WASM linear memory
`ttc_dealloc(ptr, cap)`	Free allocated memory

Design Decisions

See ADR-017 for the full architecture decision record, including SOTA survey, compression math, safety analysis, and integration guidance.

Key decisions:

Groupwise symmetric (no zero-point): simpler, faster, and well suited to normally distributed embeddings
f16 scales: 2 bytes per group vs 4 for f32, with negligible accuracy loss
64-bit bitstream accumulator: handles any sub-byte width without byte-alignment waste
Score-based tiering: access_count * 1024 / age balances recency and frequency
~10% drift tolerance: configurable as a Q8 fixed-point value (default 26/256, about 10.2%)
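
To make the accumulator idea concrete, here is a minimal sketch of sub-byte packing through a 64-bit accumulator. It is not the crate's `bitpack` module: the function names are hypothetical, bits are written least-significant first (an assumption), and codes are taken as unsigned, so signed codes would need to be biased by the caller.

```rust
/// Pack `bits`-wide unsigned codes (1..=8 bits) into bytes, LSB first.
fn pack_codes(codes: &[u32], bits: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity((codes.len() * bits as usize + 7) / 8);
    let mask = (1u64 << bits) - 1;
    let mut acc: u64 = 0; // pending bits
    let mut acc_bits: u32 = 0; // number of valid bits in `acc`
    for &code in codes {
        acc |= (code as u64 & mask) << acc_bits;
        acc_bits += bits;
        // Drain whole bytes as soon as they are available.
        while acc_bits >= 8 {
            out.push(acc as u8);
            acc >>= 8;
            acc_bits -= 8;
        }
    }
    if acc_bits > 0 {
        out.push(acc as u8); // final partial byte, zero-padded
    }
    out
}

/// Unpack `count` codes; stops early if the input is truncated.
fn unpack_codes(bytes: &[u8], bits: u32, count: usize) -> Vec<u32> {
    let mut out = Vec::with_capacity(count);
    let mask = (1u64 << bits) - 1;
    let mut acc: u64 = 0;
    let mut acc_bits: u32 = 0;
    let mut input = bytes.iter();
    while out.len() < count {
        while acc_bits < bits {
            match input.next() {
                Some(&b) => {
                    acc |= (b as u64) << acc_bits;
                    acc_bits += 8;
                }
                None => return out,
            }
        }
        out.push((acc & mask) as u32);
        acc >>= bits;
        acc_bits -= bits;
    }
    out
}

fn main() {
    let codes = [0u32, 1, 2, 3, 4, 5, 6, 7, 3];
    let packed = pack_codes(&codes, 3);
    println!("{} codes -> {} bytes", codes.len(), packed.len());
    println!("round trip ok: {}", unpack_codes(&packed, 3, codes.len()) == codes);
}
```

Nine 3-bit codes occupy 27 bits, so they fit in 4 bytes rather than the 9 a byte-per-code layout would need.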

Building and Testing

# Build
cargo build -p ruvector-temporal-tensor --release

# Run all tests (41 unit + 3 doc-tests)
cargo test -p ruvector-temporal-tensor

# Clippy
cargo clippy -p ruvector-temporal-tensor -- -W clippy::all

# Build WASM target
cargo build -p ruvector-temporal-tensor --release --target wasm32-unknown-unknown --features ffi

Related Crates

Crate	Relationship
ruvector-core	Parent vector database engine; temporal tensors integrate as a storage backend
ruvector-temporal-tensor-wasm	Thin WASM re-export wrapper

License

MIT License — see LICENSE for details.

Part of Ruvector

ruvector-temporal-tensor 2.0.6