Tensor-native compression library.
Exploits the mathematical structure of high-dimensional embeddings using Tensor Train decomposition, achieving 10-20x compression for vectors with 4096 or more dimensions.
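To make the compression figure concrete, here is a rough sketch of the storage arithmetic behind a Tensor Train. A vector reshaped into a tensor with mode sizes n_1..n_d and internal ranks r_1..r_{d-1} stores the sum of r_{k-1} * n_k * r_k values, with boundary ranks equal to 1. The helper names `tt_param_count` and `tt_compression_ratio` are hypothetical, and the shape and ranks used in the note below are illustrative assumptions rather than the crate's defaults.

```rust
/// Number of stored values in a Tensor Train with the given mode sizes and
/// internal ranks. Hypothetical helper for illustration; not part of the crate.
/// `ranks` holds the internal ranks r_1..r_{d-1}; the boundary ranks are 1.
fn tt_param_count(modes: &[usize], ranks: &[usize]) -> usize {
    assert_eq!(ranks.len() + 1, modes.len(), "need d-1 internal ranks for d modes");
    let mut total = 0;
    for (k, &n) in modes.iter().enumerate() {
        let left = if k == 0 { 1 } else { ranks[k - 1] };
        let right = if k == modes.len() - 1 { 1 } else { ranks[k] };
        // Core k has shape (left, n, right).
        total += left * n * right;
    }
    total
}

/// Ratio of the dense element count to the TT element count.
/// Ignores any per-core header or metadata overhead.
fn tt_compression_ratio(modes: &[usize], ranks: &[usize]) -> f64 {
    let dense: usize = modes.iter().product();
    dense as f64 / tt_param_count(modes, ranks) as f64
}
```

For a 4096-dimensional vector reshaped as 4x4x4x4x4x4 with every internal rank set to 4, `tt_param_count` gives 288 stored values, roughly 14x fewer than the 4096 dense values. The ratio actually achieved depends on the ranks needed to reach the desired accuracy.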
§Compression Methods
- Tensor Train (TT): Decomposes vectors into products of smaller tensors (recommended)
- Sparse: Native format for vectors with >50% zeros (stores only non-zeros)
- Delta + varint: Lossless compression for sorted ID sequences
- Run-length encoding: Lossless compression for repeated values
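The sparse method listed above can be sketched as follows: measure the fraction of zeros, and when it exceeds one half, keep only the non-zero entries with their positions. `SparseSketch`, `zero_fraction`, `to_sparse`, and `to_dense` are hypothetical names for this illustration; the crate's own sparse format and its `should_use_sparse` logic may differ.

```rust
/// Minimal sparse vector: dimension plus (index, value) pairs for non-zeros.
/// Illustrative type only; the crate's own sparse format may differ.
#[derive(Debug, PartialEq)]
struct SparseSketch {
    dim: usize,
    entries: Vec<(u32, f32)>,
}

/// Fraction of exactly-zero entries in `v` (0.0 for an empty slice).
fn zero_fraction(v: &[f32]) -> f64 {
    if v.is_empty() {
        return 0.0;
    }
    let zeros = v.iter().filter(|&&x| x == 0.0).count();
    zeros as f64 / v.len() as f64
}

/// Keep only the non-zero entries, remembering their positions.
/// Assumes the dimension fits in a u32 index.
fn to_sparse(v: &[f32]) -> SparseSketch {
    let entries = v
        .iter()
        .enumerate()
        .filter(|(_, x)| **x != 0.0)
        .map(|(i, &x)| (i as u32, x))
        .collect();
    SparseSketch { dim: v.len(), entries }
}

/// Rebuild the dense vector from the sparse form.
fn to_dense(s: &SparseSketch) -> Vec<f32> {
    let mut out = vec![0.0; s.dim];
    for &(i, x) in &s.entries {
        out[i as usize] = x;
    }
    out
}
```

With 4-byte indices and 4-byte values, each stored entry costs 8 bytes against 4 bytes per dense element, which is why a zero fraction above 50% is the natural break-even point under these assumptions.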
Re-exports§
pub use decompose::svd_truncated;
pub use decompose::DecomposeError;
pub use decompose::Matrix;
pub use decompose::SvdResult;
pub use decompose::TensorView;
pub use format::compress_dense_as_sparse;
pub use format::compress_sparse;
pub use format::should_use_sparse;
pub use format::should_use_sparse_threshold;
pub use format::sparse_storage_size;
pub use streaming_tt::convert_vectors_to_streaming_tt;
pub use streaming_tt::read_streaming_tt_all;
pub use streaming_tt::streaming_tt_similarity_search;
pub use streaming_tt::StreamingTTHeader;
pub use streaming_tt::StreamingTTReader;
pub use streaming_tt::StreamingTTWriter;
pub use streaming_tt::STREAMING_TT_MAGIC;
pub use streaming_tt::STREAMING_TT_VERSION;
pub use tensor_train::tt_cosine_similarity;
pub use tensor_train::tt_cosine_similarity_batch;
pub use tensor_train::tt_decompose;
pub use tensor_train::tt_decompose_batch;
pub use tensor_train::tt_dot_product;
pub use tensor_train::tt_dot_product_batch;
pub use tensor_train::tt_euclidean_distance;
pub use tensor_train::tt_euclidean_distance_batch;
pub use tensor_train::tt_norm;
pub use tensor_train::tt_reconstruct;
pub use tensor_train::tt_scale;
pub use tensor_train::TTConfig;
pub use tensor_train::TTCore;
pub use tensor_train::TTError;
pub use tensor_train::TTVector;
Modules§
- decompose
- Low-level matrix decomposition primitives for tensor operations.
- format
- Compressed snapshot format for tensor data.
- incremental
- Incremental (append-only) snapshot format.
- streaming
- Streaming compression for memory-bounded snapshot I/O.
- streaming_tt
- Streaming TT decomposition for memory-bounded I/O.
- tensor_train
- Tensor Train (TT) decomposition for high-dimensional embedding compression.
Structs§
- CompressionConfig
- Compression configuration for snapshots.
- CompressionDefaults
- Common embedding dimension constants.
- RleEncoded
- Run-length encoded data: pairs of (value, count).
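The (value, count) representation described for `RleEncoded` can be sketched as below. This is a minimal illustration under assumed signatures: `rle_encode_sketch` and `rle_decode_sketch` are hypothetical names, and the crate's actual `rle_encode`, `rle_decode`, and `RleEncoded` may use different types.

```rust
/// Run-length encode `data` into (value, count) pairs.
/// Illustrative sketch; the crate's own RLE types may differ.
fn rle_encode_sketch<T: PartialEq + Clone>(data: &[T]) -> Vec<(T, usize)> {
    let mut runs: Vec<(T, usize)> = Vec::new();
    for item in data {
        // Extend the current run when the value repeats.
        if let Some((value, count)) = runs.last_mut() {
            if *value == *item {
                *count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        runs.push((item.clone(), 1));
    }
    runs
}

/// Expand (value, count) pairs back into the original sequence.
fn rle_decode_sketch<T: Clone>(runs: &[(T, usize)]) -> Vec<T> {
    let mut out = Vec::new();
    for (value, count) in runs {
        out.extend(std::iter::repeat(value.clone()).take(*count));
    }
    out
}
```

Encoding the bytes 7, 7, 7, 2, 2, 9 yields the runs (7, 3), (2, 2), (9, 1), and decoding those runs restores the input exactly. RLE only pays off when runs are long; data with no repeats grows, since every value gains a count.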
Enums§
- TensorMode
- Tensor compression mode for vectors and embeddings.
Functions§
- compress_ids
- Combined delta + varint encoding for maximum compression of sorted IDs.
- decompress_ids
- Decompress delta + varint encoded IDs.
- delta_decode
- Decode delta-encoded IDs back to original sorted list.
- delta_encode
- Delta-encode a sorted list of IDs. Stores first value followed by differences between consecutive values.
- rle_decode
- Decode RLE back to original data.
- rle_encode
- RLE-encode a slice of values.
- varint_decode
- Decode variable-length encoded bytes back to u64 values.
- varint_encode
- Variable-length encode u64 values. Uses 7 bits per byte with high bit as continuation flag.
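The delta and varint steps described above compose naturally: delta encoding turns sorted IDs into small gaps, and the 7-bits-per-byte varint stores small numbers in few bytes. The following is a minimal sketch of that pipeline. All function names ending in `_sketch` are hypothetical, the least-significant-group-first byte order is an assumption, and the crate's real signatures and error handling may differ.

```rust
/// Delta-encode IDs: first value, then differences between neighbours.
/// Assumes sorted input; wrapping arithmetic avoids a panic otherwise.
fn delta_encode_sketch(ids: &[u64]) -> Vec<u64> {
    let mut prev = 0u64;
    ids.iter()
        .map(|&id| {
            let delta = id.wrapping_sub(prev);
            prev = id;
            delta
        })
        .collect()
}

/// Invert `delta_encode_sketch` with a running sum.
fn delta_decode_sketch(deltas: &[u64]) -> Vec<u64> {
    let mut acc = 0u64;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d);
            acc
        })
        .collect()
}

/// Varint-encode: 7 payload bits per byte, high bit set on every byte
/// except the last one of each value (low-order group first, an assumption).
fn varint_encode_sketch(values: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    for &value in values {
        let mut v = value;
        while v >= 0x80 {
            out.push(((v & 0x7F) as u8) | 0x80);
            v >>= 7;
        }
        out.push(v as u8);
    }
    out
}

/// Varint-decode; returns None on truncated or over-long input.
fn varint_decode_sketch(bytes: &[u8]) -> Option<Vec<u64>> {
    let mut out = Vec::new();
    let (mut value, mut shift) = (0u64, 0u32);
    for &b in bytes {
        if shift >= 64 {
            return None; // more continuation bytes than a u64 can hold
        }
        value |= u64::from(b & 0x7F) << shift;
        if (b & 0x80) == 0 {
            out.push(value);
            value = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    // A dangling continuation byte means the input was cut off.
    if shift == 0 { Some(out) } else { None }
}

/// Delta followed by varint, mirroring the combined encoding described above.
fn compress_ids_sketch(ids: &[u64]) -> Vec<u8> {
    varint_encode_sketch(&delta_encode_sketch(ids))
}

/// Varint decode followed by delta decode.
fn decompress_ids_sketch(bytes: &[u8]) -> Option<Vec<u64>> {
    varint_decode_sketch(bytes).map(|deltas| delta_decode_sketch(&deltas))
}
```

For the sorted IDs 1000, 1001, 1005, 1300, the deltas are 1000, 1, 4, 295, and the varint output is the 6 bytes `E8 07 01 04 A7 02`, compared with 32 bytes for four raw u64 values.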