Vq
Vq (v[ector] q[uantizer]) is a vector quantization library for Rust. It provides implementations of popular quantization algorithms, including binary quantization (BQ), scalar quantization (SQ), product quantization (PQ), and tree-structured vector quantization (TSVQ).
Vector quantization is a technique for reducing the size of high-dimensional vectors by approximating them with a smaller set of representative vectors. In applications such as data compression and nearest neighbor search, it reduces the memory footprint and speeds up search. For example, vector quantization can be used to reduce the size of data stored in a vector database or to speed up the response time of a RAG-based application. For more information about vector quantization, check out this article from the Weaviate documentation.
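As a rough illustration of "approximating with a representative vector", the sketch below maps a vector to the index of its closest entry in a small codebook. It is plain Rust and does not use Vq's API; the names `squared_distance` and `nearest_code` and the toy codebook are assumptions made for this example.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the index of the codebook entry closest to `v`.
/// Panics if the codebook is empty.
fn nearest_code(v: &[f32], codebook: &[Vec<f32>]) -> usize {
    codebook
        .iter()
        .enumerate()
        .map(|(i, c)| (i, squared_distance(v, c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
        .expect("codebook must not be empty")
}

fn main() {
    // Hypothetical codebook of three representative 2-D vectors.
    let codebook = vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![4.0, 4.0]];
    let v = [0.9, 1.2];
    let code = nearest_code(&v, &codebook);
    println!("{:?} is stored as code {} ({:?})", v, code, codebook[code]);
}
```

Storing only the index (plus one shared codebook) instead of the full vector is where the memory saving comes from.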
Features
- A simple and generic API for all quantizers
- Reduces the storage size of input vectors by at least 50% (2x)
- Good performance via SIMD acceleration (using Hsdlib), multi-threading, and zero-copying
- Support for multiple distances including Euclidean, cosine, and Manhattan distances
- Python 🐍 bindings via the PyVq package
See ROADMAP.md for the list of implemented and planned features.
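For reference, the three distances named in the feature list can be written in a few lines of plain Rust. This is a sketch and not Vq's implementation: the function names are hypothetical, cosine distance is assumed to mean one minus cosine similarity, and returning 1.0 for a zero-length vector is an arbitrary choice made here.

```rust
/// Euclidean (L2) distance.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance.
fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Cosine distance, assumed here to be 1 - cosine similarity.
/// Returns 1.0 if either vector has zero norm (a choice made for this sketch).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}

fn main() {
    let a = [0.0, 0.0];
    let b = [3.0, 4.0];
    println!("euclidean = {}", euclidean(&a, &b));
    println!("manhattan = {}", manhattan(&a, &b));
    println!("cosine    = {}", cosine_distance(&[1.0, 0.0], &[0.0, 1.0]));
}
```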
[!IMPORTANT] Vq is in early development, so bugs and breaking changes are expected. Please use the issues page to report bugs or request features.
Supported Algorithms
| Algorithm | Training Complexity | Quantization Complexity | Supported Distances | Input Type | Output Type | Storage Size Reduction |
|---|---|---|---|---|---|---|
| BQ | $O(1)$ | $O(nd)$ | — | `&[f32]` | `Vec<u8>` | 75% |
| SQ | $O(1)$ | $O(nd)$ | — | `&[f32]` | `Vec<u8>` | 75% |
| PQ | $O(nkd)$ | $O(nd)$ | All | `&[f32]` | `Vec<f16>` | 50% |
| TSVQ | $O(n \log k)$ | $O(d \log k)$ | All | `&[f32]` | `Vec<f16>` | 50% |
- $n$: number of vectors
- $d$: dimensionality of vectors
- $k$: number of centroids or clusters
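The storage reduction figures in the table follow from the element widths: an `f32` takes 4 bytes, a `u8` takes 1 byte, and a 16-bit float takes 2 bytes. The sketch below checks that arithmetic; the dimension 768 is a hypothetical example, and `u16` stands in for a 16-bit float because it has the same width.

```rust
use std::mem::size_of;

/// Percentage of storage saved when each element shrinks
/// from `from_bytes` to `to_bytes`.
fn reduction_percent(from_bytes: usize, to_bytes: usize) -> f64 {
    100.0 * (1.0 - to_bytes as f64 / from_bytes as f64)
}

fn main() {
    let d = 768; // hypothetical embedding dimension
    let f32_bytes = d * size_of::<f32>();
    let u8_bytes = d * size_of::<u8>();
    let f16_bytes = d * size_of::<u16>(); // u16 has the same width as a 16-bit float

    println!(
        "f32 -> u8:  {} -> {} bytes ({}% smaller)",
        f32_bytes,
        u8_bytes,
        reduction_percent(f32_bytes, u8_bytes)
    );
    println!(
        "f32 -> f16: {} -> {} bytes ({}% smaller)",
        f32_bytes,
        f16_bytes,
        reduction_percent(f32_bytes, f16_bytes)
    );
}
```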
Quantization Demo
Below is a visual comparison of different quantization algorithms applied to a 1024×1024 PNG image (using this Python script):
[!NOTE] The binary and scalar quantizers are applied per channel (each pixel value independently), while PQ and TSVQ are applied per row, treating each image row as one high-dimensional vector; this is what causes the horizontal banding artifacts. Vq is primarily designed for compressing embedding vectors (like the ones stored in a vector database), where PQ and TSVQ are applied to vectors with typical dimensions of 128–1536.
Getting Started
Installing Vq
[!NOTE] The `parallel` and `simd` features enable multi-threading and SIMD acceleration for the training phase of the PQ and TSVQ algorithms. This can significantly speed up training, especially for large datasets. Note that enabling the `simd` feature requires a modern C compiler (like GCC or Clang) that supports the C11 standard.
Vq requires Rust 1.85 or later.
Installing PyVq
Python bindings for Vq are available via the PyVq package. For more information, check out the pyvq directory.
Documentation
The Vq documentation is available here and the Rust API reference is available on docs.rs/vq.
Quick Example
Here's a simple example using the BQ and SQ algorithms to quantize vectors:
use ;
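The idea behind the two quantizers can also be sketched in plain Rust, independent of Vq's API. Everything named below (`binary_quantize`, `scalar_quantize`, `scalar_dequantize`, the threshold, and the 256-level range) is an assumption made for illustration and may differ from how the crate is actually used.

```rust
/// Binary quantization: 1 if the value is at or above `threshold`, else 0.
fn binary_quantize(v: &[f32], threshold: f32) -> Vec<u8> {
    v.iter().map(|&x| u8::from(x >= threshold)).collect()
}

/// Uniform scalar quantization: clamp each value to [min, max] and map it
/// linearly onto the 256 levels 0..=255. Assumes min < max.
fn scalar_quantize(v: &[f32], min: f32, max: f32) -> Vec<u8> {
    let scale = 255.0 / (max - min);
    v.iter()
        .map(|&x| ((x.clamp(min, max) - min) * scale).round() as u8)
        .collect()
}

/// Approximate inverse of `scalar_quantize`.
fn scalar_dequantize(codes: &[u8], min: f32, max: f32) -> Vec<f32> {
    let step = (max - min) / 255.0;
    codes.iter().map(|&c| min + f32::from(c) * step).collect()
}

fn main() {
    let v = [-1.0, -0.2, 0.0, 0.7, 1.0];

    let bits = binary_quantize(&v, 0.0);
    println!("binary: {:?}", bits);

    let codes = scalar_quantize(&v, -1.0, 1.0);
    println!("scalar: {:?}", codes);
    println!("approx: {:?}", scalar_dequantize(&codes, -1.0, 1.0));
}
```

Both functions turn each 4-byte `f32` into a 1-byte `u8`, which matches the 75% reduction listed for BQ and SQ in the table above.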
Product Quantizer Example
use ;
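To show what product quantization does conceptually, here is a minimal sketch in plain Rust that does not use Vq's API. It assumes the codebooks are already trained and emits centroid indices, which is the textbook form of PQ; the crate's own output type (`Vec<f16>` according to the table above) and function names may differ. `pq_encode` and `pq_decode` are hypothetical names.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Splits `v` into `codebooks.len()` equal subvectors and replaces each one
/// with the index of its nearest centroid in that subspace's codebook.
fn pq_encode(v: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<usize> {
    let m = codebooks.len();
    assert!(m > 0 && !v.is_empty() && v.len() % m == 0, "dimension must split evenly");
    let sub_dim = v.len() / m;
    v.chunks(sub_dim)
        .zip(codebooks)
        .map(|(sub, book)| {
            book.iter()
                .enumerate()
                .min_by(|a, b| {
                    squared_distance(sub, a.1).total_cmp(&squared_distance(sub, b.1))
                })
                .map(|(i, _)| i)
                .expect("each codebook needs at least one centroid")
        })
        .collect()
}

/// Rebuilds an approximate vector by concatenating the selected centroids.
fn pq_decode(codes: &[usize], codebooks: &[Vec<Vec<f32>>]) -> Vec<f32> {
    codes
        .iter()
        .zip(codebooks)
        .flat_map(|(&i, book)| book[i].iter().copied())
        .collect()
}

fn main() {
    // Two subspaces (m = 2), each with a hypothetical codebook of two centroids.
    let codebooks = vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0]],
        vec![vec![5.0, 5.0], vec![9.0, 9.0]],
    ];
    let v = [0.9, 1.1, 5.2, 4.9];
    let codes = pq_encode(&v, &codebooks);
    println!("codes:  {:?}", codes);
    println!("approx: {:?}", pq_decode(&codes, &codebooks));
}
```

In a real setup the codebooks would come from a training step such as k-means on each subspace, which this sketch leaves out.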
Benchmarks
You can follow the instructions below to run the benchmarks locally on your machine.
[!NOTE] To run the benchmarks, you need to have GNU Make installed. The `make eval-all` command runs each quantizer on a set of high-dimensional synthetic data and reports the runtime (in milliseconds) and the reconstruction error (as mean squared error). See src/bin/common.rs for the parameters used in the benchmarks, such as the size of the training data and the number of dimensions.
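For readers unfamiliar with the metric, mean squared error between an original vector and its reconstruction can be computed as in the sketch below. This is the standard definition, not a copy of the benchmark code; the function name is hypothetical, and how the benchmarks aggregate the error across vectors is not shown here.

```rust
/// Mean squared error between a vector and its reconstruction.
/// Panics if the slices differ in length or are empty.
fn mean_squared_error(original: &[f32], reconstructed: &[f32]) -> f32 {
    assert_eq!(original.len(), reconstructed.len(), "length mismatch");
    assert!(!original.is_empty(), "vectors must not be empty");
    let sum: f32 = original
        .iter()
        .zip(reconstructed)
        .map(|(x, y)| (x - y) * (x - y))
        .sum();
    sum / original.len() as f32
}

fn main() {
    let original = [1.0, 2.0, 3.0];
    let reconstructed = [1.0, 2.0, 5.0];
    println!("MSE = {}", mean_squared_error(&original, &reconstructed));
}
```

A lower value means the quantized representation stays closer to the original data.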
Contributing
See CONTRIBUTING.md for details on how to make a contribution.
License
Vq is available under either of the following licenses:
- MIT License (LICENSE-MIT)
- Apache License, Version 2.0 (LICENSE-APACHE)
Acknowledgements
- This project uses the Hsdlib C library for SIMD acceleration.