Crate model2vec


§Fast State-of-the-Art Static Embeddings in Rust


model2vec-rs is a Rust crate providing an efficient implementation for inference with Model2Vec static embedding models. Model2Vec is a technique for creating compact and fast static embedding models from sentence transformers, achieving significant reductions in model size while greatly increasing inference speed. This Rust crate is optimized for performance, making it suitable for applications requiring fast embedding generation.

§Quickstart

  1. Add model2vec as a dependency:
cargo add model2vec
  2. Load a model and generate embeddings:
use anyhow::Result;
use model2vec::Model2Vec;

fn main() -> Result<()> {
    // Load a model from a local directory
    // Arguments: (path, normalize_embeddings, subfolder_in_repo)
    let model = Model2Vec::from_pretrained(
        "tests/fixtures/test-model-float32", // Local path to model directory
        None, // Optional: bool to override model's default normalization. `None` uses model's config.
        None, // Optional: subfolder if model files are not at the root of the repo/path
    )?;

    // Any type that implements `AsRef<[S]>` works, where `S: AsRef<str>`
    // This includes Vec<String>, &[&str], etc.
    let sentences = [
        "Hello world",
        "Rust is awesome",
    ];

    // Generate embeddings using default parameters
    // (Default max_length: Some(512), Default batch_size: 1024)
    let embeddings = model.encode(&sentences)?;

    // `embeddings` is an ndarray::Array2<f32>, where each row is an embedding
    // corresponding to a sentence, and each column is a different dimension.
    assert_eq!(embeddings.nrows(), sentences.len());
    println!("Generated {} embeddings of {} dimensions", embeddings.nrows(), embeddings.ncols());

    // To generate embeddings with custom arguments:
    let custom_embeddings = model.encode_with_args(
        sentences,
        Some(256), // Optional: custom max token length for truncation
        512,       // Custom batch size for processing
    )?;
    assert_eq!(custom_embeddings.nrows(), sentences.len());
    println!("Generated {} custom embeddings of {} dimensions", custom_embeddings.nrows(), custom_embeddings.ncols());

    Ok(())
}
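Once embeddings are generated, a typical next step is comparing them. A minimal stdlib-only sketch (the `cosine_similarity` helper is our own illustration, not a crate API) that works on two embedding rows viewed as slices:

```rust
// Cosine similarity between two embedding vectors of equal length
// (e.g. two rows of the Array2<f32> returned by `encode`).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [3.0f32, 4.0];
    let b = [4.0f32, -3.0];
    // A vector compared with itself has similarity exactly 1.0 here
    // (its norm, 5.0, is exactly representable).
    println!("{}", cosine_similarity(&a, &a));
    // Orthogonal vectors have similarity 0.0.
    println!("{}", cosine_similarity(&a, &b));
}
```

If the model was loaded with normalization enabled, the denominators are 1 and this reduces to a plain dot product over the rows.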

§Features

  • Fast Inference: Optimized Rust implementation for fast embedding generation.
  • Model Formats: Supports models with f32, f16, and i8 weight types stored in safetensors files.
  • Batch Processing: Encodes multiple sentences in batches.
  • Configurable Encoding: Allows customization of maximum sequence length and batch size during encoding.
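To illustrate what the normalization option controls (a stdlib-only sketch; `l2_normalize` is our own helper, not a crate API): when enabled, each output embedding is scaled to unit L2 norm, so a dot product between two embeddings equals their cosine similarity.

```rust
// Conceptual sketch of embedding normalization (not a crate API):
// scale a vector in place to unit L2 norm.
fn l2_normalize(v: &mut [f32]) {
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    let mut v = [3.0f32, 4.0];
    l2_normalize(&mut v);
    // The norm of [3, 4] is 5, so the result is [0.6, 0.8].
    println!("{:?}", v);
}
```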

§What is Model2Vec?

Model2Vec is a technique to distill large sentence transformer models into highly efficient static embedding models. This process significantly reduces model size and computational requirements for inference. For a detailed understanding of how Model2Vec works, including the distillation process and model training, please refer to the main Model2Vec Python repository and its documentation.

This model2vec crate provides a Rust-based engine specifically for inference using these Model2Vec models.

§Models

A variety of pre-trained Model2Vec models are available on the Hugging Face Hub (MinishLab collection). These can be loaded by model2vec-rs either by their Hugging Face model ID or by providing a local path to the model files.

| Model                    | Language     | Distilled From (Original Sentence Transformer) | Params | Task      |
|--------------------------|--------------|------------------------------------------------|--------|-----------|
| potion-base-32M          | English      | bge-base-en-v1.5                               | 32.3M  | General   |
| potion-multilingual-128M | Multilingual | bge-m3                                         | 128M   | General   |
| potion-retrieval-32M     | English      | bge-base-en-v1.5                               | 32.3M  | Retrieval |
| potion-base-8M           | English      | bge-base-en-v1.5                               | 7.5M   | General   |
| potion-base-4M           | English      | bge-base-en-v1.5                               | 3.7M   | General   |
| potion-base-2M           | English      | bge-base-en-v1.5                               | 1.8M   | General   |
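Loading one of these models by Hub ID uses the same `from_pretrained` call as the local-path quickstart. A hedged sketch (the repo ID `minishlab/potion-base-8M` is our assumption based on the MinishLab collection; the model is downloaded on first use, so this needs network access):

```rust
use anyhow::Result;
use model2vec::Model2Vec;

fn main() -> Result<()> {
    // Same argument order as in the quickstart:
    // (repo_id_or_path, normalize_override, subfolder).
    let model = Model2Vec::from_pretrained(
        "minishlab/potion-base-8M", // Hugging Face model ID instead of a local path
        None,                       // use the model's own normalization setting
        None,                       // model files are at the repo root
    )?;

    let embeddings = model.encode(&["Hello world"])?;
    println!("Embedding dimensions: {}", embeddings.ncols());
    Ok(())
}
```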

§Performance

We compared the performance of the Rust implementation with the Python version of Model2Vec. The benchmark was run single-threaded on a CPU.

| Implementation | Throughput          |
|----------------|---------------------|
| Rust           | 8000 samples/second |
| Python         | 4650 samples/second |

The Rust version is roughly 1.7× faster than the Python version.

§License

MIT

§Citing Model2Vec

If you use the Model2Vec methodology or models in your research or work, please cite the original Model2Vec project:

@article{minishlab2024model2vec,
  author = {Tulkens, Stephan and {van Dongen}, Thomas},
  title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec}
}

Re-exports§

pub use crate::model::Model2Vec;

Modules§

model
Model2Vec loading and inference