candle_embed 0.1.0

A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face

CandleEmbed

CandleEmbed is a Rust library for creating embeddings using BERT-based models. It provides a convenient way to load pre-trained models, embed single or multiple texts, and customize the embedding process. It's essentially the same code as the Candle example for embeddings, wrapped in a nicer API. This exists because I wanted to play with Candle, and fastembed.rs doesn't support custom models.

Features

  • Enums for the most popular embedding models, or specify custom models from Hugging Face

  • Support for CUDA devices (requires feature flag)

  • Can load and unload as required for better memory management

Installation

Add the following to your Cargo.toml file:

[dependencies]
candle_embed = "0.1.0"

If you want to use CUDA devices, enable the cuda feature flag:

[dependencies]
candle_embed = { version = "0.1.0", features = ["cuda"] }

Or you can just clone the repo. It's literally just a single file.

Usage


use candle_embed::{CandleEmbedBuilder, WithModel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a builder with default settings
    //
    let builder = CandleEmbedBuilder::new();
    
    //
    // Optional settings 
    //

    // Customize the builder
    //
    let builder = builder
        .normalize_embeddings(true)
        .approximate_gelu(true);

    // Set model from preset
    //
    builder
        .set_model_from_presets(WithModel::UaeLargeV1);

    // Or use a custom model and revision
    //
    builder
        .custom_embedding_model("avsolatorio/GIST-small-Embedding-v0")
        .custom_model_revision("d6c4190");

    // Will use the first available CUDA device (Default)
    //
    builder.with_device_any_cuda();

    // Or use a specific CUDA device by ordinal
    //
    builder.with_device_specific_cuda(0);

    // Use CPU (CUDA options fail over to this)
    //
    builder.with_device_cpu();

    // Build the embedder
    //
    let mut cembed = builder.build()?;
    
    // This loads the model and tokenizer into memory and is run the first time `embed` is called,
    // so you shouldn't need to call this explicitly, but it is documented here for clarity
    //
    cembed.load();

    // Embed a single text
    //
    let text = "This is a test sentence.";
    let embeddings = cembed.embed_one(text)?;
    
    // Embed a batch of texts
    //
    let texts = vec![
        "This is the first sentence.",
        "This is the second sentence.",
        "This is the third sentence.",
    ];
    let batch_embeddings = cembed.embed_batch(texts)?;
    
    // Unload the model and tokenizer, dropping them from memory
    //
    cembed.unload();
    
    Ok(())
}
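Once you have embeddings back, a common next step is comparing them. The sketch below assumes the embeddings are plain `Vec<f32>` vectors, as returned by `embed_one` above; the stand-in vectors are illustrative, not real model output.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    // Dot product of the two vectors
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean norms
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Stand-in vectors; in practice these would come from `embed_one` / `embed_batch`
    let a = vec![0.1_f32, 0.2, 0.3];
    let b = vec![-0.3_f32, 0.0, 0.1];
    println!("a vs a: {}", cosine_similarity(&a, &a)); // identical vectors score 1.0
    println!("a vs b: {}", cosine_similarity(&a, &b));
}

Note that if you built with `normalize_embeddings(true)`, the vectors are unit-length, so the dot product alone already gives the cosine similarity.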

Feature Flags

  • cuda: Enables CUDA support for using GPU devices.

  • default: No additional features are enabled by default.

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue if you have any suggestions or find any bugs.