Crate embedcache

EmbedCache - High-performance text embedding library with caching capabilities

This library generates text embeddings with a range of embedding models and caches the results, so repeated requests can be served without recomputing them.

§Features

  • Multiple embedding models (BGE, MiniLM, Nomic, etc.)
  • Modular text chunking strategies with extensible trait-based architecture
  • LLM-based intelligent chunking (concept and introspection modes)
  • SQLite-based caching
  • Asynchronous operation
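
To make the caching idea concrete, here is a minimal sketch of hash-keyed result caching. It is not the crate's implementation: `EmbeddingCache`, `get_or_compute`, and `fake_embed` are hypothetical names, an in-memory `HashMap` stands in for the SQLite store, and the "embedding" is a placeholder rather than real model output.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory stand-in for a persistent embedding cache.
struct EmbeddingCache {
    entries: HashMap<u64, Vec<f32>>,
    /// Number of times the embedding had to be computed.
    misses: usize,
}

impl EmbeddingCache {
    fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Derive a cache key from the model name and the input text,
    /// so the same text embedded with different models is cached separately.
    fn key(model: &str, text: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        model.hash(&mut hasher);
        text.hash(&mut hasher);
        hasher.finish()
    }

    /// Return the cached embedding if present; otherwise compute and store it.
    fn get_or_compute<F>(&mut self, model: &str, text: &str, compute: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        let key = Self::key(model, text);
        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        let embedding = compute(text);
        self.entries.insert(key, embedding.clone());
        embedding
    }
}

/// Placeholder "model" (assumption for illustration only):
/// returns the byte length and word count of the text.
fn fake_embed(text: &str) -> Vec<f32> {
    vec![text.len() as f32, text.split_whitespace().count() as f32]
}
```

Calling `get_or_compute` twice with the same model name and text runs `fake_embed` only once; the second call is answered from the map. A persistent store would replace the `HashMap` with database reads and writes keyed the same way.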

§Installation

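Add this to your Cargo.toml (the examples below also import the fastembed and async_trait crates, so add those as well if you follow them):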

[dependencies]
embedcache = "0.1"

§Examples

§Using as a library

use embedcache::{FastEmbedder, Embedder};
use fastembed::{InitOptions, EmbeddingModel};

let embedder = FastEmbedder {
    options: InitOptions::new(EmbeddingModel::BGESmallENV15),
};

let texts = vec![
    "This is an example sentence.".to_string(),
    "Another example sentence for embedding.".to_string(),
];

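// `embed` is async and fallible: call it from inside an async function
// whose return type lets `?` propagate the error.
let embeddings = embedder.embed(&texts).await?;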

§Implementing custom chunking strategies

use embedcache::ContentChunker;
use async_trait::async_trait;

struct MyCustomChunker;

#[async_trait]
impl ContentChunker for MyCustomChunker {
    async fn chunk(&self, content: &str, size: usize) -> Vec<String> {
        // Your custom chunking logic here. This placeholder ignores `size`
        // and returns the whole input as a single chunk.
        vec![content.to_string()]
    }

    fn name(&self) -> &str {
        "my-custom-chunker"
    }
}
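
As a sketch of what the "custom chunking logic" might look like, the helper below splits text into fixed-size groups of words. `chunk_by_words` is a hypothetical free function, not part of embedcache, and it is not necessarily how the crate's own `WordChunker` behaves. It also assumes that `size` means a maximum number of words per chunk, which the trait signature alone does not confirm.

```rust
/// Split `content` into chunks of at most `size` whitespace-separated words.
/// Hypothetical helper for illustration; not an embedcache API.
fn chunk_by_words(content: &str, size: usize) -> Vec<String> {
    // `chunks` panics on a size of zero, so treat 0 as 1.
    let size = size.max(1);
    let words: Vec<&str> = content.split_whitespace().collect();
    words
        .chunks(size)
        .map(|group| group.join(" "))
        .collect()
}
```

Inside an `impl ContentChunker`, the body of `chunk` could then simply return `chunk_by_words(content, size)`. Note that splitting on whitespace normalizes runs of spaces and newlines to single spaces.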

§Running the embedcache service

The library also includes a binary that can be run as a standalone service:

cargo install embedcache
embedcache

Re-exports§

pub use cache::cache_result;
pub use cache::get_from_cache;
pub use cache::initialize_db_pool;
pub use chunking::ContentChunker;
pub use chunking::LLMConceptChunker;
pub use chunking::LLMConfig;
pub use chunking::LLMIntrospectionChunker;
pub use chunking::WordChunker;
pub use config::ServerConfig;
pub use embedding::get_embedding_model;
pub use embedding::initialize_models;
pub use embedding::Embedder;
pub use embedding::FastEmbedder;
pub use embedding::SUPPORTED_MODELS;
pub use handlers::embed_text;
pub use handlers::list_supported_features;
pub use handlers::process_url;
pub use models::get_default_config;
pub use models::AppState;
pub use models::Config;
pub use models::InputData;
pub use models::InputDataText;
pub use models::ProcessedContent;
pub use utils::fetch_content;
pub use utils::generate_hash;

Modules§

cache
Cache module for storing processed content
chunking
Content chunking module
config
Server configuration module
embedding
Embedding generation module
handlers
HTTP request handlers module
models
Data models module
utils
Utility functions

Functions§

initialize_chunkers
Initialize chunkers