Crate embedcache

EmbedCache - High-performance text embedding library with caching capabilities

This library generates text embeddings with a choice of embedding models and caches the results, so that content that has already been processed does not need to be embedded again.

§Features

Multiple embedding models (BGE, MiniLM, Nomic, etc.)
Modular text chunking strategies with extensible trait-based architecture
LLM-based intelligent chunking (concept and introspection modes)
SQLite-based caching
Asynchronous operation
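
The caching feature listed above amounts to a look-up-before-compute pattern. The sketch below illustrates that pattern with an in-memory map rather than SQLite, using only the standard library. `EmbeddingCache`, `key`, and `get_or_compute` are hypothetical names chosen for this illustration; they are not part of the embedcache API, and the crate's actual cache key and storage schema may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical in-memory stand-in for a persistent embedding cache.
/// (Assumption: embedcache itself stores results in SQLite.)
struct EmbeddingCache {
    entries: HashMap<u64, Vec<f32>>,
}

impl EmbeddingCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Derives a cache key from the model name and the input text, so the
    /// same text embedded with two different models gets two entries.
    /// A 64-bit hash can collide; a real cache would likely use a longer digest.
    fn key(model: &str, text: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        model.hash(&mut hasher);
        text.hash(&mut hasher);
        hasher.finish()
    }

    /// Returns the cached vector on a hit; on a miss, runs `compute`,
    /// stores its result, and returns a copy of it.
    fn get_or_compute<F>(&mut self, model: &str, text: &str, compute: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        let key = Self::key(model, text);
        self.entries
            .entry(key)
            .or_insert_with(|| compute(text))
            .clone()
    }
}
```

Calling `get_or_compute` twice with the same model name and text runs the `compute` closure only once; the second call is served from the map.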

§Installation

Add this to your Cargo.toml:

[dependencies]
embedcache = "0.1"

§Examples

§Using as a library

use embedcache::{FastEmbedder, Embedder};
use fastembed::{InitOptions, EmbeddingModel};

let embedder = FastEmbedder {
    options: InitOptions::new(EmbeddingModel::BGESmallENV15),
};

let texts = vec![
    "This is an example sentence.".to_string(),
    "Another example sentence for embedding.".to_string(),
];

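// `embed` is async and fallible: call it from inside an async function
// whose return type lets `?` propagate the error.
let embeddings = embedder.embed(&texts).await?;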

§Implementing custom chunking strategies

use embedcache::ContentChunker;
use async_trait::async_trait;

struct MyCustomChunker;

#[async_trait]
impl ContentChunker for MyCustomChunker {
    async fn chunk(&self, content: &str, size: usize) -> Vec<String> {
        // Your custom chunking logic here. This placeholder ignores `size`
        // and returns the whole input as a single chunk.
        vec![content.to_string()]
    }

    fn name(&self) -> &str {
        "my-custom-chunker"
    }
}
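
As one possible body for the `chunk` placeholder above, the following sketch splits the input into groups of whitespace-separated words. It assumes `size` is a maximum word count per chunk, which this page does not specify. `chunk_by_words` is a hypothetical helper written for this illustration and is not part of embedcache.

```rust
/// Splits `content` into chunks of at most `size` whitespace-separated words.
/// Hypothetical helper for illustration; not part of the embedcache API.
fn chunk_by_words(content: &str, size: usize) -> Vec<String> {
    // `slice::chunks` panics on a chunk size of zero, so clamp it to one.
    let size = size.max(1);
    // Collapse any run of whitespace and collect the individual words.
    let words: Vec<&str> = content.split_whitespace().collect();
    // Re-join each group of up to `size` words with single spaces.
    words
        .chunks(size)
        .map(|group| group.join(" "))
        .collect()
}
```

Inside a `ContentChunker` implementation, the `chunk` method could then simply return `chunk_by_words(content, size)`. The last chunk may hold fewer than `size` words, and an empty input yields no chunks.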

§Running the embedcache service

The library also includes a binary that can be run as a standalone service:

cargo install embedcache
embedcache

Re-exports§

pub use cache::cache_result;
pub use cache::get_from_cache;
pub use cache::initialize_db_pool;
pub use chunking::ContentChunker;
pub use chunking::LLMConceptChunker;
pub use chunking::LLMConfig;
pub use chunking::LLMIntrospectionChunker;
pub use chunking::WordChunker;
pub use config::ServerConfig;
pub use embedding::get_embedding_model;
pub use embedding::initialize_models;
pub use embedding::Embedder;
pub use embedding::FastEmbedder;
pub use embedding::SUPPORTED_MODELS;
pub use handlers::embed_text;
pub use handlers::list_supported_features;
pub use handlers::process_url;
pub use models::get_default_config;
pub use models::AppState;
pub use models::Config;
pub use models::InputData;
pub use models::InputDataText;
pub use models::ProcessedContent;
pub use utils::fetch_content;
pub use utils::generate_hash;

Modules§

cache: Cache module for storing processed content
chunking: Content chunking module
config: Server configuration module
embedding: Embedding generation module
handlers: HTTP request handlers module
models: Data models module
utils: Utility functions

Functions§

initialize_chunkers: Initialize chunkers
