EmbedCache - High-performance text embedding library with caching capabilities
This library provides functionality for generating text embeddings with various state-of-the-art models and caching the results for improved performance.
§Features
- Multiple embedding models (BGE, MiniLM, Nomic, etc.)
- Modular text chunking strategies with extensible trait-based architecture
- LLM-based intelligent chunking (concept and introspection modes)
- SQLite-based caching
- Asynchronous operation
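To illustrate what a trait-based chunking architecture can look like, here is a minimal, std-only sketch. The `Chunker` trait and `SimpleWordChunker` type are hypothetical, simplified stand-ins invented for this example (synchronous, no external crates); they are not the crate's `ContentChunker` or `WordChunker`, whose actual behavior may differ.

```rust
/// Hypothetical, simplified stand-in for a chunking trait (synchronous, std only).
trait Chunker {
    /// Split `content` into chunks of at most `size` words.
    fn chunk(&self, content: &str, size: usize) -> Vec<String>;
    /// Short identifier for the strategy.
    fn name(&self) -> &str;
}

/// Hypothetical strategy: groups whitespace-separated words into fixed-size chunks.
struct SimpleWordChunker;

impl Chunker for SimpleWordChunker {
    fn chunk(&self, content: &str, size: usize) -> Vec<String> {
        // Guard against a zero chunk size, which `slice::chunks` would panic on.
        let size = size.max(1);
        let words: Vec<&str> = content.split_whitespace().collect();
        words.chunks(size).map(|group| group.join(" ")).collect()
    }

    fn name(&self) -> &str {
        "simple-words"
    }
}

fn main() {
    // Strategies can be swapped at runtime behind a trait object.
    let chunker: Box<dyn Chunker> = Box::new(SimpleWordChunker);
    let chunks = chunker.chunk("one two three four five", 2);
    println!("{}: {:?}", chunker.name(), chunks);
}
```

Because callers depend only on the trait, a new strategy can be added by implementing it for another type without touching the calling code.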
§Installation
Add this to your Cargo.toml:
[dependencies]
embedcache = "0.1"
§Examples
§Using as a library
use embedcache::{FastEmbedder, Embedder};
use fastembed::{InitOptions, EmbeddingModel};

// Configure the embedder with a small English BGE model.
let embedder = FastEmbedder {
    options: InitOptions::new(EmbeddingModel::BGESmallENV15),
};

let texts = vec![
    "This is an example sentence.".to_string(),
    "Another example sentence for embedding.".to_string(),
];

// `.await?` must run inside an async function that returns a compatible `Result`.
let embeddings = embedder.embed(&texts).await?;
§Implementing custom chunking strategies
use embedcache::ContentChunker;
use async_trait::async_trait;

struct MyCustomChunker;

#[async_trait]
impl ContentChunker for MyCustomChunker {
    async fn chunk(&self, content: &str, size: usize) -> Vec<String> {
        // Your custom chunking logic here
        vec![content.to_string()]
    }

    fn name(&self) -> &str {
        "my-custom-chunker"
    }
}
§Running the embedcache service
The library also includes a binary that can be run as a standalone service:
cargo install embedcache
embedcache
Re-exports§
pub use cache::cache_result;
pub use cache::get_from_cache;
pub use cache::initialize_db_pool;
pub use chunking::ContentChunker;
pub use chunking::LLMConceptChunker;
pub use chunking::LLMConfig;
pub use chunking::LLMIntrospectionChunker;
pub use chunking::WordChunker;
pub use config::ServerConfig;
pub use embedding::get_embedding_model;
pub use embedding::initialize_models;
pub use embedding::Embedder;
pub use embedding::FastEmbedder;
pub use embedding::SUPPORTED_MODELS;
pub use handlers::embed_text;
pub use handlers::list_supported_features;
pub use handlers::process_url;
pub use models::get_default_config;
pub use models::AppState;
pub use models::Config;
pub use models::InputData;
pub use models::InputDataText;
pub use models::ProcessedContent;
pub use utils::fetch_content;
pub use utils::generate_hash;
Modules§
- cache
- Cache module for storing processed content
- chunking
- Content chunking module
- config
- Server configuration module
- embedding
- Embedding generation module
- handlers
- HTTP request handlers module
- models
- Data models module
- utils
- Utility functions
Functions§
- initialize_chunkers
- Initialize chunkers