//! Retrieval Augmented Generation (RAG) Pipeline
//!
//! This module provides the core RAG pipeline components for enhancing LLM responses
//! with relevant context from your document collections.
//!
//! # Module Structure
//!
//! - `rag::embeddings` - Dense embedding models (fastembed, 38+ models) **[requires `local-embeddings` feature]**
//! - [`rag::search`](crate::rag::search) - Search strategies (semantic, BM25, fuzzy, hybrid)
//! - `rag::reranker` - Cross-encoder reranking for improved relevance **[requires `local-embeddings` feature]**
//! - [`rag::chunker`](crate::rag::chunker) - Text chunking for document processing
//! - [`rag::cache`](crate::rag::cache) - Embedding cache for avoiding recomputation
//!
//! # Feature Flags
//!
//! The `local-embeddings` feature enables ONNX-based local embedding and reranking models.
//! This feature is optional because the ONNX runtime (`ort`) can have build issues on some platforms,
//! particularly Windows with certain MSVC versions.
//!
//! **Note:** The `local-embeddings` feature is NOT supported on Windows MSVC due to linker errors
//! in `ort-sys`. Use WSL, Linux, or macOS for local embeddings, or use remote embedding APIs.
//!
//! Without `local-embeddings`, you can still use:
//! - Remote embedding APIs (OpenAI embeddings, Ollama embeddings, etc.)
//! - The chunker and search modules
//! - The cache module (if you have embeddings from elsewhere)
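//!
//! To opt in to local models, enable the feature from `Cargo.toml`. The crate name
//! `ares` is taken from the example below; substitute the version you actually depend on:
//!
//! ```toml
//! [dependencies]
//! ares = { version = "*", features = ["local-embeddings"] }
//! ```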
//!
//! # RAG Pipeline
//!
//! The typical RAG pipeline flow:
//!
//! 1. **Ingestion** - Documents are chunked and embedded
//! 2. **Storage** - Embeddings stored in vector database
//! 3. **Retrieval** - Query embedded, similar chunks retrieved
//! 4. **Reranking** - Cross-encoder reranks for relevance
//! 5. **Generation** - LLM generates response with context
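//!
//! Steps 4 and 5 can be sketched as follows. The `Reranker` type, its `rerank`
//! signature, and the prompt shape are illustrative assumptions, not a fixed API:
//!
//! ```ignore
//! // Rerank the retrieved chunks against the original query with a
//! // cross-encoder (requires the `local-embeddings` feature).
//! let reranker = Reranker::new("BAAI/bge-reranker-base")?;
//! let top_chunks = reranker.rerank(&query, &retrieved_chunks, 5)?;
//!
//! // Generation: prepend the top-ranked chunks to the prompt as context.
//! let context = top_chunks.join("\n\n");
//! let prompt = format!("Context:\n{context}\n\nQuestion: {query}");
//! let answer = llm.complete(&prompt).await?;
//! ```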
//!
//! # Example
//!
//! ```ignore
//! use ares::rag::{embeddings::EmbeddingModel, chunker::Chunker, search::SearchStrategy};
//!
//! // Embed a document
//! let embedder = EmbeddingModel::new("BAAI/bge-small-en-v1.5")?;
//! let chunker = Chunker::new(512, 50); // chunk_size, overlap
//!
//! let chunks = chunker.chunk(&document_text);
//! let embeddings = embedder.embed_batch(&chunks).await?;
//!
//! // Search (`vector_store` is whatever vector database client you use;
//! // here it only needs a `search(collection, embedding, top_k)` method)
//! let query_embedding = embedder.embed(&query).await?;
//! let results = vector_store.search("my_collection", query_embedding, 10).await?;
//! ```
//!
//! # Embedding Models
//!
//! Supports 38+ models via fastembed. Popular choices:
//! - `BAAI/bge-small-en-v1.5` - Fast, good quality (default)
//! - `BAAI/bge-base-en-v1.5` - Higher quality, slower
//! - `sentence-transformers/all-MiniLM-L6-v2` - Lightweight
// Compile-time error for unsupported platform + feature combination
#[cfg(all(feature = "local-embeddings", target_os = "windows", target_env = "msvc"))]
compile_error!(
    "the `local-embeddings` feature is not supported on Windows MSVC due to linker \
     errors in `ort-sys`; build under WSL, Linux, or macOS, or disable the feature \
     and use a remote embedding API instead"
);