//! Vector store abstraction.
//!
//! [`VectorStore`] is a **synchronous** trait that abstracts over the supported
//! vector backends:
//!
//! - [`SqliteVecStore`] (used internally by [`crate::db::RetrieveDb`]) — stores
//! chunk vectors inside the retrieve SQLite database via the sqlite-vec extension.
//! - `LanceDbVectorStore` (in `lancedb_store`) — stores chunk vectors in a
//! separate LanceDB directory; async LanceDB calls are wrapped in an internal
//! Tokio runtime so the trait remains sync.
//!
//! # Chunk identity
//!
//! Each chunk is identified by the pair `(doc_id, chunk_index)`. `doc_id` is a
//! stable i64 assigned by the caller (e.g. a path hash or application-level ID).
//! `chunk_index` is reproducibly derived from the paragraph order of the body.
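//!
//! A minimal sketch of how such an identity could be derived, assuming
//! paragraphs are separated by blank lines and `chunk_index` is a zero-based
//! `usize`. `ChunkKey` and `chunk_keys` are hypothetical names used only for
//! illustration; they are not part of this module's API, and the real chunker
//! may split paragraphs differently.
//!
//! ```rust
//! /// Hypothetical chunk key: the pair `(doc_id, chunk_index)`.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
//! struct ChunkKey {
//!     doc_id: i64,
//!     chunk_index: usize,
//! }
//!
//! /// Assumption: a paragraph is a non-empty block delimited by a blank line.
//! fn chunk_keys(doc_id: i64, body: &str) -> Vec<(ChunkKey, &str)> {
//!     body.split("\n\n")
//!         .map(str::trim)
//!         .filter(|p| !p.is_empty())
//!         .enumerate()
//!         .map(|(chunk_index, p)| (ChunkKey { doc_id, chunk_index }, p))
//!         .collect()
//! }
//!
//! fn main() {
//!     let body = "First paragraph.\n\nSecond paragraph.\n\n\n\nThird.";
//!     let first_run = chunk_keys(42, body);
//!     let second_run = chunk_keys(42, body);
//!     // The same body always yields the same keys in the same order.
//!     assert_eq!(first_run, second_run);
//!     assert_eq!(first_run.len(), 3);
//!     assert_eq!(first_run[2].0, ChunkKey { doc_id: 42, chunk_index: 2 });
//! }
//! ```
//!
//! Because the index depends only on paragraph order, re-chunking an unchanged
//! body reproduces the same keys, so existing vectors can be matched by key.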
use std::collections::HashSet;
use crate::Result;
// ── public types ──────────────────────────────────────────────────────────────
/// A single paragraph-level chunk derived from a document, ready to be embedded.
/// A result returned by [`VectorStore::search_similar`].
/// Statistics about the vector index.
// ── trait ─────────────────────────────────────────────────────────────────────
/// Abstraction over a vector storage backend.
///
/// All methods are **synchronous**. Backends that are inherently async
/// (e.g. LanceDB) wrap their async operations in an internal Tokio runtime.
// ── internal helpers shared by db.rs and lancedb_store.rs ─────────────────────
/// Serialize a float slice to the little-endian bytes expected by sqlite-vec.
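///
/// A minimal sketch of the encoding, assuming each `f32` is written as four
/// little-endian bytes with no header or padding. The name
/// `f32_slice_to_le_bytes` is hypothetical and used only for illustration.
///
/// ```rust
/// fn f32_slice_to_le_bytes(values: &[f32]) -> Vec<u8> {
///     // Four bytes per f32, appended in slice order.
///     let mut bytes = Vec::with_capacity(values.len() * 4);
///     for v in values {
///         bytes.extend_from_slice(&v.to_le_bytes());
///     }
///     bytes
/// }
///
/// fn main() {
///     let bytes = f32_slice_to_le_bytes(&[1.0, -2.0]);
///     // 1.0f32 is 0x3F80_0000 and -2.0f32 is 0xC000_0000; each is stored
///     // least-significant byte first.
///     assert_eq!(bytes, vec![0x00, 0x00, 0x80, 0x3F, 0x00, 0x00, 0x00, 0xC0]);
/// }
/// ```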
pub