//! Disk persistence for `jin` indexes.
//!
//! This module provides crash-safe, concurrent persistence for all retrieval methods:
//! - Sparse retrieval (BM25, TF-IDF): Inverted indexes with compressed postings
//! - Dense retrieval: Vector storage with ANN indexes (HNSW, IVF-PQ, DiskANN)
//! - Hybrid retrieval: Unified persistence for combined sparse + dense systems
//!
//! # Design Philosophy
//!
//! The persistence layer prioritizes:
//! - **Correctness**: Crash safety, ACID guarantees, data integrity
//! - **Concurrency**: Multiple readers, single writer with snapshot isolation
//! - **Performance**: Memory mapping, SIMD-accelerated compression, efficient formats
//! - **Flexibility**: Support for all retrieval methods, configurable trade-offs
//!
//! # Future Improvements (Blob Storage)
//!
//! Large metadata and content blobs can cause write amplification and cache
//! thrashing in general-purpose storage engines such as Postgres.
//! Wilson Lin's 3B search engine found success using **RocksDB's BlobDB** feature, which:
//! - Stores small metadata and pointers in the LSM tree.
//! - Stores large blobs in separate log files.
//! - Avoids rewriting large blobs during compaction.
//!
//! TODO: Investigate adding a `BlobStore` trait or wrapper here to support this pattern.
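//!
//! A minimal sketch of the key/blob separation described above, to make the
//! TODO concrete. Everything here is an assumption for illustration: the
//! `BlobStore` trait, `BlobPointer`, `MemBlobLog`, and `DocStore` are
//! hypothetical names that exist neither in `jin` nor in RocksDB. A `HashMap`
//! stands in for the LSM tree and a `Vec<u8>` for the separate blob log file.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Small, fixed-size handle kept in the index instead of the blob itself.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! struct BlobPointer {
//!     offset: u64,
//!     len: u64,
//! }
//!
//! /// Hypothetical trait (not part of `jin` today): append blobs, read by pointer.
//! trait BlobStore {
//!     fn put(&mut self, blob: &[u8]) -> BlobPointer;
//!     fn get(&self, ptr: BlobPointer) -> Option<&[u8]>;
//! }
//!
//! /// In-memory stand-in for an append-only blob log file.
//! #[derive(Default)]
//! struct MemBlobLog {
//!     log: Vec<u8>,
//! }
//!
//! impl BlobStore for MemBlobLog {
//!     fn put(&mut self, blob: &[u8]) -> BlobPointer {
//!         // Blobs are only ever appended, so existing bytes are never rewritten.
//!         let ptr = BlobPointer {
//!             offset: self.log.len() as u64,
//!             len: blob.len() as u64,
//!         };
//!         self.log.extend_from_slice(blob);
//!         ptr
//!     }
//!
//!     fn get(&self, ptr: BlobPointer) -> Option<&[u8]> {
//!         let start = ptr.offset as usize;
//!         let end = start.checked_add(ptr.len as usize)?;
//!         // Returns `None` if the pointer falls outside the log.
//!         self.log.get(start..end)
//!     }
//! }
//!
//! /// Index of small values only: each key maps to a pointer, not to content.
//! #[derive(Default)]
//! struct DocStore {
//!     index: HashMap<String, BlobPointer>,
//!     blobs: MemBlobLog,
//! }
//!
//! impl DocStore {
//!     fn insert(&mut self, key: &str, content: &[u8]) {
//!         let ptr = self.blobs.put(content);
//!         self.index.insert(key.to_string(), ptr);
//!     }
//!
//!     fn fetch(&self, key: &str) -> Option<&[u8]> {
//!         let ptr = *self.index.get(key)?;
//!         self.blobs.get(ptr)
//!     }
//! }
//! ```
//!
//! In this sketch, reorganizing `index` only moves 16-byte pointers, which is
//! the property the BlobDB approach relies on. A real implementation would
//! also need durability, checksums, and garbage collection of dead blobs.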
//!
//! See `docs/PERSISTENCE_DESIGN.md` for comprehensive design documentation.
//! See `docs/PERSISTENCE_DESIGN_DENSE.md` for dense retrieval specifics.
pub use PersistenceError;