//! BM25 index persistence: atomic snapshot save/load.
//!
//! All types and functions in this module are gated behind
//! `#[cfg(feature = "persistence")]`.
//!
//! ## On-disk layout
//!
//! ```text
//! <collection_dir>/
//!   bm25.snapshot   # Postcard-serialized [`Bm25Snapshot`]
//!   bm25.wal        # Write-ahead log (see [`bm25_persistence_wal`])
//! ```
//!
//! The snapshot captures the full in-memory state of the BM25 index
//! (documents, term frequencies, point/doc-id mappings, doc-count and
//! total length). The WAL captures mutations made after the most recent
//! snapshot. `load_snapshot` + `wal_replay` together restore the index
//! to its pre-shutdown state in O(snapshot) + O(WAL delta) time, which
//! replaces the prior O(N) payload-scan rebuild.
//!
//! ## Corruption handling
//!
//! `load_snapshot` returns `Ok(None)` only when the snapshot file is
//! absent (`NotFound`). Any other read error — including corrupt
//! bytes that fail postcard deserialization — surfaces as `Err`.
//! Silent fallback to the payload-rebuild path must be triggered by
//! the caller checking for `Ok(None)`; never by swallowing an `Err`.
//! See issue #618 for the Devin learning that motivates this
//! fail-fast contract.
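//!
//! A minimal sketch of this contract, using only `std`. The names
//! `load_sketch` and `decode_sketch` are hypothetical, and the magic-prefix
//! check merely stands in for postcard deserialization; the real
//! `load_snapshot` returns the crate's own error type rather than
//! `io::Error`.
//!
//! ```rust
//! use std::fs;
//! use std::io::{self, ErrorKind};
//! use std::path::Path;
//!
//! // Hypothetical decoder: accepts only bytes that start with "BM25".
//! fn decode_sketch(bytes: &[u8]) -> io::Result<Vec<u8>> {
//!     if bytes.starts_with(b"BM25") {
//!         Ok(bytes[4..].to_vec())
//!     } else {
//!         Err(io::Error::new(ErrorKind::InvalidData, "corrupt snapshot"))
//!     }
//! }
//!
//! // Absent file -> Ok(None); any other failure -> Err.
//! fn load_sketch(path: &Path) -> io::Result<Option<Vec<u8>>> {
//!     match fs::read(path) {
//!         Ok(bytes) => decode_sketch(&bytes).map(Some),
//!         Err(e) if e.kind() == ErrorKind::NotFound => Ok(None),
//!         Err(e) => Err(e),
//!     }
//! }
//! ```
//!
//! A caller following this shape would rebuild from payloads only on
//! `Ok(None)` and propagate every `Err` unchanged.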
use ;
use crate;
use crate;
use crate::storage::atomic_write;
/// Snapshot filename under a collection directory.
pub const BM25_SNAPSHOT_FILENAME: &str = "bm25.snapshot";
/// Returns the absolute path to the BM25 snapshot file under `dir`.
pub
/// Saves the BM25 index as an atomic snapshot under `dir/bm25.snapshot`.
///
/// Uses the `write-tmp-fsync-rename` pattern to guarantee that a crash
/// mid-save never leaves a torn snapshot file observable by the next
/// startup.
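///
/// The pattern can be sketched with `std` alone. This is an illustration
/// under stated assumptions, not the crate's shared helper:
/// `atomic_write_sketch` and the `.tmp` suffix are hypothetical, and the
/// sketch does not fsync the parent directory.
///
/// ```rust
/// use std::fs::{self, File};
/// use std::io::{self, Write};
/// use std::path::{Path, PathBuf};
///
/// // Hypothetical stand-in for an atomic-write helper.
/// fn atomic_write_sketch(path: &Path, bytes: &[u8]) -> io::Result<()> {
///     // Sibling temp file, so the rename stays on one filesystem.
///     let mut tmp_name = path.as_os_str().to_owned();
///     tmp_name.push(".tmp");
///     let tmp = PathBuf::from(tmp_name);
///
///     let mut file = File::create(&tmp)?;
///     file.write_all(bytes)?;
///     // Flush contents to disk before the new name becomes visible.
///     file.sync_all()?;
///     drop(file);
///
///     // Readers see either the old file or the complete new one.
///     fs::rename(&tmp, path)
/// }
/// ```
///
/// Because the final name appears only after the rename, a crash before
/// that step leaves at most a stray temp file, never a partial snapshot.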
///
/// # Errors
///
/// Returns [`Error::Index`] if serialization or disk I/O fails.
pub
/// Loads the BM25 index from `dir/bm25.snapshot` if present.
///
/// - Returns `Ok(None)` when the snapshot file does not exist (backward
///   compatibility: the caller should fall back to the payload-rebuild path).
/// - Returns `Err(Error::Index(..))` when the file exists but cannot
///   be read or deserialized. Corruption must surface loudly and never
///   silently fall back to rebuild, per the issue #618 learning.
///
/// # Errors
///
/// Returns [`Error::Index`] when the file exists but is unreadable or
/// contains corrupt bytes that fail postcard deserialization.
pub
// Atomic snapshot writes use the shared `crate::storage::atomic_write` helper
// (write-tmp + fsync + rename), so the crash-safety logic lives in one place.