//! # iqdb-persist
//!
//! On-disk persistence for the **iQDB** vector database. Provides atomic
//! snapshot save / load with a versioned file header and a CRC32 integrity
//! check, generic over any type implementing [`iqdb_index::Index`].
//!
//! ## Tiered API
//!
//! - **Tier 1 — the lazy path.** [`PersistConfig::new`] plus
//! [`PersistedIndex::open_with`] / [`PersistedIndex::save`] /
//! [`PersistedIndex::load`] cover the whole common case — wrap an index,
//! save it, load it back — with no builder and no generics to name
//! beyond the index type itself. With the WAL on,
//! [`PersistedIndex::insert`] / [`PersistedIndex::delete`] /
//! [`PersistedIndex::checkpoint`] add durable, crash-recoverable
//! mutation.
//! - **Tier 2 — the configured path.** The [`PersistConfig`] fields
//! ([`fsync_policy`](PersistConfig::fsync_policy),
//! [`compression`](PersistConfig::compression),
//! [`wal_enabled`](PersistConfig::wal_enabled)) tune durability and
//! on-disk size.
//! - **Tier 3 — the trait seam.** An index opts into persistence by
//! implementing [`Persistable`]; everything in Tier 1 and Tier 2 then
//! works against it unchanged.
//!
//! ## Surface
//!
//! - [`Persistable`] — the trait an index implements. Two methods,
//! `save_to(&mut dyn Write)` and `load_from(&mut dyn Read) -> Result<Self>`,
//! plus a stable [`INDEX_TYPE`](Persistable::INDEX_TYPE) tag. The impl
//! serializes **only** the index's self-contained payload; the file
//! header, CRC32, and atomic write are added by [`PersistedIndex`]
//! around it.
//! - [`PersistedIndex`] — wraps an `I: Index + Persistable`. Two honest
//! constructors: [`open_with`](PersistedIndex::open_with) wraps an
//! in-memory index for later [`save`](PersistedIndex::save);
//! [`load`](PersistedIndex::load) reconstructs an index from disk and
//! errors if the file does not exist.
//! - [`FileHeader`] + [`MAGIC`] + [`CURRENT_VERSION`] — the wire format.
//! - [`PersistConfig`] / [`FsyncPolicy`] / [`Compression`] — configuration.
//! `wal_enabled = true` turns on the write-ahead log (v0.3);
//! `Compression::Zstd|Lz4` compress the snapshot payload (v0.4, behind
//! the `zstd` / `lz4` cargo features — selecting a scheme whose feature
//! is off yields [`PersistError::Unsupported`]).
//! - [`PersistError`] — `#[non_exhaustive]` and `error_forge::ForgeError`-
//! integrated.
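//!
//! The surface above promises an atomic write without saying how it is
//! achieved. One common technique, shown here as a minimal std-only sketch
//! and not as this crate's actual implementation, writes to a temporary
//! sibling file, syncs it, and renames it over the target. The
//! `atomic_write` name and the `.tmp` suffix are assumptions made for
//! illustration.
//!
//! ```rust
//! use std::fs::{self, File};
//! use std::io::{self, Write};
//! use std::path::Path;
//!
//! /// Hypothetical helper: replace `path` with `bytes` via a single rename.
//! fn atomic_write(path: &Path, bytes: &[u8]) -> io::Result<()> {
//!     // Assumed naming scheme: the target path with a `.tmp` extension.
//!     let tmp = path.with_extension("tmp");
//!     {
//!         let mut file = File::create(&tmp)?;
//!         file.write_all(bytes)?;
//!         // Push file contents and metadata to disk before the rename.
//!         file.sync_all()?;
//!     }
//!     // Readers see either the old file or the new one, never a partial write.
//!     fs::rename(&tmp, path)
//! }
//! ```
//!
//! A fuller version would likely also sync the parent directory after the
//! rename; whether and when to do so is the kind of trade-off a setting such
//! as [`FsyncPolicy`] exists to express.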
//!
//! ## Three guards
//!
//! 1. The trait impl writes / reads **only** the index's self-contained
//! payload. Framing (header + CRC32) lives in [`PersistedIndex`].
//! 2. This crate stays generic over `I` — it never names a concrete
//! index. The `index_type` → concrete-type registry that
//! `Database::open` needs lives in the umbrella `iqdb` crate.
//! 3. Tests use a tiny in-crate mock `Persistable`; `iqdb-persist` never
//! dev-deps a concrete index crate.
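//!
//! Guard 1 separates the payload from its framing. The sketch below
//! illustrates that split with a bitwise CRC-32 (IEEE polynomial) wrapped
//! around an opaque payload. It is an illustration under assumptions: the
//! `frame` / `unframe` helpers and the trailing-checksum layout are
//! hypothetical and are not this crate's wire format, which stores the
//! checksum in [`FileHeader`].
//!
//! ```rust
//! /// Bitwise CRC-32 (reflected IEEE 802.3 polynomial 0xEDB88320).
//! fn crc32(bytes: &[u8]) -> u32 {
//!     let mut crc = 0xFFFF_FFFFu32;
//!     for &b in bytes {
//!         crc ^= u32::from(b);
//!         for _ in 0..8 {
//!             // All ones when the low bit is set, zero otherwise.
//!             let mask = (crc & 1).wrapping_neg();
//!             crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
//!         }
//!     }
//!     !crc
//! }
//!
//! /// Hypothetical framing: the payload followed by its little-endian CRC-32.
//! fn frame(payload: &[u8]) -> Vec<u8> {
//!     let mut out = payload.to_vec();
//!     out.extend_from_slice(&crc32(payload).to_le_bytes());
//!     out
//! }
//!
//! /// Returns the payload if the trailing checksum matches, else `None`.
//! fn unframe(framed: &[u8]) -> Option<&[u8]> {
//!     let split = framed.len().checked_sub(4)?;
//!     let (payload, tail) = framed.split_at(split);
//!     let mut stored = [0u8; 4];
//!     stored.copy_from_slice(tail);
//!     if crc32(payload) == u32::from_le_bytes(stored) {
//!         Some(payload)
//!     } else {
//!         None
//!     }
//! }
//! ```
//!
//! Because the payload is treated as opaque bytes, the framing code never
//! needs to know which index produced it, which is the property guards 1
//! and 2 protect.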
//!
//! ## Scope
//!
//! **Stable as of v1.0.0.** The full surface — atomic snapshots + CRC32,
//! the write-ahead log with replay and crash recovery, and optional Zstd /
//! LZ4 snapshot compression — is complete, the parse/recovery paths are
//! adversarially hardened, and the public API and on-disk format are frozen
//! under the SemVer 1.x guarantee (no breaking changes before 2.0). The
//! external `storage-io` substrate is deferred behind the internal storage
//! seam and is out of scope for 1.0 (see `dev/ROADMAP.md`). See
//! `CHANGELOG.md`.
//!
//! ## Example
//!
//! ```
//! use iqdb_persist::{FileHeader, CURRENT_VERSION, MAGIC};
//! use iqdb_types::DistanceMetric;
//!
//! // A header is just data; tools can inspect a snapshot file without
//! // loading the index it carries.
//! let header = FileHeader {
//!     magic: MAGIC,
//!     version: CURRENT_VERSION,
//!     index_type: "flat".to_string(),
//!     dim: 128,
//!     metric: DistanceMetric::Cosine,
//!     n_vectors: 1_000,
//!     crc32: 0,
//! };
//! assert_eq!(header.version, 2);
//! assert_eq!(&header.magic, b"IQDBPRST");
//! ```
use ;
use iqdb_index::Index;
pub use crate;
pub use crate;
pub use crate;
pub use cratePersistedIndex;
// `Storage` stays internal to the `storage` module — it is the
// substrate seam for the future `storage-io` swap, not a v0.2 public
// extension point.
/// The version of this crate, taken from `Cargo.toml` at compile time.
///
/// # Examples
///
/// ```
/// let v = iqdb_persist::VERSION;
/// assert_eq!(v.split('.').count(), 3);
/// ```
pub const VERSION: &str = env!("CARGO_PKG_VERSION");
/// An index that can be written to and read from a byte stream.
///
/// The two methods serialize the index's **self-contained payload**: the
/// vectors, the ids, and the metadata. They do **not** write the file
/// header or the CRC32 — that framing is added by [`PersistedIndex`]
/// around the payload. Keeping the impl payload-only is what lets the
/// wire format stay centralized in this crate and uniform across every
/// future index implementation.
///
/// ## On-disk format contract
///
/// [`INDEX_TYPE`](Persistable::INDEX_TYPE) is stamped into the file
/// header on save and matched on load. Once snapshot files exist on
/// real users' disks with a given tag, **renaming the tag is a breaking
/// format change** — treat it with the same care as the magic bytes.
/// [`CURRENT_VERSION`] is for evolving the wire format; the type tag is
/// identity, not version.
///
/// ## Self-describing payload
///
/// [`load_from`](Persistable::load_from) reconstructs `Self` from the
/// payload alone — no header is passed in. The payload MUST therefore be
/// self-describing: the impl should re-state any state the constructor
/// needs (typically `dim` and `metric` for a vector index) at the start
/// of the payload. [`PersistedIndex::load`] cross-checks the
/// payload-reconstructed `Self`'s [`dim`](iqdb_index::IndexCore::dim) /
/// [`metric`](iqdb_index::IndexCore::metric) /
/// [`len`](iqdb_index::IndexCore::len) against the header values and
/// returns an error on any mismatch. A header claiming `dim = 128` over a
/// payload that reports `96` marks a corrupted file, and this check catches
/// it the same way the CRC32 catches bit flips.
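///
/// The self-describing requirement can be sketched with a small payload
/// prefix. This is a minimal std-only illustration, not a layout any index
/// is required to use: the `write_prefix`, `read_prefix`, and `check_dim`
/// helpers, the `u64` width for `dim`, and the one-byte metric tag are all
/// assumptions.
///
/// ```rust
/// use std::io::{self, Read, Write};
///
/// /// Hypothetical prefix: `dim` as little-endian u64, then a one-byte metric tag.
/// fn write_prefix(w: &mut dyn Write, dim: u64, metric_tag: u8) -> io::Result<()> {
///     w.write_all(&dim.to_le_bytes())?;
///     w.write_all(&[metric_tag])
/// }
///
/// /// Reads back what `write_prefix` wrote, in the same order.
/// fn read_prefix(r: &mut dyn Read) -> io::Result<(u64, u8)> {
///     let mut dim = [0u8; 8];
///     r.read_exact(&mut dim)?;
///     let mut tag = [0u8; 1];
///     r.read_exact(&mut tag)?;
///     Ok((u64::from_le_bytes(dim), tag[0]))
/// }
///
/// /// Cross-check in the spirit of the header/payload comparison described above.
/// fn check_dim(header_dim: u64, payload_dim: u64) -> Result<(), String> {
///     if header_dim == payload_dim {
///         Ok(())
///     } else {
///         Err(format!("header dim {} != payload dim {}", header_dim, payload_dim))
///     }
/// }
/// ```
///
/// An impl following this shape would call something like `write_prefix`
/// first in `save_to` and `read_prefix` first in `load_from`, so the
/// constructor arguments are known before any vectors are decoded.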
///
/// # Examples
///
/// ```no_run
/// use std::io::{Read, Write};
///
/// use iqdb_index::{Index, IndexCore, IndexStats};
/// use iqdb_persist::{Persistable, Result};
/// use iqdb_types::{
///     DistanceMetric, Hit, Metadata, Result as IqdbResult, SearchParams, VectorId,
/// };
///
/// # struct DummyIndex { dim: usize, metric: DistanceMetric }
/// # impl IndexCore for DummyIndex {
/// # fn insert(&mut self, _: VectorId, _: std::sync::Arc<[f32]>, _: Option<Metadata>) -> IqdbResult<()> { Ok(()) }
/// # fn delete(&mut self, _: &VectorId) -> IqdbResult<()> { Ok(()) }
/// # fn search(&self, _: &[f32], _: &SearchParams) -> IqdbResult<Vec<Hit>> { Ok(Vec::new()) }
/// # fn len(&self) -> usize { 0 }
/// # fn dim(&self) -> usize { self.dim }
/// # fn metric(&self) -> DistanceMetric { self.metric }
/// # fn flush(&mut self) -> IqdbResult<()> { Ok(()) }
/// # fn stats(&self) -> IndexStats { IndexStats { index_type: "dummy", ..IndexStats::default() } }
/// # }
/// # impl Index for DummyIndex {
/// # type Config = ();
/// # fn new(dim: usize, metric: DistanceMetric, _: ()) -> IqdbResult<Self> { Ok(Self { dim, metric }) }
/// # }
/// impl Persistable for DummyIndex {
///     const INDEX_TYPE: &'static str = "dummy";
///     fn save_to(&self, _w: &mut dyn Write) -> Result<()> { Ok(()) }
///     fn load_from(_r: &mut dyn Read) -> Result<Self> {
///         Ok(DummyIndex { dim: 1, metric: DistanceMetric::Cosine })
///     }
/// }
/// ```