1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
// SPDX-License-Identifier: Apache-2.0
// Copyright (c) 2024-present, fjall-rs
// Copyright (c) 2026-present, Structured World Foundation
// no-std foundation: when the `std` feature is OFF the crate root opts into
// `no_std`. Default builds keep `std` enabled (file I/O, threading, system
// clock all live in `std`), so existing consumers see no behaviour change.
// The migration to a fully no-std-clean build is incremental. Two patterns
// coexist while the port is in progress:
// 1. Modules with std-only dependencies that have no consumers above the
// `fs` / `version` / `tree` layer stay gated behind `#[cfg(feature =
// "std")]` and are ported in isolation.
// 2. Modules whose std-only dependencies cascade through every consumer
// (`manifest_blocks`, vendored `sfa`, others noted at their `pub mod`
// site) remain UNCONDITIONAL today — gating them would require
// `#[cfg]` annotations on dozens of call sites without changing the
// no-std error count, because those call sites are themselves
// std-bound for unrelated reasons. They migrate in lockstep with the
// consumer layer rather than ahead of it.
// The CI job `no-std-check` exercises `cargo check --no-default-features
// --features alloc` against a real no-std target (`thumbv7em-none-eabihf`)
// and tracks remaining work via the error count, which must monotonically
// decrease across PRs.
//! Embedded LSM-tree storage engine.
//!
//! Provides keyed point reads, prefix and range scans, MVCC snapshots, block
//! and file-descriptor caching, and a configurable compaction subsystem. No
//! write-ahead log — durability is the caller's responsibility (`flush_active_memtable`
//! forces persistence when needed).
//!
//! ## Highlights
//!
//! - **AMQ filter**: `BuRR` (Bumped Ribbon Retrieval, Walzer & Dillinger 2022) for
//! per-key and per-prefix membership checks. ~30% smaller filter blocks than a
//! same-FPR Bloom filter, or ~10× tighter FPR at the same memory budget.
//! - **Compression**: pure-Rust zstd (incl. dictionary mode), LZ4, or none —
//! per-table and per-level policy.
//! - **Encryption at rest**: AES-256-GCM block encryption with a caller-supplied
//! key.
//! - **Range tombstones**: `delete_range` / `delete_prefix` (SST-encoded; the
//! feature was added in disk format V4 and remains supported in the current
//! V5 format — the V5 break extends the block header for per-block Reed-
//! Solomon Page ECC, not the tombstone encoding).
//! - **Merge operators**: commutative-merge LSM operations with lazy resolution.
//! - **K/V separation (`BlobTree`)**: large-value workloads with automatic GC.
//! - **Pluggable `Fs`**: standard, in-memory, `io_uring`, or custom backends.
//! - **MVCC**: snapshot reads at a chosen `SeqNo`, custom `UserComparator`.
//! - **Concurrency**: thread-safe `BTreeMap`-like API.
//!
//! Keys: up to 65,535 bytes (`u16` length field). Values: up to 4,294,967,295
//! bytes (`u32` length field, `2³² − 1`). Larger keys and values
//! carry a proportional performance cost.
//!
//! ## Quick start
//!
//! ```no_run
//! use lsm_tree::{AbstractTree, Config, SequenceNumberCounter};
//!
//! let folder = tempfile::tempdir().unwrap();
//! let seqno = SequenceNumberCounter::default();
//! let tree = Config::new(&folder, seqno.clone(), SequenceNumberCounter::default())
//! .open()
//! .unwrap();
//!
//! tree.insert("key", "value", seqno.next());
//! let value = tree.get("key", lsm_tree::SeqNo::MAX).unwrap();
//! assert_eq!(value.map(|v| v.to_vec()), Some(b"value".to_vec()));
//! ```
//!
//! ## On-disk format
//!
//! Current version: **V5**. V5 introduces the `BuRR` filter wire format,
//! per-block Reed-Solomon Page ECC, and per-entry (per-KV) checksum
//! footers (collapsed into the same version because V5 had not shipped
//! when they landed): the self-describing block types (`Meta` / `Manifest` /
//! `ManifestFooter`) gain a `block_flags` byte whose `ECC_PARITY` bit marks a
//! parity trailer and whose `KV_CHECKSUM_FOOTER` bit marks a per-entry
//! checksum footer. SST block types (`Data` / `Index` / `Filter` /
//! `RangeTombstone`) keep the compact header WITHOUT that byte and derive parity / footer
//! presence from the per-SST meta descriptors (`descriptor#page_ecc`,
//! `descriptor#kv_checksum`). The block magic is bumped so a pre-V5 reader
//! rejects V5 blocks immediately at header decode.
//! A V5 SST written with every optional transform off (no Page ECC, no per-KV
//! footers) is still NOT byte-identical to a pre-V5 table — the bumped block
//! magic and the per-SST meta descriptor keys always differ. Any
//! "byte-identical when off" guarantees in the feature docs are within-V5 and
//! payload-level (e.g. index entries when `seqno_in_index = false`), not a
//! cross-version equivalence.
//! V3-V4 databases are not readable by this version and vice versa. The
//! manifest version gate rejects pre-V5 databases at `Tree::open` time.
//! V4 introduced range tombstones (still supported).
// The `#[test_log::test]` attribute macro expands to a body that places
// `use` items after statements, which trips `items_after_statements` in
// the test build (`clippy --all-targets`). The lint fires only on the
// macro's generated code, not on anything hand-written, so allow it
// crate-wide rather than annotating every instrumented test.
// Test fixtures routinely bind closely-named locals (`dict_a` / `dict_b`,
// `tree_1` / `tree_2`) where the similarity is the point; `similar_names`
// is pedantic noise on that style.
// Long scenario tests (fuzz harnesses, multi-phase integration cases)
// legitimately run past the `too_many_lines` ceiling; splitting them
// would obscure the scenario. The lint is only reached under
// `--all-targets`, which lints test bodies.
// `alloc` is the minimal hard dependency — the crate uses `Arc`, `Vec`,
// `Box`, and other heap types throughout. `extern crate alloc` makes the
// `alloc` crate root visible to `no_std` builds; under `std` it is a
// no-op alias because the standard library re-exports the same types.
extern crate alloc;
// f64 ceil / round: the native `f64` methods are std-only (they bind to
// platform float intrinsics), so the `no_std` build routes through `libm`.
// std keeps the native path (equal or faster); the results are identical.
pub
pub
pub
pub
pub
pub
pub
pub
pub
pub
// `hashbrown` (not `std::collections`) so the crate-wide map / set aliases
// compile on `no_std + alloc`. hashbrown IS the implementation std's HashMap
// wraps, so the API and performance match; using it directly just drops the
// std dependency. Hasher stays `FxHasher` (fast, non-DoS — internal keys are
// not attacker-controlled).
pub type HashMap<K, V> = HashMap;
pub type HashSet<K> = HashSet;
pub
// `checkpoint` is `pub(crate)`: it contains internal helpers
// (`link_or_copy_cross_fs`, `prepare_target`, `run_checkpoint`) used by
// `Tree::create_checkpoint` and `BlobTree::create_checkpoint`. Exposing
// it via `pub` (even with `#[doc(hidden)]`) would lock the helpers into
// the stable surface; tests that need to exercise them live inline as
// unit tests inside `src/checkpoint.rs`.
pub
// Vendored, `no_std`-ported `byteview` (backs `Slice`); kept in-tree so the
// engine carries no external dependency that fails to compile on `no_std`.
/// In-tree sharded S3-FIFO cache backing `cache` and `descriptor_table`
/// (replaces `quick_cache`; works on `std` and `no_std + alloc`).
/// Block-level encryption at rest.
/// Configuration
/// Shard-based Page ECC (XOR single-parity and Reed-Solomon).
///
/// Gated behind the `page_ecc` cargo feature so the
/// `reed-solomon-simd` dependency is only pulled in when the feature
/// is enabled.
/// Pluggable filesystem abstraction for I/O backends.
/// Local I/O trait surface mirroring `std::io::{Read, Write, Seek}`.
///
/// Provides `Error` / `ErrorKind` / `SeekFrom` plus the three trait
/// definitions so the bounds on the [`fs`] traits no longer carry
/// `std::io::*` directly. Under the `std` feature, supertrait
/// aliases + blanket impls forward to `std::io` types so existing
/// std-backed backends satisfy the trait surface automatically; the
/// alias form also propagates BACK to `std::io`, so a `dyn FsFile`
/// bounded on `crate::io::Read` still flows into `std::io::BufReader`,
/// `byteorder`, and friends.
///
/// Scope: this prerequisite slice (per #311) lifts only
/// `Read`/`Write`/`Seek` out of the [`fs`] trait bounds. The
/// `io::Result<T>` return types and `&Path` argument types in
/// `fs::Fs` / `fs::FsFile` still resolve to `std::io::Result<T>` and
/// `std::path::Path` and migrate in follow-up commits; the full
/// `--no-default-features --features alloc` build of the `fs::*`
/// surface arrives once those two follow-ups land.
pub
// Vendored sfa is std-only internally (`std::io` / `std::fs` /
// `std::path`). Unconditional for the same cascading reason as
// `manifest_blocks` above: ~20 consumers across the table /
// blob-file / inspect / verify / checkpoint paths reference sfa
// types in unconditional code. Gating sfa alone explodes the
// `no-std-check` error count via unresolved-module failures on
// every consumer that hasn't been gated yet. Migration is the
// whole std-bound layer at once (tracked as issue #358), not sfa
// in isolation.
pub
// mod multi_reader;
/// Runtime-toggleable configuration (`RuntimeConfig` + atomic-swap handle).
/// Disaster-recovery: rebuild a missing/corrupt manifest from on-disk SSTs.
// std-only: scans table folders and rewrites the manifest via std::fs.
pub
pub
pub
/// Single-error-correct / double-error-detect word codecs for the Page ECC
/// read path (pluggable SEC-DED shapes; default Hsiao `(72, 64)`).
///
/// Gated behind `page_ecc` and crate-internal: the codec only runs on the
/// Page ECC recovery path. Trailer sizing on the read path uses a plain
/// `ceil(len / 8)` so reading a SEC-DED SST does not require this module.
pub
pub use Clock;
pub use SystemClock;
pub use set_clock;
/// Utility functions
/// Integrity verification for SST and blob files.
///
/// Gated behind `std` because the verify pipeline uses `std::fs`,
/// `std::path::Path`, `std::io::{Read, Seek}` everywhere — full-file
/// hashing and per-block walking both need real filesystem I/O. Once
/// the `Fs`/`FsFile` trait surface is no-std-friendly (tracked under
/// the no-std migration), the gate can be relaxed.
/// Out-of-band inspection of a single SST file.
///
/// Public read-only view of stored metadata (table id, key range,
/// counts, compression, timestamp) without spinning up a `Tree`.
/// Used by `sst-dump properties` and similar diagnostic tools. See
/// the module docs for the recovery semantics (mirrors
/// `Table::recover`'s TAIL-first / MID-fallback path from #295).
/// ECC patrol scrub: a proactive sweep over Page-ECC-protected SST blocks.
///
/// Reads blocks to detect and correct latent bit-rot before it accumulates
/// past the parity budget. Std-gated for the same reason as [`verify`]: the
/// sweep needs real filesystem I/O and thread-based parallelism.
/// User defined key (byte array)
pub type UserKey = Slice;
/// User defined data (byte array)
pub type UserValue = Slice;
/// KV-tuple (key + value)
pub type KvPair = ;
// The `#[doc(hidden)]` block below re-exports crate internals that are reachable
// at the crate root for benchmarks and integration tests, but are NOT part of the
// public API contract — they carry no semver guarantee and may be renamed, moved,
// or removed without a major version bump. External callers that import these
// hidden items do so at their own risk. `cargo doc` excludes them from generated
// rustdoc output; only intra-crate test/bench code is expected to use them.
pub use ;
pub use ;
pub use EncryptionProvider;
pub use Aes256GcmProvider;
pub use BackgroundDeleter;
pub use PinnableSlice;
pub use RepairReport;
pub use WriteBatch;
pub use ;
pub use ;
pub use ZstdDictionary;
pub use Metrics;