//! # `Batch::commit_grouped` — atomic-batch fsync
//!
//! 0.9.3 added a grouped-commit variant of `Batch::commit` that
//! amortises **parent-directory `fsync`** across the entire batch
//! instead of paying one per op.
//!
//! Each op still does its own data fsync (the temp file's content is
//! durable before its rename — that's the atomic-replace contract).
//! What `commit_grouped` changes is the **post-rename** `sync_parent_dir`
//! step: the regular path calls it once per op, the grouped path
//! accumulates unique parent directories and calls it once per unique
//! parent after the entire batch succeeds.
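//!
//! The accumulate-then-sync step described above can be sketched with
//! the standard library alone. This is a minimal illustration under
//! stated assumptions, not this crate's implementation: the names
//! `unique_parents`, `sync_dir`, and `sync_parents_grouped` are
//! hypothetical, and the sketch assumes every rename in the batch has
//! already succeeded.
//!
//! ```rust
//! use std::collections::BTreeSet;
//! use std::fs::File;
//! use std::io;
//! use std::path::{Path, PathBuf};
//!
//! // Hypothetical helper: collect the distinct parent directories of
//! // the batch's target paths.
//! fn unique_parents(targets: &[PathBuf]) -> BTreeSet<PathBuf> {
//!     targets
//!         .iter()
//!         .filter_map(|t| t.parent())
//!         .map(|p| {
//!             // A bare file name has an empty parent; treat it as ".".
//!             if p.as_os_str().is_empty() {
//!                 PathBuf::from(".")
//!             } else {
//!                 p.to_path_buf()
//!             }
//!         })
//!         .collect()
//! }
//!
//! // Hypothetical helper: on Unix, opening a directory read-only and
//! // calling `sync_all` issues an fsync on the directory.
//! #[cfg(unix)]
//! fn sync_dir(dir: &Path) -> io::Result<()> {
//!     File::open(dir)?.sync_all()
//! }
//!
//! // Assumption for this sketch: nothing to do on other platforms.
//! #[cfg(not(unix))]
//! fn sync_dir(_dir: &Path) -> io::Result<()> {
//!     Ok(())
//! }
//!
//! // Sync each unique parent once and report how many syncs were paid.
//! fn sync_parents_grouped(targets: &[PathBuf]) -> io::Result<usize> {
//!     let parents = unique_parents(targets);
//!     for dir in &parents {
//!         sync_dir(dir)?;
//!     }
//!     Ok(parents.len())
//! }
//! ```
//!
//! With 1024 targets that share one directory, `unique_parents` yields
//! a single entry, so the loop in `sync_parents_grouped` runs once.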
//!
//! For "flush 1024 SST files into one directory", `commit` pays 1024
//! parent-dir syncs and `commit_grouped` pays **1**. On Linux/macOS
//! this is typically the largest saving available for bulk-load
//! workloads; on Windows `commit_grouped` is observably equivalent to
//! `commit` (directory durability is implicit there).
//!
//! ## Trade-off
//!
//! `commit` makes each op individually durable (data + dirent) on
//! return. `commit_grouped` makes the batch durable **as a set**: until
//! the final parent-directory syncs complete, a crash may leave only
//! some of the renames persisted. Each op's data is still on disk
//! (its data fsync still happened), and renames already recorded in
//! the filesystem journal are replayed on the next mount, but a rename
//! that had not reached the journal may be missing after recovery.
//!
//! ## When to use this pattern
//!
//! - Bulk loads (initial database population, ETL imports)
//! - SST flushes (LSM-tree compaction emitting many files)
//! - Database checkpoint emissions (many files committed as one
//!   logical checkpoint)
//!
//! Any workload where **the batch is the durability unit**, not the
//! individual op.
//!
//! ## When NOT to use this pattern
//!
//! Callers that need per-op dirent durability should use `commit()`
//! instead. This is rare, and usually applies only when each op is
//! itself a transaction commit point.
//!
//! Run: `cargo run --example 19_batch_commit_grouped`
use std::path::PathBuf;