Parallel commit pipeline for pre-allocated ID ranges.
Replaces the serial commit loop with a four-phase pipeline:
- Phase 2: Count + range assignment via prefix sums
- Phase 3: Parallel commit into disjoint pre-allocated ranges
- Phase 4: String dedup, remap, index build, edge bulk insert
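The range assignment in Phase 2 can be illustrated with a small exclusive prefix sum. This is a minimal sketch, not the crate's implementation: `FileCounts`, `PlannedRanges`, `RunningOffsets`, and `assign_ranges` are hypothetical names chosen for illustration, and the use of `u32` IDs is an assumption.

```rust
use std::ops::Range;

/// Hypothetical per-file counts gathered by the counting pass.
#[derive(Debug, Clone, Copy)]
pub struct FileCounts {
    pub nodes: u32,
    pub strings: u32,
}

/// Hypothetical per-file plan: the ID ranges reserved for one file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlannedRanges {
    pub node_ids: Range<u32>,
    pub string_ids: Range<u32>,
}

/// Hypothetical running offsets carried from one chunk to the next.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct RunningOffsets {
    pub next_node: u32,
    pub next_string: u32,
}

/// Exclusive prefix sum over the per-file counts. Each file's range starts
/// where the previous file's range ended, so the ranges are disjoint and
/// depend only on file order, not on thread scheduling.
pub fn assign_ranges(counts: &[FileCounts], offsets: &mut RunningOffsets) -> Vec<PlannedRanges> {
    let mut plans = Vec::with_capacity(counts.len());
    for c in counts {
        let node_start = offsets.next_node;
        let string_start = offsets.next_string;
        offsets.next_node += c.nodes;
        offsets.next_string += c.strings;
        plans.push(PlannedRanges {
            node_ids: node_start..offsets.next_node,
            string_ids: string_start..offsets.next_string,
        });
    }
    plans
}
```

Passing the same `RunningOffsets` value to the next chunk continues the numbering where the previous chunk stopped, which is one way to keep ID assignment deterministic across chunks.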
§Phase 3 Architecture
Phase 3 uses split_at_mut to carve disjoint sub-slices from pre-allocated
arena and interner ranges, then uses rayon to commit each file’s staging
graph in parallel without locks:
    NodeArena slots: [ file0 | file1 | file2 ]
    StringInterner:  [ file0 | file1 | file2 ]
                         ↑       ↑       ↑
                 split_at_mut split_at_mut remainder

Each file’s commit_single_file receives its own disjoint slices and
operates independently without contention.
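The carving step can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the module's code: `carve`, `commit_file`, and `parallel_commit` are hypothetical names, the slot types (`u32` and `String`) are placeholders, and `std::thread::scope` stands in for rayon so the snippet has no external dependencies.

```rust
use std::thread;

/// Splits `slots` into consecutive disjoint mutable sub-slices, one per
/// entry in `lens`. Panics if the lengths exceed the slice, as
/// `split_at_mut` does. Any slots left after the last length are unused.
pub fn carve<'a, T>(mut slots: &'a mut [T], lens: &[usize]) -> Vec<&'a mut [T]> {
    let mut parts = Vec::with_capacity(lens.len());
    for &len in lens {
        // Take ownership of the remaining slice so both halves keep lifetime 'a.
        let rest = std::mem::take(&mut slots);
        let (head, tail) = rest.split_at_mut(len);
        parts.push(head);
        slots = tail;
    }
    parts
}

/// Placeholder per-file commit: writes only into the slices it was given
/// and returns the number of slots written.
fn commit_file(file_idx: u32, node_slots: &mut [u32], string_slots: &mut [String]) -> usize {
    for slot in node_slots.iter_mut() {
        *slot = file_idx;
    }
    for (i, s) in string_slots.iter_mut().enumerate() {
        *s = format!("file{file_idx}:str{i}");
    }
    node_slots.len() + string_slots.len()
}

/// Commits every file on its own scoped thread. No locks are needed because
/// the borrow checker proves the sub-slices do not overlap.
/// Assumes `node_lens` and `string_lens` have one entry per file.
pub fn parallel_commit(
    nodes: &mut [u32],
    strings: &mut [String],
    node_lens: &[usize],
    string_lens: &[usize],
) -> Vec<usize> {
    let node_parts = carve(nodes, node_lens);
    let string_parts = carve(strings, string_lens);
    thread::scope(|scope| {
        let handles: Vec<_> = node_parts
            .into_iter()
            .zip(string_parts)
            .enumerate()
            .map(|(i, (n, s))| scope.spawn(move || commit_file(i as u32, n, s)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("commit thread panicked"))
            .collect::<Vec<usize>>()
    })
}
```

Returning the per-file written counts mirrors the idea of validating totals after the parallel step. With rayon, the spawn-and-join loop would typically become a parallel iterator over the zipped parts.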
Structs§
- ChunkCommitPlan - Plan for parallel commit of a single chunk.
- FilePlan - Per-file commit plan with pre-assigned ID ranges.
- GlobalOffsets - Running offsets carried across chunks for deterministic ID assignment.
- Phase3Result - Phase 3 result: per-file edges and total written counts for validation.
Functions§
- compute_commit_plan - Compute commit plan from parsed files using prefix-sum range assignment.
- pending_edges_to_delta - Convert per-file PendingEdge collections to per-file DeltaEdge collections with monotonically increasing sequence numbers.
- phase2_assign_ranges - Execute Phase 2: count + range assignment for a parsed chunk.
- phase3_parallel_commit - Execute Phase 3: parallel commit into disjoint pre-allocated ranges.
- phase4_apply_global_remap - Apply global string dedup remap to all nodes in the arena and all pending edges.
- remap_edge_kind_string_ids - Exhaustive remap of all StringId fields in an EdgeKind.
- remap_node_entry_global - Apply global string dedup remap to all StringId fields in a NodeEntry.
- remap_option_string_id - Remap an optional StringId using the dedup remap table.
- remap_string_id - Remap a required StringId using the dedup remap table.
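The remap helpers listed above follow a common pattern that can be sketched as below. Everything here is hypothetical: the `StringId` newtype layout, the `HashMap`-based `Remap` table, the `Node` and `Edge` types, and the rule that IDs missing from the table are already canonical are all assumptions made for illustration, not the crate's actual definitions or signatures.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the interner's string ID.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct StringId(pub u32);

/// Hypothetical dedup remap table: duplicate ID -> canonical ID.
/// Assumption: IDs absent from the table are already canonical.
pub type Remap = HashMap<StringId, StringId>;

/// Remap a required ID in place.
pub fn remap_required(id: &mut StringId, remap: &Remap) {
    if let Some(&canonical) = remap.get(id) {
        *id = canonical;
    }
}

/// Remap an optional ID in place; `None` is left untouched.
pub fn remap_optional(id: &mut Option<StringId>, remap: &Remap) {
    if let Some(inner) = id.as_mut() {
        remap_required(inner, remap);
    }
}

/// Hypothetical node with one required and one optional string field.
pub struct Node {
    pub name: StringId,
    pub qualified_name: Option<StringId>,
}

/// Hypothetical edge kind; only some variants carry string IDs.
pub enum Edge {
    Calls,
    Imports { alias: Option<StringId> },
    References { symbol: StringId },
}

pub fn remap_node(node: &mut Node, remap: &Remap) {
    // Destructuring every field means adding a field to `Node` fails to
    // compile here until the new field is handled.
    let Node { name, qualified_name } = node;
    remap_required(name, remap);
    remap_optional(qualified_name, remap);
}

pub fn remap_edge(edge: &mut Edge, remap: &Remap) {
    // No wildcard arm: a new variant forces this match to be updated,
    // which is what keeps the remap exhaustive.
    match edge {
        Edge::Calls => {}
        Edge::Imports { alias } => remap_optional(alias, remap),
        Edge::References { symbol } => remap_required(symbol, remap),
    }
}
```

Avoiding a catch-all `_` arm is the design choice that matters here: a string field that is silently skipped would keep pointing at a duplicate ID after dedup.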