Parallel file walker for Layer 0 static analysis.
Uses ignore::WalkParallel (same engine as ripgrep) for parallel,
gitignore-aware directory traversal. Results stream to the caller via an
mpsc channel so downstream parsing can start before the walk completes.
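The streaming hand-off described above can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `walk_channel_sketch` and `count_rust_files` are hypothetical names, and a plain `Vec<PathBuf>` stands in for the real directory traversal.

```rust
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the walker: sends each path as soon as it is
/// "discovered" and hands the receiving end back to the caller immediately.
fn walk_channel_sketch(paths: Vec<PathBuf>) -> mpsc::Receiver<PathBuf> {
    let (tx, rx) = mpsc::channel();
    // The producer runs on its own thread, so the caller is never blocked.
    thread::spawn(move || {
        for path in paths {
            // A send error means the receiver was dropped; stop producing.
            if tx.send(path).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });
    rx
}

/// Consumer side: handles entries as they arrive instead of waiting for
/// the whole walk to finish.
fn count_rust_files(rx: mpsc::Receiver<PathBuf>) -> usize {
    rx.iter()
        .filter(|p| p.extension().map_or(false, |ext| ext == "rs"))
        .count()
}
```

Because the receiver is returned before the producer thread finishes, the consumer can begin work on the first entry while later entries are still being produced.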
§Architecture
Walker::walk_channel()
  │
  ├── spawns std::thread (WalkParallel::visit is blocking/sync)
  │     │
  │     ├── VisitorBuilder::build(): one FileVisitor per worker thread
  │     │
  │     └── FileVisitor::visit(): per-entry filtering + local buffering
  │           │
  │           └── flush every FLUSH_THRESHOLD entries → mpsc::Sender
  │               Drop flush handles tail entries
  │
  └── returns mpsc::Receiver<WalkedFile> (parser consumes while walk runs)
Walker::walk() is a thin wrapper: collect channel → sort → Vec<WalkedFile>
§Why thread-local buffering?
mpsc::Sender::send() acquires an internal lock on every call. With 8
threads and 80k files, 80k individual sends cost roughly 16 ms of
contention overhead. Flushing every FLUSH_THRESHOLD entries reduces the
number of sends to about 2,500, cutting that overhead to roughly 500 µs
while still giving the receiver batches early enough for meaningful parse
pipelining.
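The buffering scheme can be sketched as a small per-thread wrapper around the sender. This is an assumption-laden illustration: `BatchingSender` is a hypothetical name, the channel here carries `Vec<T>` batches, and the threshold of 32 is only inferred from the figures above (80,000 / 2,500 = 32), not taken from the source.

```rust
use std::sync::mpsc;

/// Hypothetical threshold: 80_000 / 32 = 2_500 matches the figures quoted
/// above, but the real value of FLUSH_THRESHOLD is not stated on this page.
const FLUSH_THRESHOLD: usize = 32;

/// Per-thread buffer that sends whole batches instead of single entries.
struct BatchingSender<T> {
    buf: Vec<T>,
    tx: mpsc::Sender<Vec<T>>,
}

impl<T> BatchingSender<T> {
    fn new(tx: mpsc::Sender<Vec<T>>) -> Self {
        Self { buf: Vec::with_capacity(FLUSH_THRESHOLD), tx }
    }

    /// Buffers one entry locally; touches the channel only when full.
    fn push(&mut self, item: T) {
        self.buf.push(item);
        if self.buf.len() >= FLUSH_THRESHOLD {
            self.flush();
        }
    }

    /// Sends the buffered entries as one batch, leaving the buffer empty.
    fn flush(&mut self) {
        if self.buf.is_empty() {
            return;
        }
        let batch = std::mem::take(&mut self.buf);
        // Ignore the error: it only means the receiver has gone away.
        let _ = self.tx.send(batch);
    }
}

impl<T> Drop for BatchingSender<T> {
    /// Flushes the tail entries that never reached the threshold.
    fn drop(&mut self) {
        self.flush();
    }
}
```

Pushing 70 entries through this sketch yields three sends (32, 32, and a final 6 from the `Drop` flush) rather than 70, which is the same trade the paragraph above describes at a larger scale.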
Structs§
- WalkedFile
- A single file discovered by the walker.
- Walker
- Parallel, gitignore-aware file walker.
Enums§
- Language
- Programming language detected from file extension.
Constants§
- DEFAULT_MAX_FILE_SIZE
- Default maximum file size accepted by the walker (bytes). Files larger than this are silently skipped; they are almost always generated artefacts (minified JS, compiled output) not worth parsing.
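A size check of this kind might look like the following sketch. The function name `within_size_limit` and the 1 MiB constant are hypothetical; the actual value of DEFAULT_MAX_FILE_SIZE is not given on this page.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical limit for illustration only (1 MiB); the crate's actual
/// DEFAULT_MAX_FILE_SIZE value is not shown on this page.
const MAX_FILE_SIZE_SKETCH: u64 = 1024 * 1024;

/// Returns Ok(true) when the file is small enough to be worth parsing.
/// Files strictly larger than `max_bytes` are rejected.
fn within_size_limit(path: &Path, max_bytes: u64) -> io::Result<bool> {
    Ok(fs::metadata(path)?.len() <= max_bytes)
}
```

Reading only the metadata keeps the check cheap, since the file contents are never opened for entries that end up skipped.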
Functions§
- detect_language
- Detect programming language from file extension.
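Extension-based detection can be sketched as a simple match on the lowercased extension. The names `LanguageSketch` and `detect_language_sketch`, the variants, and the `Option` return type are all assumptions for illustration; the real `Language` enum and the signature of `detect_language` are not shown on this page.

```rust
use std::path::Path;

/// Hypothetical subset of languages; the real `Language` enum's variants
/// are not listed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LanguageSketch {
    Rust,
    Python,
    TypeScript,
}

/// Maps a file extension to a language, ignoring ASCII case.
/// Returns None for missing, non-UTF-8, or unrecognised extensions.
fn detect_language_sketch(path: &Path) -> Option<LanguageSketch> {
    let ext = path.extension()?.to_str()?;
    match ext.to_ascii_lowercase().as_str() {
        "rs" => Some(LanguageSketch::Rust),
        "py" => Some(LanguageSketch::Python),
        "ts" | "tsx" => Some(LanguageSketch::TypeScript),
        _ => None,
    }
}
```

Working from the extension alone avoids opening the file, at the cost of missing extensionless files such as `Makefile`.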