codewalk
Walk a directory tree. Skip binaries, respect .gitignore, memory-map large files. Parallel mode for scanning big codebases.
use ;
let walker = new;
for entry in walker.walk
What the defaults do
Out of the box, codewalk skips:
- Binary files (detected by magic bytes, not just extension)
- Hidden files and directories
- Common junk directories: node_modules, .git, target, __pycache__, vendor, .venv, Pods
- Files over 10MB
It respects .gitignore rules automatically.
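As a rough illustration of the defaults above, the path-based and size-based rules can be sketched with the standard library alone. `should_skip`, `MAX_FILE_SIZE`, and `JUNK_DIRS` are hypothetical names for this sketch, not codewalk's API, and .gitignore matching and binary detection are deliberately left out.

```rust
use std::path::Path;

// Hypothetical constants mirroring the defaults listed above.
const MAX_FILE_SIZE: u64 = 10 * 1024 * 1024; // the 10MB limit
const JUNK_DIRS: &[&str] = &[
    "node_modules", ".git", "target", "__pycache__", "vendor", ".venv", "Pods",
];

/// Returns true if the file is over the size limit, or if any path
/// component is hidden (starts with a dot) or is a known junk directory.
fn should_skip(path: &Path, size: u64) -> bool {
    if size > MAX_FILE_SIZE {
        return true;
    }
    path.components().any(|c| {
        let lossy = c.as_os_str().to_string_lossy();
        let name: &str = &lossy;
        // "." and ".." are navigation components, not hidden entries.
        let hidden = name.starts_with('.') && name != "." && name != "..";
        hidden || JUNK_DIRS.contains(&name)
    })
}
```

With these rules, `project/node_modules/lib/index.js` and `project/.env` are skipped, while `./src/main.rs` is kept as long as it is 10MB or smaller.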
Why not walkdir or ignore?
walkdir gives you paths. ignore gives you paths respecting gitignore. Neither reads file content, detects binary files by magic bytes, or memory-maps large files. If you're building a security scanner or code analyzer, you need all three: walk, skip binaries, read content efficiently. codewalk does that in one call. Without it you're stacking walkdir + a binary detector + a gitignore parser + an mmap wrapper + size limits. codewalk is that stack, tested and ready.
Configuration
Override any default via struct fields or TOML:
= 1048576
= true
= false
= true
= false
= ["rs", "py", "js"]
= ["node_modules", ".git"]
let config = from_toml.unwrap;
Parallel walking
For large codebases, walk on multiple threads:
let rx = walker.walk_parallel;
for entry in rx
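The `rx` name suggests results arrive over a channel, though that is an assumption rather than something stated here. A minimal sketch of that pattern using only `std::sync::mpsc` and `std::thread` follows; `process_parallel` is a hypothetical stand-in that streams strings, not codewalk's implementation.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical stand-in for a parallel walk: splits `items` across up to
/// `threads` workers and streams each item back over a channel.
fn process_parallel(items: Vec<String>, threads: usize) -> Receiver<String> {
    let (tx, rx) = mpsc::channel();
    let threads = threads.max(1);
    // Ceiling division so every item lands in some chunk.
    let chunk_size = ((items.len() + threads - 1) / threads).max(1);
    for chunk in items.chunks(chunk_size) {
        let tx = tx.clone();
        let chunk = chunk.to_vec();
        thread::spawn(move || {
            for item in chunk {
                // A real walker would read and filter the file here.
                if tx.send(item).is_err() {
                    break; // receiver was dropped; stop early
                }
            }
        });
    }
    // The original sender is dropped when this function returns, so the
    // receiver's iterator ends once every worker has finished.
    rx
}
```

The caller consumes results with a plain `for item in rx` loop. Items arrive in whatever order the workers produce them, so code that needs a stable order has to sort afterwards.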
Memory-mapped reading
Files above 64KB (configurable) are memory-mapped instead of being read into a Vec. Below that threshold, a regular read is faster.
let content = entry.content().unwrap();
let bytes: &[u8] = content.as_bytes();
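The threshold decision described above can be sketched as a small function. This covers only the choice between the two read paths; the mapping itself would need a crate such as memmap2, which is omitted here. `ReadStrategy`, `choose_strategy`, `strategy_for`, and `MMAP_THRESHOLD` are hypothetical names for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical constant for the 64KB default mentioned above.
const MMAP_THRESHOLD: u64 = 64 * 1024;

#[derive(Debug, PartialEq)]
enum ReadStrategy {
    Buffered,     // read the whole file into a Vec<u8>
    MemoryMapped, // map the file into the address space
}

/// Files strictly larger than `threshold` are mapped; the rest are read.
fn choose_strategy(file_len: u64, threshold: u64) -> ReadStrategy {
    if file_len > threshold {
        ReadStrategy::MemoryMapped
    } else {
        ReadStrategy::Buffered
    }
}

/// Looks up the file size on disk and applies the default threshold.
fn strategy_for(path: &Path) -> io::Result<ReadStrategy> {
    let len = fs::metadata(path)?.len();
    Ok(choose_strategy(len, MMAP_THRESHOLD))
}
```

A file of exactly 64KB stays on the buffered path in this sketch, since the text says "above 64KB".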
Binary detection
Checks file extension first (fast), then magic bytes if needed. Recognizes ELF, PE, Mach-O, WASM, ZIP, images, audio, databases, and more.
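The two-step check can be sketched as follows, assuming the caller has already read the first few bytes of the file. The extension list and magic table here are small illustrative subsets, and `is_binary` is a hypothetical helper rather than codewalk's function.

```rust
use std::path::Path;

// Illustrative subset of extensions that are treated as binary outright.
const BINARY_EXTENSIONS: &[&str] = &["exe", "dll", "so", "png", "zip", "wasm"];

// Well-known magic prefixes for a few of the formats named above.
const MAGIC_PREFIXES: &[&[u8]] = &[
    b"\x7fELF",           // ELF
    b"MZ",                // PE (DOS header)
    b"\xcf\xfa\xed\xfe",  // Mach-O, 64-bit little-endian
    b"\0asm",             // WASM
    b"PK\x03\x04",        // ZIP
    b"\x89PNG\r\n\x1a\n", // PNG
];

/// Cheap extension check first, then a magic-byte check on the file head.
fn is_binary(path: &Path, head: &[u8]) -> bool {
    let ext_match = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| BINARY_EXTENSIONS.contains(&e.to_ascii_lowercase().as_str()))
        .unwrap_or(false);
    if ext_match {
        return true;
    }
    MAGIC_PREFIXES.iter().any(|magic| head.starts_with(magic))
}
```

Short prefixes such as `MZ` can match ordinary text that happens to start with those bytes, so a production detector would likely validate more of the header before deciding.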
Contributing
Pull requests are welcome. There is no such thing as a perfect crate. If you find a bug, a better API, or just a rough edge, open a PR. We review quickly.
License
MIT. Copyright 2026 CORUM COLLECTIVE LLC.