codewalk
Walk a directory tree. Skip binaries, respect .gitignore, stream file contents in bounded chunks, and scan large trees in parallel.
use ;
let walker = new;
for entry in walker.walk().unwrap()
What the defaults do
Out of the box, codewalk skips:
- Binary files (detected by magic bytes, not just extension)
- Hidden files and directories
- Common junk directories: node_modules, .git, target, __pycache__, vendor, .venv, Pods
- Files over 10MB
It respects .gitignore rules automatically.
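The default rules above can be approximated with the standard library alone. The sketch below is not codewalk's implementation: `skipped_by_defaults`, `JUNK_DIRS`, and `MAX_FILE_SIZE` are hypothetical names, the 10MB limit is assumed to mean 10 * 1024 * 1024 bytes, the Python cache directory is assumed to be `__pycache__`, and binary detection and .gitignore matching are left out.

```rust
use std::path::{Component, Path};

// Hypothetical constants mirroring the defaults listed above.
const JUNK_DIRS: &[&str] = &[
    "node_modules", ".git", "target", "__pycache__", "vendor", ".venv", "Pods",
];
// Assumption: "10MB" is interpreted as 10 MiB.
const MAX_FILE_SIZE: u64 = 10 * 1024 * 1024;

/// Returns true if a path (relative to the walk root) would be skipped
/// by the hidden-file, junk-directory, or size rules.
pub fn skipped_by_defaults(rel_path: &Path, size: u64) -> bool {
    let hidden_or_junk = rel_path.components().any(|c| match c {
        Component::Normal(name) => {
            let name = name.to_string_lossy();
            // A leading dot marks a hidden file or directory.
            name.starts_with('.') || JUNK_DIRS.iter().any(|d| *d == &*name)
        }
        // `.`, `..`, and root components are not treated as hidden.
        _ => false,
    });
    hidden_or_junk || size > MAX_FILE_SIZE
}
```

Checking every component, not only the file name, means a file nested anywhere under `node_modules` or a hidden directory is skipped too.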
Why not walkdir or ignore?
walkdir gives you paths. ignore gives you paths respecting gitignore. Neither reads file content, detects binary files by magic bytes, or offers a bounded chunked read path. If you're building a security scanner or code analyzer, you need all three: walk, skip binaries, read content efficiently. codewalk does that in one call. Without it you're stacking walkdir + a binary detector + a gitignore parser + chunked I/O + size limits. codewalk is that stack, tested and ready.
Configuration
Override any default via struct fields or TOML:
= 1048576
= true
= false
= true
= false
= ["rs", "py", "js"]
= ["node_modules", ".git"]
let config = from_toml.unwrap;
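Overriding via struct fields might look like the following, assuming a config type that implements `Default`. Every name here (`SketchConfig`, its fields, and `small_files_only`) is hypothetical and chosen only for illustration. The value 1048576 and the two lists are taken from the TOML above, but which keys they belong to is an assumption.

```rust
/// Hypothetical config type; field names are illustrative, not codewalk's.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConfig {
    pub max_file_size: u64,
    pub skip_hidden: bool,
    pub include_extensions: Vec<String>,
    pub exclude_dirs: Vec<String>,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self {
            max_file_size: 10 * 1024 * 1024, // assumed default: 10 MiB
            skip_hidden: true,
            include_extensions: Vec::new(), // empty = no extension filter
            exclude_dirs: vec!["node_modules".into(), ".git".into()],
        }
    }
}

/// Override two fields and inherit the rest from the defaults.
pub fn small_files_only() -> SketchConfig {
    SketchConfig {
        max_file_size: 1_048_576, // 1 MiB
        include_extensions: vec!["rs".into(), "py".into(), "js".into()],
        ..SketchConfig::default()
    }
}
```

Struct update syntax (`..SketchConfig::default()`) keeps the override list short: only the fields that differ from the defaults need to be written out.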
Parallel walking
For large codebases, walk on multiple threads:
let rx = walker.walk_parallel;
for entry in rx
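To illustrate the receiver-based pattern, here is a minimal standard-library sketch of a parallel walk that hands results back over a channel. It is not codewalk's implementation: `walk_parallel_sketch` and `walk_into` are hypothetical names, the sketch spawns one thread per top-level directory instead of using a bounded pool, and it applies none of the skip rules.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Recursively send every regular file under `dir` to `tx`.
/// Unreadable directories are silently skipped; symlinks are not followed.
fn walk_into(dir: &Path, tx: &Sender<PathBuf>) {
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        let path = entry.path();
        match entry.file_type() {
            Ok(t) if t.is_dir() => walk_into(&path, tx),
            Ok(t) if t.is_file() => {
                // A send error means the receiver was dropped; ignore it.
                let _ = tx.send(path);
            }
            _ => {}
        }
    }
}

/// Walk `root`, giving each top-level subdirectory its own thread.
/// The returned receiver ends once every worker has finished.
pub fn walk_parallel_sketch(root: &Path) -> Receiver<PathBuf> {
    let (tx, rx) = mpsc::channel();
    let Ok(entries) = fs::read_dir(root) else { return rx };
    for entry in entries.flatten() {
        let path = entry.path();
        match entry.file_type() {
            Ok(t) if t.is_dir() => {
                let tx = tx.clone();
                thread::spawn(move || walk_into(&path, &tx));
            }
            Ok(t) if t.is_file() => {
                let _ = tx.send(path);
            }
            _ => {}
        }
    }
    // The original sender is dropped here, so iteration over `rx`
    // stops when the last worker drops its clone.
    rx
}
```

Iterating the receiver with `for path in walk_parallel_sketch(root)` yields files in whatever order the workers find them, so callers that need a stable order should sort the results.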
Content loading
entry.content() classifies content as Text, Binary, or Unknown. entry.content_chunks() streams the same file in bounded 64 KiB chunks when you need backpressure-friendly reads.
let content = entry.content().unwrap();
let bytes: &[u8] = content.as_bytes();
let chunks = entry
    .content_chunks()
    .unwrap()
    .
    .unwrap();
assert!;
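For readers curious what bounded chunked reading involves, the following is a minimal sketch over any `std::io::Read` source. It is an illustration under assumptions, not codewalk's code: `Chunks`, `chunks`, and `CHUNK_SIZE` are hypothetical names, and only the 64 KiB bound comes from the description above.

```rust
use std::io::{self, Read};

/// Upper bound on the size of each yielded chunk (64 KiB).
pub const CHUNK_SIZE: usize = 64 * 1024;

/// Iterator that yields the reader's bytes in chunks of at most CHUNK_SIZE.
pub struct Chunks<R> {
    reader: R,
    done: bool,
}

pub fn chunks<R: Read>(reader: R) -> Chunks<R> {
    Chunks { reader, done: false }
}

impl<R: Read> Iterator for Chunks<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut buf = vec![0u8; CHUNK_SIZE];
        let mut filled = 0;
        // `read` may return fewer bytes than requested, so keep reading
        // until the chunk is full or the source is exhausted.
        while filled < CHUNK_SIZE {
            match self.reader.read(&mut buf[filled..]) {
                Ok(0) => {
                    self.done = true;
                    break;
                }
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => {
                    self.done = true;
                    return Some(Err(e));
                }
            }
        }
        if filled == 0 {
            return None;
        }
        buf.truncate(filled);
        Some(Ok(buf))
    }
}
```

With a `std::fs::File` as the reader, memory use stays near one chunk at a time as long as the caller processes each chunk before asking for the next.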
Binary detection
Checks file extension first (fast), then magic bytes if needed. Recognizes ELF, PE, Mach-O, WASM, ZIP, images, audio, databases, and more.
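The extension-first, magic-bytes-second order can be sketched as below. This is an assumption-laden illustration, not codewalk's detector: `looks_binary`, `BINARY_EXTENSIONS`, and `MAGIC` are hypothetical names, and both tables are small samples of well-known signatures, not the crate's actual lists.

```rust
use std::path::Path;

// Sample extension list (hypothetical; a real detector would know many more).
const BINARY_EXTENSIONS: &[&str] = &[
    "exe", "dll", "so", "dylib", "wasm", "zip", "png", "jpg", "sqlite",
];

// Well-known file signatures found at offset 0.
const MAGIC: [&[u8]; 11] = [
    b"\x7fELF",            // ELF
    b"MZ",                 // PE (DOS header)
    b"\xfe\xed\xfa\xce",   // Mach-O 32-bit, big-endian
    b"\xfe\xed\xfa\xcf",   // Mach-O 64-bit, big-endian
    b"\xce\xfa\xed\xfe",   // Mach-O 32-bit, little-endian
    b"\xcf\xfa\xed\xfe",   // Mach-O 64-bit, little-endian
    b"\0asm",              // WebAssembly
    b"PK\x03\x04",         // ZIP
    b"\x89PNG\r\n\x1a\n",  // PNG
    b"\xff\xd8\xff",       // JPEG
    b"SQLite format 3\0",  // SQLite database
];

/// `head` is the first few bytes of the file. The extension check runs
/// first because it needs no I/O; magic bytes are the fallback.
pub fn looks_binary(path: &Path, head: &[u8]) -> bool {
    let ext_hit = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| {
            let lower = e.to_ascii_lowercase();
            BINARY_EXTENSIONS.iter().any(|b| *b == lower)
        })
        .unwrap_or(false);
    ext_hit || MAGIC.iter().any(|m| head.starts_with(*m))
}
```

The magic-byte pass catches binaries that have no extension or a misleading one, such as a compiled `a.out` or an executable named like a script.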
Contributing
Pull requests are welcome. There is no such thing as a perfect crate. If you find a bug, a better API, or just a rough edge, open a PR. We review quickly.
License
MIT. Copyright 2026 CORUM COLLECTIVE LLC.