codewalk — Security-aware file system walker
Why
Every code security tool starts by enumerating files, but naive crawling is slow, noisy, and often reads binary files or vendor artifacts. codewalk gives you a predictable walker that respects .gitignore, supports extension and size filters, and exposes lazy content loading for both text and large files.
It is designed for scans where you care about throughput and signal: scanning only the files that matter, with minimal memory churn.
Quick Start
use ;
use std::path::Path;
Features
- Skip hidden files/directories and respect .gitignore by default.
- Per-file size and extension allow/deny filtering.
- Binary detection + binary skip controls.
- Memory-mapped reads for large files with bounded threshold.
- Parallel walker mode for large mono-repos.
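The filtering features listed above can be illustrated with a small standard-library sketch. It is not codewalk's implementation: `FilterSketch` and `should_scan` are hypothetical names, and the field names only echo the options mentioned in this README. It shows one plausible order of checks (hidden path, size cap, extension allow-list).

```rust
use std::collections::HashSet;
use std::path::Path;

/// Hypothetical filter settings; field names echo this README but this is
/// not codewalk's real configuration type.
struct FilterSketch {
    max_file_size: u64,
    include_extensions: HashSet<String>,
    skip_hidden: bool,
}

/// Returns true when a file with this path and size should be scanned.
fn should_scan(cfg: &FilterSketch, path: &Path, size: u64) -> bool {
    // Reject any path with a dot-prefixed component such as `.git` or `.env`.
    if cfg.skip_hidden {
        let hidden = path.components().any(|c| {
            let name = c.as_os_str().to_string_lossy();
            name.starts_with('.') && name != "." && name != ".."
        });
        if hidden {
            return false;
        }
    }
    // Reject files above the size cap before any content is read.
    if size > cfg.max_file_size {
        return false;
    }
    // An empty allow-list means "accept every extension".
    if cfg.include_extensions.is_empty() {
        return true;
    }
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => cfg.include_extensions.contains(&ext.to_ascii_lowercase()),
        None => false,
    }
}
```

Running the cheap path and size checks first means a file's content never has to be opened just to be rejected.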
TOML Configuration
codewalk does not use TOML config files.
API Overview
- WalkConfig: tune traversal behavior (max_file_size, include_extensions, exclude_dirs, ...).
- CodeWalker: build with root + config, iterate with walk, walk_iter, or walk_parallel.
- FileEntry: path/size/binary flags plus content and content_str methods.
- FileSource: trait for custom file providers.
- is_binary: helper for pre-checks outside walker logic.
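To give a feel for what a binary pre-check can look like, here is a minimal sketch of a common heuristic: read a bounded prefix of the file and look for a NUL byte. This README does not say how is_binary decides, so the heuristic, the 8 KiB prefix, and the names `looks_binary` and `file_looks_binary` are all assumptions made for illustration.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Number of leading bytes to inspect; 8 KiB is an arbitrary choice for this sketch.
const SNIFF_LEN: usize = 8192;

/// Heuristic: treat a buffer as binary if it contains a NUL byte.
fn looks_binary(bytes: &[u8]) -> bool {
    bytes.contains(&0)
}

/// Reads at most SNIFF_LEN bytes from the file and applies the heuristic.
fn file_looks_binary(path: &Path) -> io::Result<bool> {
    let mut buf = Vec::with_capacity(SNIFF_LEN);
    File::open(path)?
        .take(SNIFF_LEN as u64)
        .read_to_end(&mut buf)?;
    Ok(looks_binary(&buf))
}
```

Capping the read keeps the check cheap on very large files, at the cost of missing NUL bytes that appear only later in the file.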
Examples
1) Crawl only non-hidden Rust files and collect text sizes
use ;
use std::collections::HashSet;
let mut config = default.exclude_extensions;
let walker = new;
for file in walker.walk
2) Stream entries in parallel channels
use ;
let walker = new;
let rx = walker.walk_parallel;
while let Ok = rx.recv
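The receive loop above follows the usual channel pattern: workers send entries and the consumer drains the receiver until every sender is gone. The sketch below reproduces that pattern with only the standard library. It assumes nothing about how walk_parallel is built; `stat_in_parallel` is a hypothetical helper that takes a fixed list of paths instead of walking a tree.

```rust
use std::path::PathBuf;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Splits `paths` into chunks, spawns one thread per chunk, and streams
/// (path, byte length) pairs back over a channel.
fn stat_in_parallel(paths: Vec<PathBuf>, workers: usize) -> Receiver<(PathBuf, u64)> {
    let (tx, rx) = mpsc::channel();
    let workers = workers.max(1);
    // Ceiling division, clamped to 1 so `chunks` never receives a zero size.
    let chunk_len = ((paths.len() + workers - 1) / workers).max(1);
    for chunk in paths.chunks(chunk_len) {
        let tx = tx.clone();
        let chunk = chunk.to_vec();
        thread::spawn(move || {
            for path in chunk {
                // Unreadable paths are skipped silently in this sketch.
                if let Ok(meta) = std::fs::metadata(&path) {
                    // A send error means the receiver was dropped; stop early.
                    if tx.send((path, meta.len())).is_err() {
                        break;
                    }
                }
            }
        });
    }
    // Dropping the original sender lets `recv` return Err once all workers finish.
    drop(tx);
    rx
}
```

A consumer can then write `while let Ok((path, len)) = rx.recv() { ... }`; the loop ends on its own once all worker threads have exited. Entries arrive in no particular order.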
3) Implement FileSource for a custom source
use ;
Traits
codewalk defines FileSource; implement it when you need scanning over virtual trees (API artifacts, generated code, archive contents).
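As a rough illustration of scanning a virtual tree, the sketch below defines a stand-in trait and an in-memory provider. The real FileSource signature is not shown in this README, so `SourceSketch`, its two methods, `MemorySource`, and `total_bytes` are hypothetical; check the crate documentation for the actual trait before adapting this.

```rust
use std::collections::BTreeMap;
use std::io;

/// Hypothetical stand-in for a file-provider trait; codewalk's real
/// `FileSource` may have different methods and types.
trait SourceSketch {
    /// Lists every logical path the source can provide.
    fn paths(&self) -> Vec<String>;
    /// Loads the bytes for one path.
    fn read(&self, path: &str) -> io::Result<Vec<u8>>;
}

/// In-memory tree, e.g. files unpacked from an archive or fetched from an API.
struct MemorySource {
    files: BTreeMap<String, Vec<u8>>,
}

impl SourceSketch for MemorySource {
    fn paths(&self) -> Vec<String> {
        self.files.keys().cloned().collect()
    }

    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, path.to_string()))
    }
}

/// Sums the byte length of every file a source exposes.
fn total_bytes(source: &dyn SourceSketch) -> io::Result<usize> {
    let mut total = 0;
    for path in source.paths() {
        total += source.read(&path)?.len();
    }
    Ok(total)
}
```

Keeping listing and reading as separate methods lets a scanner filter on paths first and load content only for the entries it keeps.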
Related Crates
License
MIT, Corum Collective LLC
Docs: https://docs.rs/codewalk
Santh ecosystem: https://santh.io