dirwalk 1.0.0

Platform-optimized recursive directory walker with metadata
dirwalk-1.0.0 has been yanked.

dirwalk

Recursive directory walker for Rust. Uses platform-specific APIs to reduce syscall overhead, and comes with built-in filtering, sorting, and parallel walk support.

Each platform uses the fastest available API for directory enumeration:

| Platform | API |
| --- | --- |
| Windows | FindFirstFileW / FindNextFileW |
| macOS | getattrlistbulk |
| Linux | getdents64 + statx |
| Other Unix | std::fs::read_dir + symlink_metadata |
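
To make the "Other Unix" row concrete, here is a minimal sketch of a single-level scan built only on `std::fs::read_dir` and `std::fs::symlink_metadata`. The `ScannedEntry` type and the `scan_fallback` function are hypothetical names for illustration, not dirwalk's actual internals. Note the separate metadata call per entry; the Windows and macOS APIs in the table return metadata together with the listing, which is where the syscall savings would come from.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical entry type for this sketch; not dirwalk's actual struct.
#[derive(Debug)]
struct ScannedEntry {
    name: String,
    size: u64,
    is_dir: bool,
    is_symlink: bool,
}

/// List one directory level with `read_dir`, then stat each entry with
/// `symlink_metadata`, which does not follow symbolic links.
fn scan_fallback(dir: &Path) -> io::Result<Vec<ScannedEntry>> {
    let mut out = Vec::new();
    for item in fs::read_dir(dir)? {
        let item = item?;
        // One extra metadata lookup per entry.
        let meta = fs::symlink_metadata(item.path())?;
        out.push(ScannedEntry {
            name: item.file_name().to_string_lossy().into_owned(),
            size: meta.len(),
            is_dir: meta.is_dir(),
            is_symlink: meta.file_type().is_symlink(),
        });
    }
    Ok(out)
}

fn main() -> io::Result<()> {
    for e in scan_fallback(Path::new("."))? {
        println!(
            "{} ({} bytes, dir: {}, symlink: {})",
            e.name, e.size, e.is_dir, e.is_symlink
        );
    }
    Ok(())
}
```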

Feature comparison

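| Feature | dirwalk | walkdir | jwalk | ignore | fs-walk |
| --- | --- | --- | --- | --- | --- |
| Parallel walk | Yes | No | Yes | Yes | No |
| Gitignore | Yes | No | No | Yes | No |
| Hidden file filtering | Yes | Yes | Yes | Yes | No |
| Sorting | Yes | No | No | No | No |
| Grouping | Yes | No | No | No | No |
| Tree reconstruction | Yes | No | No | No | No |
| Glob filtering | Yes | No | No | Yes | No |
| Size filtering | Yes | No | No | No | No |
| Symlink loop detection | Yes | Yes | Yes | Yes | No |
| Max depth | Yes | Yes | Yes | Yes | No |
| Stats | Yes | No | No | No | No |
| Serde support | Yes | No | No | No | No |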

Library usage

use dirwalk::{WalkBuilder, Sort};

let result = WalkBuilder::new("/some/path")
    .max_depth(5)
    .hidden(false)           // exclude hidden files (default)
    .gitignore(true)         // respect .gitignore rules
    .extensions(&["rs", "toml"])
    .sort(Sort::Name)        // natural sort (file2 < file10)
    .dirs_first(true)
    .stats(true)
    .build()
    .unwrap();

for entry in &result.entries {
    println!("{} ({} bytes)", entry.relative_path, entry.size);
}

if let Some(stats) = &result.stats {
    println!("{} files, {} dirs, {} bytes",
        stats.file_count, stats.dir_count, stats.total_size);
}
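
The `Sort::Name` comment above mentions natural ordering (`file2 < file10`). As a rough illustration of what such a comparison involves, the sketch below compares digit runs by numeric value and everything else character by character. The `natural_cmp` and `split_digits` functions are hypothetical; dirwalk's own comparator may differ, for example in how it treats case or leading zeros.

```rust
use std::cmp::Ordering;

/// Split off the leading run of ASCII digits, returning (digits, rest).
fn split_digits(s: &str) -> (&str, &str) {
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    s.split_at(end)
}

/// Compare two names so that embedded numbers order by value, not by character.
fn natural_cmp(a: &str, b: &str) -> Ordering {
    let (mut a, mut b) = (a, b);
    loop {
        match (a.chars().next(), b.chars().next()) {
            (None, None) => return Ordering::Equal,
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(ca), Some(cb)) if ca.is_ascii_digit() && cb.is_ascii_digit() => {
                let (da, rest_a) = split_digits(a);
                let (db, rest_b) = split_digits(b);
                // Ignore leading zeros; a longer digit run is then a larger number.
                let ta = da.trim_start_matches('0');
                let tb = db.trim_start_matches('0');
                let ord = ta.len().cmp(&tb.len()).then_with(|| ta.cmp(tb));
                if ord != Ordering::Equal {
                    return ord;
                }
                a = rest_a;
                b = rest_b;
            }
            (Some(ca), Some(cb)) => {
                if ca != cb {
                    return ca.cmp(&cb);
                }
                a = &a[ca.len_utf8()..];
                b = &b[cb.len_utf8()..];
            }
        }
    }
}

fn main() {
    let mut names = vec!["file10", "file2", "file1", "notes"];
    names.sort_by(|x, y| natural_cmp(x, y));
    println!("{:?}", names); // ["file1", "file2", "file10", "notes"]
}
```

A plain string sort would place `file10` before `file2`, because `'1' < '2'` at the first differing character.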

Single-level scan

use dirwalk::scan_dir;
use std::path::Path;

let entries = scan_dir(Path::new(".")).unwrap();
for entry in &entries {
    println!("{}: {} bytes", entry.name(), entry.size);
}

Filtering

use dirwalk::WalkBuilder;

let result = WalkBuilder::new(".")
    .extensions(&["md", "rs"])    // only these extensions
    .glob("test_*").unwrap()      // glob pattern on name
    .min_size(100)                // minimum file size
    .max_size(1_000_000)          // maximum file size
    .gitignore(true)              // respect .gitignore
    .build()
    .unwrap();
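
For intuition about how these filters could combine, the following sketch applies an extension allow-list, a name glob, and size bounds to a single entry. It assumes all filters must pass (logical AND) and that the size bounds are inclusive; neither is confirmed by the text above. The `Filter` struct and the `glob_match` function are hypothetical, and the matcher supports only `*` and `?`, which is likely narrower than dirwalk's glob syntax.

```rust
use std::path::Path;

/// Match `name` against `pattern`, where `*` matches any run of characters
/// (including none) and `?` matches exactly one character.
fn glob_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0, 0);
    // Position of the last `*` seen and the name index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            backtrack = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((star, tried)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            backtrack = Some((star, tried + 1));
            pi = star + 1;
            ni = tried + 1;
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical filter settings mirroring the builder calls above.
struct Filter<'a> {
    extensions: &'a [&'a str],
    glob: &'a str,
    min_size: u64,
    max_size: u64,
}

impl Filter<'_> {
    /// Assumption: every configured filter must accept the entry.
    fn accepts(&self, name: &str, size: u64) -> bool {
        let ext_ok = self.extensions.is_empty()
            || Path::new(name)
                .extension()
                .and_then(|e| e.to_str())
                .map_or(false, |e| self.extensions.iter().any(|x| *x == e));
        ext_ok
            && glob_match(self.glob, name)
            && (self.min_size..=self.max_size).contains(&size)
    }
}

fn main() {
    let filter = Filter {
        extensions: &["md", "rs"],
        glob: "test_*",
        min_size: 100,
        max_size: 1_000_000,
    };
    for (name, size) in [("test_walk.rs", 2_048), ("test_tiny.rs", 10), ("main.rs", 2_048)] {
        println!("{}: {}", name, filter.accepts(name, size));
    }
}
```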

Grouping

use dirwalk::{WalkBuilder, group};

let result = WalkBuilder::new(".").build().unwrap();

let by_ext = group::group_by_extension(&result.entries);
let by_dir = group::group_by_directory(&result.entries);
let by_depth = group::group_by_depth(&result.entries);
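
The three grouping helpers share one idea: compute a key per entry and bucket entries under it. A minimal std-only sketch of that idea over plain relative path strings follows. The generic `group_by` function and the three key functions are hypothetical stand-ins, and counting depth as the number of path components is an assumption that may not match dirwalk's definition.

```rust
use std::collections::BTreeMap;
use std::path::Path;

/// Bucket paths under whatever key `key` computes for them.
fn group_by<'a, K: Ord>(
    paths: &[&'a str],
    key: impl Fn(&Path) -> K,
) -> BTreeMap<K, Vec<&'a str>> {
    let mut groups: BTreeMap<K, Vec<&'a str>> = BTreeMap::new();
    for &p in paths {
        groups.entry(key(Path::new(p))).or_default().push(p);
    }
    groups
}

/// Extension without the dot; files without one group under "".
fn extension_key(p: &Path) -> String {
    p.extension().and_then(|e| e.to_str()).unwrap_or("").to_string()
}

/// Parent directory; top-level entries group under "".
fn directory_key(p: &Path) -> String {
    p.parent().map(|d| d.to_string_lossy().into_owned()).unwrap_or_default()
}

/// Assumption: depth is the number of path components.
fn depth_key(p: &Path) -> usize {
    p.components().count()
}

fn main() {
    let paths = ["Cargo.toml", "README.md", "src/lib.rs", "src/walk/mod.rs"];
    println!("{:?}", group_by(&paths, extension_key));
    println!("{:?}", group_by(&paths, directory_key));
    println!("{:?}", group_by(&paths, depth_key));
}
```

A `BTreeMap` keeps the groups in key order, which makes the output deterministic; a `HashMap` would work equally well when order does not matter.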

Tree reconstruction

use dirwalk::{WalkBuilder, tree};

let result = WalkBuilder::new(".").build().unwrap();
let tree = tree::to_tree(&result.entries);

for node in &tree {
    println!("{} ({} children)", node.entry.name(), node.children.len());
}
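
Tree reconstruction turns a flat list of relative paths back into nested nodes. One way this could work is sketched below, assuming `/`-separated relative paths. The `Node` type and the `build_tree` function are hypothetical and much simpler than whatever `tree::to_tree` returns (they carry no entry metadata).

```rust
use std::collections::BTreeMap;

/// Hypothetical node type for this sketch: just named children.
#[derive(Debug, Default)]
struct Node {
    children: BTreeMap<String, Node>,
}

/// Insert every path component by component, creating nodes as needed.
fn build_tree(paths: &[&str]) -> Node {
    let mut root = Node::default();
    for path in paths {
        let mut cur = &mut root;
        for part in path.split('/').filter(|s| !s.is_empty()) {
            cur = cur.children.entry(part.to_string()).or_default();
        }
    }
    root
}

/// Print the tree with two spaces of indentation per level.
fn print_tree(node: &Node, indent: usize) {
    for (name, child) in &node.children {
        println!("{}{} ({} children)", "  ".repeat(indent), name, child.children.len());
        print_tree(child, indent + 1);
    }
}

fn main() {
    let tree = build_tree(&["Cargo.toml", "src/lib.rs", "src/walk/mod.rs"]);
    print_tree(&tree, 0);
}
```

Because intermediate directories are created on demand, the input does not need to list `src` or `src/walk` explicitly.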

CLI (dirwalk)

cargo install dirwalk
dirwalk                                    # walk current directory
dirwalk /some/path --max-depth 3
dirwalk . --extensions rs,toml --sort name --dirs-first
dirwalk . --format tree                    # tree view with box-drawing chars
dirwalk . --format json                    # JSON array
dirwalk . --format jsonl                   # one JSON object per line
dirwalk . --format csv                     # CSV with header
dirwalk . --gitignore --hidden --stats     # respect gitignore, include hidden, show stats
dirwalk . --glob "*.test.*" --min-size 100
dirwalk . --group-by extension --format plain

All flags

dirwalk [PATH]                             # defaults to .
    --max-depth <N>
    --hidden                             # include hidden files
    --gitignore                          # respect .gitignore rules
    --follow-links                       # follow symbolic links
    --extensions <ext,ext,...>
    --glob <pattern>
    --min-size <bytes>
    --max-size <bytes>
    --sort <name|modified|size|extension>
    --dirs-first
    --group-by <extension|directory|depth>
    --stats
    --format <plain|tree|json|jsonl|csv>

Benchmarks

Benchmarks use synthetic fixtures with randomized file and directory names. All times are mean wall-clock times, and parallel variants use all available cores. Bold values mark the fastest result in each column. The ratio table below each benchmark expresses every library as a multiple of dirwalk (parallel): 1.00× is the same speed, higher is slower, and lower is faster.
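
As a worked example of how the ratio tables are derived, each multiplier is the library's time divided by the dirwalk (parallel) time in the same column. The sketch below reproduces three entries from the Windows 1k-files column of the Scale table; the `relative_speed` function is a hypothetical helper, not part of dirwalk.

```rust
/// Multiplier relative to the baseline: above 1.0 is slower, below 1.0 is faster.
fn relative_speed(time_ms: f64, baseline_ms: f64) -> f64 {
    time_ms / baseline_ms
}

fn main() {
    // Windows, 1k files (depth 3): dirwalk (parallel) is the baseline.
    let baseline_ms = 0.35;
    let rows = [
        ("dirwalk (sequential)", 1.99),
        ("ignore (parallel)", 5.86),
        ("walkdir", 2.15),
    ];
    for (name, ms) in rows {
        // Prints 5.69×, 16.74×, and 6.14× respectively.
        println!("{}: {:.2}×", name, relative_speed(ms, baseline_ms));
    }
}
```

A value below 1.00× means the other library was faster in that column, as with dirwalk (sequential) at 0.51× for 1k files on WSL2 (0.76 ms against 1.5 ms).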

Windows

Platform: Windows 11, AMD Ryzen 7 9800X3D (8 cores)

Scale (depth 3)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 0.35 ms | 2.45 ms | 24.7 ms |
| dirwalk (sequential) | 1.99 ms | 18.8 ms | 206 ms |
| ignore (parallel) | 5.86 ms | 11.7 ms | 71.1 ms |
| ignore (sequential) | 5.50 ms | 50.2 ms | 531 ms |
| walkdir | 2.15 ms | 20.3 ms | 215 ms |
| jwalk | 11.9 ms | 122 ms | 1,262 ms |
| fs_walk | 46.7 ms | 486 ms | 4,741 ms |

Relative speed vs dirwalk (parallel)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 5.69× | 7.67× | 8.34× |
| ignore (parallel) | 16.74× | 4.78× | 2.88× |
| ignore (sequential) | 15.71× | 20.49× | 21.50× |
| walkdir | 6.14× | 8.29× | 8.70× |
| jwalk | 34.00× | 49.80× | 51.09× |
| fs_walk | 133.43× | 198.37× | 191.94× |

Depth (10k files)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 2.63 ms | 2.92 ms | 3.76 ms |
| dirwalk (sequential) | 18.3 ms | 22.4 ms | 28.1 ms |
| ignore (parallel) | 12.0 ms | 13.8 ms | 15.4 ms |
| walkdir | 19.6 ms | 23.5 ms | 31.2 ms |

Relative speed vs dirwalk (parallel)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 6.96× | 7.67× | 7.47× |
| ignore (parallel) | 4.56× | 4.73× | 4.10× |
| walkdir | 7.45× | 8.05× | 8.30× |

Fanout (10k files, depth 3)

| Library | wide (10 files/dir) | narrow (1,000 files/dir) |
| --- | --- | --- |
| dirwalk (parallel) | 2.38 ms | 0.61 ms |
| dirwalk (sequential) | 18.9 ms | 2.07 ms |
| ignore (parallel) | 11.9 ms | 6.52 ms |
| walkdir | 20.1 ms | 3.23 ms |

Relative speed vs dirwalk (parallel)

| Library | wide | narrow |
| --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 7.94× | 3.39× |
| ignore (parallel) | 5.00× | 10.69× |
| walkdir | 8.45× | 5.30× |

macOS

Platform: macOS Sonoma 14.5 (Apple M1)

Scale (depth 3)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.73 ms | 16.95 ms | 154.21 ms |
| dirwalk (sequential) | 3.43 ms | 39.22 ms | 424.37 ms |
| ignore (parallel) | 4.44 ms | 17.05 ms | 151.52 ms |
| ignore (sequential) | 3.26 ms | 36.34 ms | 395.5 ms |
| walkdir | 2.51 ms | 31.41 ms | 310.47 ms |
| jwalk | 1.81 ms | 21.25 ms | 210.59 ms |
| fs_walk | 5.53 ms | 60.4 ms | 640.67 ms |

Relative speed vs dirwalk (parallel)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 1.98× | 2.31× | 2.75× |
| ignore (parallel) | 2.57× | 1.01× | 0.98× |
| ignore (sequential) | 1.88× | 2.14× | 2.56× |
| walkdir | 1.45× | 1.85× | 2.01× |
| jwalk | 1.05× | 1.25× | 1.37× |
| fs_walk | 3.20× | 3.56× | 4.15× |

Depth (10k files)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 16.75 ms | 16.89 ms | 20.02 ms |
| dirwalk (sequential) | 38.16 ms | 43.59 ms | 52.34 ms |
| ignore (parallel) | 16.73 ms | 18.06 ms | 19.44 ms |
| walkdir | 30.15 ms | 32.72 ms | 39.49 ms |

Relative speed vs dirwalk (parallel)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 2.28× | 2.58× | 2.61× |
| ignore (parallel) | 1.00× | 1.07× | 0.97× |
| walkdir | 1.80× | 1.94× | 1.97× |

Fanout (10k files, depth 3)

| Library | wide (10 files/dir) | narrow (1,000 files/dir) |
| --- | --- | --- |
| dirwalk (parallel) | 15.47 ms | 10.06 ms |
| dirwalk (sequential) | 37.93 ms | 19.59 ms |
| ignore (parallel) | 16.99 ms | 12.24 ms |
| walkdir | 32.96 ms | 18.54 ms |

Relative speed vs dirwalk (parallel)

| Library | wide | narrow |
| --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 2.45× | 1.95× |
| ignore (parallel) | 1.10× | 1.22× |
| walkdir | 2.13× | 1.84× |

Linux (WSL2)

Platform: WSL2 (Linux 6.6.87), AMD Ryzen 7 9800X3D (8 cores)

Scale (depth 3)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.5 ms | 2.2 ms | 13.6 ms |
| dirwalk (sequential) | 0.76 ms | 6.7 ms | 81.1 ms |
| ignore (parallel) | 2.6 ms | 5.4 ms | 37.8 ms |
| ignore (sequential) | 1.4 ms | 13.0 ms | 146.9 ms |
| walkdir | 0.94 ms | 9.0 ms | 104.0 ms |
| jwalk | 1.2 ms | 7.2 ms | 93.5 ms |
| fs_walk | 2.2 ms | 21.8 ms | 240.7 ms |

Relative speed vs dirwalk (parallel)

| Library | 1k files | 10k files | 100k files |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 0.51× | 3.05× | 5.96× |
| ignore (parallel) | 1.73× | 2.45× | 2.78× |
| ignore (sequential) | 0.93× | 5.91× | 10.80× |
| walkdir | 0.63× | 4.09× | 7.65× |
| jwalk | 0.80× | 3.27× | 6.88× |
| fs_walk | 1.47× | 9.91× | 17.70× |

Depth (10k files)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 3.8 ms | 2.3 ms | 2.9 ms |
| dirwalk (sequential) | 6.6 ms | 7.1 ms | 8.3 ms |
| ignore (parallel) | 5.2 ms | 4.9 ms | 5.1 ms |
| walkdir | 8.6 ms | 10.1 ms | 13.0 ms |

Relative speed vs dirwalk (parallel)

| Library | depth 2 | depth 5 | depth 10 |
| --- | --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 1.74× | 3.09× | 2.86× |
| ignore (parallel) | 1.37× | 2.13× | 1.76× |
| walkdir | 2.26× | 4.39× | 4.48× |

Fanout (10k files, depth 3)

| Library | wide (10 files/dir) | narrow (1,000 files/dir) |
| --- | --- | --- |
| dirwalk (parallel) | 2.3 ms | 1.5 ms |
| dirwalk (sequential) | 6.8 ms | 4.5 ms |
| ignore (parallel) | 5.9 ms | 3.6 ms |
| walkdir | 9.0 ms | 6.0 ms |

Relative speed vs dirwalk (parallel)

| Library | wide | narrow |
| --- | --- | --- |
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 2.96× | 3.00× |
| ignore (parallel) | 2.57× | 2.40× |
| walkdir | 3.91× | 4.00× |

Contributing

See CONTRIBUTING.md for development setup, testing, and code style.

License

MIT