dirwalk 1.1.0

Platform-optimized recursive directory walker with metadata
# dirwalk

Recursive directory walker for Rust. Uses platform-specific APIs to reduce syscall overhead, and comes with built-in filtering, sorting, and parallel walk support.

Each platform uses the fastest available API for directory enumeration:

| Platform | API |
|----------|-----|
| Windows | `FindFirstFileW` / `FindNextFileW` |
| macOS | `getattrlistbulk` |
| Linux | `getdents64` + `statx` |
| Other Unix | `std::fs::read_dir` + `symlink_metadata` |
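
To make the last row concrete, the following is a minimal std-only sketch of a single-level listing built on `std::fs::read_dir` and `std::fs::symlink_metadata`. The `RawEntry` struct and `list_dir` function are hypothetical names used only for illustration; they are not dirwalk's types, and the crate's actual fallback code may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical record for one directory entry (not a dirwalk type).
#[derive(Debug)]
struct RawEntry {
    path: PathBuf,
    size: u64,
    is_dir: bool,
    is_symlink: bool,
}

/// List the immediate children of `dir` without recursing.
fn list_dir(dir: &Path) -> io::Result<Vec<RawEntry>> {
    let mut out = Vec::new();
    for item in fs::read_dir(dir)? {
        let item = item?;
        let path = item.path();
        // symlink_metadata describes the link itself rather than its target,
        // so a symlink is reported as a symlink and is never followed here.
        let meta = fs::symlink_metadata(&path)?;
        let file_type = meta.file_type();
        out.push(RawEntry {
            size: meta.len(),
            is_dir: file_type.is_dir(),
            is_symlink: file_type.is_symlink(),
            path,
        });
    }
    Ok(out)
}
```

This portable approach costs one metadata query per entry on top of the directory read, which is the kind of per-entry overhead the platform-specific APIs in the table are meant to reduce.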

## Feature comparison

| Feature | dirwalk | walkdir | jwalk | ignore | fs_walk |
|---|---|---|---|---|---|
| Parallel walk | Yes | No | Yes | Yes | No |
| Gitignore | Yes | No | No | Yes | No |
| Hidden file filtering | Yes | Yes | Yes | Yes | No |
| Sorting | Yes | No | No | No | No |
| Grouping | Yes | No | No | No | No |
| Tree reconstruction | Yes | No | No | No | No |
| Glob filtering | Yes | No | No | Yes | No |
| Size filtering | Yes | No | No | No | No |
| Symlink loop detection | Yes | Yes | Yes | Yes | No |
| Max depth | Yes | Yes | Yes | Yes | No |
| Stats | Yes | No | No | No | No |
| Serde support | Yes | No | No | No | No |

## Library usage

```rust
use dirwalk::{WalkBuilder, Sort};

let result = WalkBuilder::new("/some/path")
    .max_depth(5)
    .hidden(false)           // exclude hidden files (default)
    .gitignore(true)         // respect .gitignore rules
    .extensions(&["rs", "toml"])
    .sort(Sort::Name)        // natural sort (file2 < file10)
    .dirs_first(true)
    .stats(true)
    .build()
    .unwrap();

for entry in &result.entries {
    println!("{} ({} bytes)", entry.relative_path, entry.size);
}

if let Some(stats) = &result.stats {
    println!("{} files, {} dirs, {} bytes",
        stats.file_count, stats.dir_count, stats.total_size);
}
```
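
The `Sort::Name` comment above mentions natural ordering (`file2 < file10`). As an illustration of that idea only, here is a minimal std-only comparator that orders embedded digit runs by numeric value. `natural_cmp` and `split_digits` are hypothetical helpers, and dirwalk's own ordering rules (case handling, for example) may differ.

```rust
use std::cmp::Ordering;

/// Split `s` into its leading run of ASCII digits and the remainder.
fn split_digits(s: &str) -> (&str, &str) {
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    s.split_at(end)
}

/// Compare two names so that digit runs order by value: "file2" < "file10".
/// Hypothetical helper for illustration; not part of dirwalk's API.
fn natural_cmp(a: &str, b: &str) -> Ordering {
    let (mut a, mut b) = (a, b);
    loop {
        match (a.chars().next(), b.chars().next()) {
            (None, None) => return Ordering::Equal,
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(ca), Some(cb)) if ca.is_ascii_digit() && cb.is_ascii_digit() => {
                let (run_a, rest_a) = split_digits(a);
                let (run_b, rest_b) = split_digits(b);
                // Compare by value without parsing: after stripping leading
                // zeros, a longer run is larger; equal lengths compare as text.
                let (da, db) = (run_a.trim_start_matches('0'), run_b.trim_start_matches('0'));
                let ord = da.len().cmp(&db.len()).then_with(|| da.cmp(db));
                if ord != Ordering::Equal {
                    return ord;
                }
                a = rest_a;
                b = rest_b;
            }
            (Some(ca), Some(cb)) => match ca.cmp(&cb) {
                // Same character on both sides: advance past it and continue.
                Ordering::Equal => {
                    a = &a[ca.len_utf8()..];
                    b = &b[cb.len_utf8()..];
                }
                ord => return ord,
            },
        }
    }
}
```

Passing it to `sort_by`, as in `names.sort_by(|a, b| natural_cmp(a, b))`, places `file2` before `file10`, whereas a plain byte-wise string sort would put `file10` first.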

### Single-level scan

```rust
use dirwalk::scan_dir;
use std::path::Path;

let entries = scan_dir(Path::new(".")).unwrap();
for entry in &entries {
    println!("{}: {} bytes", entry.name(), entry.size);
}
```

### Filtering

```rust
use dirwalk::WalkBuilder;

let result = WalkBuilder::new(".")
    .extensions(&["md", "rs"])    // only these extensions
    .glob("test_*").unwrap()      // glob pattern on name
    .min_size(100)                // minimum file size
    .max_size(1_000_000)          // maximum file size
    .gitignore(true)              // respect .gitignore
    .build()
    .unwrap();
```
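
For readers unfamiliar with name globbing, the sketch below shows roughly what matching a pattern such as `test_*` against an entry name involves. It supports only `*` and `?`, uses a hypothetical `wildcard_match` helper, and is not dirwalk's implementation; full glob syntax (character classes, for example) is richer.

```rust
/// Match `name` against a pattern where `*` matches any run of characters
/// (including none) and `?` matches exactly one character.
/// Hypothetical helper for illustration; not part of dirwalk's API.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Position of the most recent `*` and the name index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Tentatively let `*` match nothing; remember where to retry.
            backtrack = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == n[ni]) {
            pi += 1;
            ni += 1;
        } else if let Some((star_pi, star_ni)) = backtrack {
            // Mismatch: let the last `*` absorb one more character.
            pi = star_pi + 1;
            ni = star_ni + 1;
            backtrack = Some((star_pi, star_ni + 1));
        } else {
            return false;
        }
    }
    // The name is exhausted, so only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

With this helper, `wildcard_match("test_*", "test_walk.rs")` is true, while `wildcard_match("test_*", "mytest_a")` is false because the pattern is anchored at the start of the name.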

### Grouping

```rust
use dirwalk::{WalkBuilder, group};

let result = WalkBuilder::new(".").build().unwrap();

let by_ext = group::group_by_extension(&result.entries);
let by_dir = group::group_by_directory(&result.entries);
let by_depth = group::group_by_depth(&result.entries);
```
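
Conceptually, grouping by extension buckets entries under a key derived from each path. The std-only sketch below does this for plain path strings. `bucket_by_extension` is a hypothetical function, and lowercasing the key is an assumption made here for illustration, not documented dirwalk behavior.

```rust
use std::collections::BTreeMap;
use std::path::Path;

/// Bucket paths by lowercase extension; paths without one go under "".
/// Hypothetical helper for illustration; not part of dirwalk's API.
fn bucket_by_extension<'a>(paths: &[&'a str]) -> BTreeMap<String, Vec<&'a str>> {
    let mut groups: BTreeMap<String, Vec<&'a str>> = BTreeMap::new();
    for &p in paths {
        let ext = Path::new(p)
            .extension()
            .and_then(|e| e.to_str())
            .unwrap_or("")
            .to_ascii_lowercase();
        // Paths keep their input order within each bucket.
        groups.entry(ext).or_default().push(p);
    }
    groups
}
```

A `BTreeMap` keeps the group keys sorted, which gives stable output when the groups are printed.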

### Tree reconstruction

```rust
use dirwalk::{WalkBuilder, tree};

let result = WalkBuilder::new(".").build().unwrap();
let tree = tree::to_tree(&result.entries);

for node in &tree {
    println!("{} ({} children)", node.entry.name(), node.children.len());
}
```
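
Tree reconstruction turns a flat list of entries back into a parent/child hierarchy. The sketch below shows the general idea for `/`-separated relative path strings, using only the standard library. `Node`, `build_tree`, and `render` are hypothetical names, and this is not how `tree::to_tree` is necessarily implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical tree node keyed by path component (not a dirwalk type).
#[derive(Debug, Default)]
struct Node {
    children: BTreeMap<String, Node>,
}

/// Insert every path component by component, creating nodes on demand.
fn build_tree(paths: &[&str]) -> Node {
    let mut root = Node::default();
    for path in paths {
        let mut cur = &mut root;
        for part in path.split('/').filter(|p| !p.is_empty()) {
            cur = cur.children.entry(part.to_string()).or_default();
        }
    }
    root
}

/// Render the tree as indented lines, two spaces per depth level.
fn render(node: &Node, depth: usize, out: &mut String) {
    for (name, child) in &node.children {
        out.push_str(&"  ".repeat(depth));
        out.push_str(name);
        out.push('\n');
        render(child, depth + 1, out);
    }
}
```

Because intermediate directories are created on demand, the input does not need to list `src` separately from `src/main.rs`.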

## CLI (`dirwalk` / `dw`)

```sh
cargo install dirwalk
```

The binary is available as both `dirwalk` and `dw`.

When stdout is a terminal, output defaults to a rich long format with
human-readable sizes, ls-style dates, colored names (via `LS_COLORS`), and
type indicators (`/` for directories, `@` for symlinks). When piped or
redirected, output defaults to plain paths for backward compatibility and
scripting.
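
The TTY-dependent default described above can be sketched with `std::io::IsTerminal` (stable since Rust 1.70). The `Format` enum and both functions are hypothetical names for illustration; the CLI's real option handling is not shown in this README.

```rust
use std::io::IsTerminal;

/// Hypothetical subset of output formats (not dirwalk's actual type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Format {
    Rich,
    Plain,
    Json,
}

/// An explicit `--format` wins; otherwise choose by whether stdout is a terminal.
fn resolve_format(explicit: Option<Format>, stdout_is_tty: bool) -> Format {
    match explicit {
        Some(format) => format,
        None if stdout_is_tty => Format::Rich,
        None => Format::Plain,
    }
}

/// Convenience wrapper that queries the real stdout.
fn resolve_format_for_stdout(explicit: Option<Format>) -> Format {
    resolve_format(explicit, std::io::stdout().is_terminal())
}
```

Keeping the decision in a function that takes `stdout_is_tty` as a parameter makes it testable without a real terminal.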

```sh
dirwalk                                    # walk current directory (rich output on TTY)
dirwalk /some/path --max-depth 3
dirwalk . --extensions rs,toml --sort name --dirs-first
dirwalk . --short                          # columnar name-only layout
dirwalk . --format plain                   # force plain paths (one per line)
dirwalk . --format tree                    # tree view with box-drawing chars
dirwalk . --format json                    # JSON array
dirwalk . --format jsonl                   # one JSON object per line
dirwalk . --format csv                     # CSV with header
dirwalk . --gitignore --hidden --stats     # include hidden, respect gitignore, show stats
dirwalk . --glob "*.test.*" --min-size 100
dirwalk . --group-by extension --format plain
dirwalk . --color always | less -R         # force color through a pipe
```

### All flags

```
dirwalk [PATH]                             # defaults to .
    --max-depth <N>
    --hidden                             # include hidden files
    --gitignore                          # respect .gitignore rules
    --follow-links                       # follow symbolic links
    --extensions <ext,ext,...>
    --glob <pattern>
    --min-size <bytes>
    --max-size <bytes>
    --sort <name|modified|size|extension>
    --dirs-first
    --group-by <extension|directory|depth>
    --stats
    --format <plain|tree|json|jsonl|csv> # omit for auto (rich on TTY, plain when piped)
    --short                              # columnar name-only layout
    --color <never|auto|always>          # default: auto
    --no-color                           # shorthand for --color never
    --threads <N|fraction>               # 0 = all cores (default)
```

### Environment variables

| Variable | Effect |
|----------|--------|
| `NO_COLOR` | Disables color when set (see [no-color.org](https://no-color.org)) |
| `LS_COLORS` | Controls file type coloring (same format as GNU ls) |
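
One plausible way to combine `--color` with `NO_COLOR` is sketched below. The precedence shown (an explicit `never` or `always` overrides the environment, and an empty `NO_COLOR` value is ignored, as no-color.org suggests) is an assumption, not confirmed dirwalk behavior. `ColorMode` and `use_color` are hypothetical names.

```rust
/// Hypothetical mirror of the `--color <never|auto|always>` flag.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColorMode {
    Never,
    Auto,
    Always,
}

/// Decide whether to emit color.
/// `no_color` is the value of the NO_COLOR variable, if it is set.
fn use_color(mode: ColorMode, no_color: Option<&str>, stdout_is_tty: bool) -> bool {
    match mode {
        ColorMode::Never => false,
        // Assumption: an explicit request wins over the environment.
        ColorMode::Always => true,
        ColorMode::Auto => {
            // Assumption: only a non-empty NO_COLOR disables color.
            let disabled = no_color.map_or(false, |value| !value.is_empty());
            !disabled && stdout_is_tty
        }
    }
}
```

A caller would typically pass `std::env::var("NO_COLOR").ok().as_deref()` as the second argument.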

## Benchmarks

Benchmarks run on synthetic fixtures with randomized file and directory names. All times are mean wall-clock times. Parallel variants use all available cores. Bold values mark the fastest result in each column. The ratio tables below each benchmark express every library as a multiplier of `dirwalk (parallel)` (1.00× = same speed, higher = slower, lower = faster), so the differences are visible at a glance.

### Windows

**Platform:** Windows 11, AMD Ryzen 7 9800X3D (8 cores)

#### Scale (depth 3)

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | **0.35 ms** | **2.45 ms** | **24.7 ms** |
| dirwalk (sequential) | 1.99 ms | 18.8 ms | 206 ms |
| ignore (parallel) | 5.86 ms | 11.7 ms | 71.1 ms |
| ignore (sequential) | 5.50 ms | 50.2 ms | 531 ms |
| walkdir | 2.15 ms | 20.3 ms | 215 ms |
| jwalk | 11.9 ms | 122 ms | 1,262 ms |
| fs_walk | 46.7 ms | 486 ms | 4,741 ms |

**Relative speed vs `dirwalk (parallel)`**

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 5.69× | 7.67× | 8.34× |
| ignore (parallel) | 16.74× | 4.78× | 2.88× |
| ignore (sequential) | 15.71× | 20.49× | 21.50× |
| walkdir | 6.14× | 8.29× | 8.70× |
| jwalk | 34.00× | 49.80× | 51.09× |
| fs_walk | 133.43× | 198.37× | 191.94× |
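
Each ratio is the library's mean time divided by the `dirwalk (parallel)` time in the same column. The small sketch below reproduces a few cells of the table above; the function names are illustrative only.

```rust
/// Ratio of a measured time to the baseline time (same unit for both).
fn relative_speed(time_ms: f64, baseline_ms: f64) -> f64 {
    time_ms / baseline_ms
}

/// Format a ratio with two decimals and a multiplication sign, as the tables do.
fn format_ratio(ratio: f64) -> String {
    format!("{:.2}×", ratio)
}
```

For the 1k-files column, `format_ratio(relative_speed(1.99, 0.35))` gives `5.69×` for `dirwalk (sequential)`, and `format_ratio(relative_speed(5.86, 0.35))` gives `16.74×` for `ignore (parallel)`.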

#### Depth (10k files)

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | **2.63 ms** | **2.92 ms** | **3.76 ms** |
| dirwalk (sequential) | 18.3 ms | 22.4 ms | 28.1 ms |
| ignore (parallel) | 12.0 ms | 13.8 ms | 15.4 ms |
| walkdir | 19.6 ms | 23.5 ms | 31.2 ms |

**Relative speed vs `dirwalk (parallel)`**

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 6.96× | 7.67× | 7.47× |
| ignore (parallel) | 4.56× | 4.73× | 4.10× |
| walkdir | 7.45× | 8.05× | 8.30× |

#### Fanout (10k files, depth 3)

| | wide (10 files/dir) | narrow (1,000 files/dir) |
|---|---|---|
| dirwalk (parallel) | **2.38 ms** | **0.61 ms** |
| dirwalk (sequential) | 18.9 ms | 2.07 ms |
| ignore (parallel) | 11.9 ms | 6.52 ms |
| walkdir | 20.1 ms | 3.23 ms |

**Relative speed vs `dirwalk (parallel)`**

| | wide | narrow |
|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 7.94× | 3.39× |
| ignore (parallel) | 5.00× | 10.69× |
| walkdir | 8.45× | 5.30× |

### macOS


**Platform:** macOS Sonoma 14.5 (Apple M1)

#### Scale (depth 3)

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | **1.73 ms** | **16.95 ms** | 154.21 ms |
| dirwalk (sequential) | 3.43 ms | 39.22 ms | 424.37 ms |
| ignore (parallel) | 4.44 ms | 17.05 ms | **151.52 ms** |
| ignore (sequential) | 3.26 ms | 36.34 ms | 395.5 ms |
| walkdir | 2.51 ms | 31.41 ms | 310.47 ms |
| jwalk | 1.81 ms | 21.25 ms | 210.59 ms |
| fs_walk | 5.53 ms | 60.4 ms | 640.67 ms |

**Relative speed vs `dirwalk (parallel)`**

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 1.98× | 2.31× | 2.75× |
| ignore (parallel) | 2.57× | 1.01× | 0.98× |
| ignore (sequential) | 1.88× | 2.14× | 2.56× |
| walkdir | 1.45× | 1.85× | 2.01× |
| jwalk | 1.05× | 1.25× | 1.37× |
| fs_walk | 3.20× | 3.56× | 4.15× |

#### Depth (10k files)

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | 16.75 ms | **16.89 ms** | 20.02 ms |
| dirwalk (sequential) | 38.16 ms | 43.59 ms | 52.34 ms |
| ignore (parallel) | **16.73 ms** | 18.06 ms | **19.44 ms** |
| walkdir | 30.15 ms | 32.72 ms | 39.49 ms |

**Relative speed vs `dirwalk (parallel)`**

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 2.28× | 2.58× | 2.61× |
| ignore (parallel) | 1.00× | 1.07× | 0.97× |
| walkdir | 1.80× | 1.94× | 1.97× |

#### Fanout (10k files, depth 3)

| | wide (10 files/dir) | narrow (1,000 files/dir) |
|---|---|---|
| dirwalk (parallel) | **15.47 ms** | **10.06 ms** |
| dirwalk (sequential) | 37.93 ms | 19.59 ms |
| ignore (parallel) | 16.99 ms | 12.24 ms |
| walkdir | 32.96 ms | 18.54 ms |

**Relative speed vs `dirwalk (parallel)`**

| | wide | narrow |
|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 2.45× | 1.95× |
| ignore (parallel) | 1.10× | 1.22× |
| walkdir | 2.13× | 1.84× |

### Linux (WSL2)

**Platform:** WSL2 (Linux 6.6.87), AMD Ryzen 7 9800X3D (8 cores)

#### Scale (depth 3)

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | 1.5 ms | **2.2 ms** | **13.6 ms** |
| dirwalk (sequential) | **0.76 ms** | 6.7 ms | 81.1 ms |
| ignore (parallel) | 2.6 ms | 5.4 ms | 37.8 ms |
| ignore (sequential) | 1.4 ms | 13.0 ms | 146.9 ms |
| walkdir | 0.94 ms | 9.0 ms | 104.0 ms |
| jwalk | 1.2 ms | 7.2 ms | 93.5 ms |
| fs_walk | 2.2 ms | 21.8 ms | 240.7 ms |

**Relative speed vs `dirwalk (parallel)`**

| | 1k files | 10k files | 100k files |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 0.51× | 3.05× | 5.96× |
| ignore (parallel) | 1.73× | 2.45× | 2.78× |
| ignore (sequential) | 0.93× | 5.91× | 10.80× |
| walkdir | 0.63× | 4.09× | 7.65× |
| jwalk | 0.80× | 3.27× | 6.88× |
| fs_walk | 1.47× | 9.91× | 17.70× |

#### Depth (10k files)

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | **3.8 ms** | **2.3 ms** | **2.9 ms** |
| dirwalk (sequential) | 6.6 ms | 7.1 ms | 8.3 ms |
| ignore (parallel) | 5.2 ms | 4.9 ms | 5.1 ms |
| walkdir | 8.6 ms | 10.1 ms | 13.0 ms |

**Relative speed vs `dirwalk (parallel)`**

| | depth 2 | depth 5 | depth 10 |
|---|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× | 1.00× |
| dirwalk (sequential) | 1.74× | 3.09× | 2.86× |
| ignore (parallel) | 1.37× | 2.13× | 1.76× |
| walkdir | 2.26× | 4.39× | 4.48× |

#### Fanout (10k files, depth 3)

| | wide (10 files/dir) | narrow (1,000 files/dir) |
|---|---|---|
| dirwalk (parallel) | **2.3 ms** | **1.5 ms** |
| dirwalk (sequential) | 6.8 ms | 4.5 ms |
| ignore (parallel) | 5.9 ms | 3.6 ms |
| walkdir | 9.0 ms | 6.0 ms |

**Relative speed vs `dirwalk (parallel)`**

| | wide | narrow |
|---|---|---|
| dirwalk (parallel) | 1.00× | 1.00× |
| dirwalk (sequential) | 2.96× | 3.00× |
| ignore (parallel) | 2.57× | 2.40× |
| walkdir | 3.91× | 4.00× |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, and code style.

## License

MIT