# getattrlistbulk
Safe Rust bindings for the macOS `getattrlistbulk()` system call. Enumerate directories and retrieve file metadata in bulk with minimal syscalls.
## Why?
Traditional directory reading takes roughly 2N calls for N files, one `readdir()` and one `stat()` per entry:
```
opendir() → readdir() × N → stat() × N → closedir()
```
`getattrlistbulk()` retrieves entries AND metadata together:
```
open() → getattrlistbulk() × ceil(N/batch) → close()
```
For a directory with 10,000 files, this means roughly a dozen syscalls instead of ~20,000.
## Requirements
- **macOS 10.10+** (Yosemite or later)
- **Rust 1.70+**
This crate only compiles on macOS. On other platforms, it will fail to compile with a clear error message.
## Installation
```toml
[dependencies]
getattrlistbulk = "0.1"
```
## Usage
### Basic Example
```rust
use getattrlistbulk::{read_dir, RequestedAttributes};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let attrs = RequestedAttributes {
        name: true,
        size: true,
        object_type: true,
        ..Default::default()
    };

    for entry in read_dir("/Users/me/Documents", attrs)? {
        let entry = entry?;
        println!("{}: {} bytes", entry.name, entry.size.unwrap_or(0));
    }
    Ok(())
}
```
### Get All Available Metadata
```rust
use getattrlistbulk::{read_dir, RequestedAttributes};

let attrs = RequestedAttributes {
    name: true,
    object_type: true,
    size: true,
    alloc_size: true,
    modified_time: true,
    permissions: true,
    inode: true,
    entry_count: true, // for directories
};

for entry in read_dir("/path/to/dir", attrs)? {
    let entry = entry?;
    if let Some(modified) = entry.modified_time {
        println!("{} last modified: {:?}", entry.name, modified);
    }
}
```
### Custom Buffer Size
Larger buffers mean fewer syscalls for large directories:
```rust
use getattrlistbulk::{read_dir_with_buffer, RequestedAttributes};
let attrs = RequestedAttributes::default().with_name().with_size();
// 256KB buffer for very large directories
let entries = read_dir_with_buffer("/big/directory", attrs, 256 * 1024)?;
```
### Using the Builder
```rust
use getattrlistbulk::DirReader;

let entries = DirReader::new("/path/to/dir")
    .name()
    .size()
    .object_type()
    .buffer_size(128 * 1024)
    .follow_symlinks(false)
    .read()?;
```
## Performance
### Syscall Comparison
The performance advantage comes entirely from reducing syscalls:

| Approach | Method | Complexity | Syscalls (10,000 files) |
|----------|--------|------------|-------------------------|
| **Traditional POSIX** | `readdir()` + `stat()` per file | `O(n)` | ~20,000 syscalls |
| **Swift FileManager** | Uses POSIX internally | `O(n)` | ~10,000-20,000 syscalls |
| **getattrlistbulk** | Bulk metadata per call | `O(n/batch)` | **~12 syscalls** |

### Why This Matters
```
Traditional: opendir() → [readdir() + stat()] × N → closedir()
This crate: open() → getattrlistbulk() × ceil(N/800) → close()
```
Each syscall requires a user→kernel context switch. Cutting the call count by **~1,600x** removes nearly all of that switching overhead.
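
The arithmetic behind these figures can be sketched as follows. The function names are illustrative, the counts follow the simplified call sequences shown above, and 800 entries per batch is simply the figure used in this section; real batch sizes vary with buffer size and the attributes requested.

```rust
/// Calls in the traditional sequence:
/// opendir + (readdir + stat) per file + closedir.
fn traditional_calls(files: u64) -> u64 {
    1 + 2 * files + 1
}

/// Calls in the bulk sequence:
/// open + one getattrlistbulk per batch + close.
fn bulk_calls(files: u64, entries_per_batch: u64) -> u64 {
    let per_batch = entries_per_batch.max(1); // guard against division by zero
    let batches = (files + per_batch - 1) / per_batch; // ceil(files / per_batch)
    1 + batches + 1
}

fn main() {
    let files = 10_000;
    let traditional = traditional_calls(files);
    let bulk = bulk_calls(files, 800);
    println!("traditional: {traditional} calls, bulk: {bulk} calls");
    println!("reduction: ~{}x", traditional / bulk);
}
```

For 10,000 files at 800 entries per batch, this simplified model counts 20,002 calls against 15, which is the same order of magnitude as the figures quoted in this section.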
### Benchmarks (10,000 files)
Run the included benchmark yourself:
```bash
cargo bench --bench compare
```
Example output (Apple Silicon SSD):

| Method | Time | Speedup |
|--------|------|---------|
| `std::fs::read_dir` + `metadata()` | ~19ms | 1.0x |
| **`getattrlistbulk`** | **~5ms** | **~4x faster** |

**Syscall reduction: ~1,600x fewer** (from ~20,000 to ~12)
### Why "only" 4x faster with 1,600x fewer syscalls?
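On fast NVMe SSDs, the kernel's VFS cache serves most metadata requests from memory, so there is little I/O latency for batching to hide. Syscall overhead (~1μs each) is therefore only part of the total cost, and removing it cannot speed things up by more than that part.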
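
As a back-of-the-envelope check, the example benchmark numbers above (19 ms, 5 ms, ~20,000 versus ~12 syscalls) can be turned into an implied saving per eliminated call. The function names below are illustrative, and the result is only as good as those example figures.

```rust
/// Fraction of the baseline run time that the bulk approach removed.
fn fraction_saved(baseline_ms: f64, bulk_ms: f64) -> f64 {
    (baseline_ms - bulk_ms) / baseline_ms
}

/// Average time saved per eliminated syscall, in microseconds.
fn saved_per_call_us(baseline_ms: f64, bulk_ms: f64, calls_removed: f64) -> f64 {
    (baseline_ms - bulk_ms) * 1000.0 / calls_removed
}

fn main() {
    // Example figures quoted in the benchmark section.
    let (baseline_ms, bulk_ms) = (19.0, 5.0);
    let calls_removed = 20_000.0 - 12.0;

    println!("speedup: {:.1}x", baseline_ms / bulk_ms);
    println!(
        "time saved: {:.0}%",
        fraction_saved(baseline_ms, bulk_ms) * 100.0
    );
    println!(
        "saved per eliminated call: {:.2} us",
        saved_per_call_us(baseline_ms, bulk_ms, calls_removed)
    );
}
```

On these numbers, each eliminated call saved about 0.7 µs, in line with the ~1 µs estimate, while the remaining ~5 ms is work that batching does not remove.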
Expected speedups by storage type:

| Storage | Expected speedup | Why |
|---------|------------------|-----|
| NVMe SSD (cached) | ~4x | VFS cache masks I/O, syscall overhead partial |
| SATA SSD | ~5-8x | More I/O latency exposed |
| HDD | ~10-20x | Seek time dominates, batching helps significantly |
| Network (NFS/SMB) | ~20-50x | Round-trip latency makes batching critical |

### Swift Comparison
Swift's `FileManager` does NOT use `getattrlistbulk` internally—it wraps POSIX calls:
```swift
// Swift - still O(n) syscalls under the hood
let contents = try FileManager.default.contentsOfDirectory(
at: url,
includingPropertiesForKeys: [.fileSizeKey, .isDirectoryKey]
)
```
Swift *can* call `getattrlistbulk` via C interop, but Apple's high-level frameworks don't. This crate provides the optimized path that Apple's own tools use internally (Finder, `ls`, etc.).
## Comparison with Alternatives
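
| Crate | Bulk metadata | macOS-optimized | Cross-platform |
|-------|---------------|-----------------|----------------|
| `std::fs` | No | No | Yes |
| `walkdir` | No | No | Yes |
| `jwalk` | No | No | Yes |
| **`getattrlistbulk`** | **Yes** | **Yes** | No |
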
Use this crate when:
- You're targeting macOS only
- You need to read large directories quickly
- You need metadata along with filenames
Use `std::fs` or `walkdir` when:
- You need cross-platform support
- You're reading small directories
- You don't need metadata
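
For reference, the portable path looks roughly like this using only the standard library. `list_with_sizes` is a hypothetical helper name chosen for this sketch; it performs one `metadata()` lookup per entry, which is the pattern this crate is designed to avoid on macOS.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List a directory with std::fs, returning (file name, size in bytes) pairs.
/// Performs one metadata lookup per entry.
fn list_with_sizes(dir: &Path) -> io::Result<Vec<(String, u64)>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?; // does not follow symlinks
        let name = entry.file_name().to_string_lossy().into_owned();
        out.push((name, meta.len()));
    }
    out.sort(); // read_dir order is platform-dependent, so sort for stable output
    Ok(out)
}

fn main() -> io::Result<()> {
    for (name, size) in list_with_sizes(Path::new("."))? {
        println!("{name}: {size} bytes");
    }
    Ok(())
}
```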
## Error Handling
```rust
use getattrlistbulk::{read_dir, RequestedAttributes, Error};

match read_dir("/some/path", RequestedAttributes::default()) {
    Ok(entries) => { /* ... */ }
    Err(Error::Open(e)) => eprintln!("Failed to open directory: {}", e),
    Err(Error::Syscall(e)) => eprintln!("System call failed: {}", e),
    Err(Error::Parse(msg)) => eprintln!("Buffer parsing error: {}", msg),
    Err(Error::NotSupported) => eprintln!("Not running on macOS"),
}
```
## Safety
This crate uses `unsafe` internally to call the C system call, but exposes a fully safe public API. All buffer parsing is bounds-checked, and file descriptors are properly managed.
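
To illustrate what bounds-checked parsing can look like, here is a minimal sketch that splits a buffer of length-prefixed records in safe Rust. It assumes each record begins with a native-endian `u32` holding the record's total length, which is the layout documented for `getattrlistbulk()`. `split_records` is a hypothetical helper written for this example, not this crate's actual parser.

```rust
/// Split `buf` into `count` records, each prefixed by a native-endian u32
/// giving the record's total length in bytes (including the prefix itself).
/// Every offset is checked, so malformed input produces an error instead of
/// an out-of-bounds read.
fn split_records(buf: &[u8], count: usize) -> Result<Vec<&[u8]>, String> {
    let mut records = Vec::new();
    let mut offset = 0usize;
    for i in 0..count {
        // `get` returns None instead of panicking when the range is out of bounds.
        let header = buf
            .get(offset..offset + 4)
            .ok_or_else(|| format!("record {i}: truncated length field"))?;
        let len = u32::from_ne_bytes([header[0], header[1], header[2], header[3]]) as usize;
        if len < 4 {
            return Err(format!("record {i}: length {len} is smaller than its header"));
        }
        let end = offset
            .checked_add(len)
            .ok_or_else(|| format!("record {i}: length overflows"))?;
        let record = buf
            .get(offset..end)
            .ok_or_else(|| format!("record {i}: extends past end of buffer"))?;
        records.push(record);
        offset = end;
    }
    Ok(records)
}

fn main() {
    // Two hand-built records: a 4-byte length prefix followed by a payload.
    let mut buf = Vec::new();
    buf.extend_from_slice(&7u32.to_ne_bytes());
    buf.extend_from_slice(b"abc");
    buf.extend_from_slice(&5u32.to_ne_bytes());
    buf.extend_from_slice(b"z");

    let records = split_records(&buf, 2).expect("well-formed buffer");
    assert_eq!(&records[0][4..], b"abc");
    assert_eq!(&records[1][4..], b"z");

    // Asking for more records than the buffer holds is reported as an error.
    assert!(split_records(&buf, 3).is_err());
}
```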
## License
Licensed under either of:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
## Contributing
Contributions welcome! Please read the [SPECIFICATION.md](SPECIFICATION.md) for implementation details and requirements.