fdf - High-Performance POSIX File Finder
fdf is a high-performance POSIX file finder written in Rust with extensive C FFI. It serves as a lightweight alternative to tools such as fd and find, with a focus on speed, efficiency, and cross-platform compatibility. Benchmarks demonstrate fdf running up to 2x faster than comparable tools, achieved through low-level optimisation, SIMD techniques, and direct kernel interfacing.
Quick Installation:
Project Status
This is primarily a learning and performance exploration project. Whilst already useful and performant, it remains under active development towards a stable 1.0 release. The name 'fdf' is a temporary placeholder.
The implemented subset performs exceptionally well, surpassing fd in equivalent feature sets, though fd offers broader functionality. This project focuses on exploring hardware-specific code optimisation rather than replicating fd's complete feature set.
While the CLI is usable, the internal library is NOT recommended for use (except for trivial operations such as the getdents/readdir wrappers). I've done a lot of work to document it, but it still needs a lot of cleanup too!
Platform Support (64-bit only)
Fully Supported and CI Tested
- Linux (x86_64, aarch64, s390x, RISC-V64, Alpine MUSL)
- macOS (Intel and Apple Silicon)
- FreeBSD (x86_64)
Compiles with Limited Testing
- OpenBSD, NetBSD, DragonflyBSD (tested occasionally, minor fixes expected if issues arise)
- Android (tested on device)
- Illumos and Solaris (x86_64, verified with QEMU)
Not Yet Supported
- Windows: Requires significant rewrite due to architectural differences with libc. Planned once the POSIX feature set is stable. Windows already has highly effective tools such as Everything.
Note: GitHub Actions does not yet provide Rust 2024 support for some platforms. Additional checks will be added when available.
Testing
The project includes comprehensive testing with 70+ Rust tests and 15+ correctness benchmarks comparing against fd.
Note: Miri validation (Rust's undefined behaviour detector) cannot be used due to the extensive libc calls and assembly code. Intensive testing and valgrind validation are used instead.
- Rust tests: Available here
- Shell scripts clone the LLVM repository to provide an accurate testing environment
- Tests run via GitHub Actions on all supported platforms
Running the Full Test Suite:
This executes a comprehensive suite of internal library tests, CLI tests, and benchmarks.
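As a rough sketch, the Rust portion corresponds to a standard cargo invocation (the repository's shell scripts drive the LLVM-based correctness benchmarks on top of this; their exact names are not reproduced here):

```sh
cargo test --release   # internal library + CLI tests
```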
Performance Benchmarks
Complete benchmarks: Available here
The benchmarks are fully repeatable using the testing code above and cover file-type filtering, extension matching, file sizes, and many other scenarios. The following results were obtained on a local system (rather than the LLVM project) to provide realistic usage examples; they were measured with hyperfine and are summarised here to save space.
| Test Case | fdf Time | fd Time | Speedup |
|---|---|---|---|
| Regex pattern matching | 431.6ms | 636.7ms | 1.48x faster |
| Files >1MB | 896.9ms | 1.732s | 1.93x faster |
| General search | - | - | 1.70x faster |
| Directory filtering | 461.8ms | 681.2ms | 1.48x faster |
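For reference, each row comes from hyperfine comparisons along these lines (a sketch: the paths and fdf flags here are illustrative, not the exact benchmark commands):

```sh
# illustrative hyperfine comparison; the real patterns live in the benchmark scripts
hyperfine --warmup 3 \
  "fdf --size +1M /some/dir" \
  "fd --size +1m . /some/dir"
```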
Distinctions from fd/find
My method of resolving symlinks differs quite a bit from fd and find, so I've naturally had to adopt a slightly different approach.
*(I hold the belief that following symlinks is rarely wise, but I've included it for the sake of completeness!)
In short, when following symlinks you can expect slightly different behaviour.
recursive_symlink_fs_test.sh contains an example of how fd fails to escape an infinite loop of symlinks.
With fdf, traversing symlinks should never hang your computer, though it may produce more results than expected; for this reason I suggest using the --same-file-system flag when traversing symlinks. fd and find don't handle these cases well without similar flags either.
On my own machine, a symlink-following traversal would otherwise never terminate, thanks to the existence of ~/.steam, ~/.wine, /sys, /proc and the like.
It's my personal opinion that a program should always terminate regardless, so I've built in defences against this; it's just a different implementation.
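A self-referential symlink, the simplest version of the loop that recursive_symlink_fs_test.sh constructs, can be created in two commands (the path here is hypothetical):

```shell
mkdir -p /tmp/loopdemo
ln -sfn . /tmp/loopdemo/loop   # 'loop' resolves back to its own directory
# a naive follow-symlinks traversal now descends loop/loop/loop/... forever
```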
Technical Highlights
Key Optimisations
- Getdents: Improved the Linux-specific implementation so the syscall is invoked far fewer times; see the GetDents struct for more information.
- find_char_in_word: Locates the first occurrence of a byte in a 64-bit word using SWAR (SIMD within a register), implemented as a const function
- Compile-time colour mapping: A compile-time perfect hashmap for colouring file paths, defined in a separate repository
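To illustrate the SWAR idea behind find_char_in_word, here is a minimal sketch (the name matches the bullet above, but the exact signature in fdf may differ):

```rust
// Minimal SWAR sketch: find the index of the first occurrence of `needle` in a
// 64-bit word (byte index 0 = lowest-addressed byte on little-endian).
// Illustrative only; fdf's actual implementation may differ.
const fn find_char_in_word(word: u64, needle: u8) -> Option<u32> {
    let broadcast = (needle as u64) * 0x0101010101010101; // needle in every byte
    let xored = word ^ broadcast; // matching bytes become 0x00
    // classic zero-byte test: sets the high bit of every zero byte
    let zeros = xored.wrapping_sub(0x0101010101010101) & !xored & 0x8080808080808080;
    if zeros == 0 { None } else { Some(zeros.trailing_zeros() / 8) }
}
```

On little-endian targets, loading eight path bytes with u64::from_le_bytes lets this replace a byte-at-a-time scan without any architecture-specific intrinsics, and being a const fn it can also run at compile time.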
Constant-Time Directory Entry Processing
The following function provides an elegant way to avoid both branch mispredictions and architecture-specific SIMD instructions during directory entry parsing (a performance-critical loop):
```rust
use libc::dirent64; // sketch reconstructed from the source; see the repository for the full version

// Computational complexity: O(1) - truly constant time: d_reclen is padded to a
// multiple of 8, so the name's NUL terminator lies in the record's final 8-byte
// word. Little-endian variant; the source has a big-endian one (with better
// explanations!). Used on Linux/Solaris/Illumos systems; OpenBSD/macOS store the
// name length trivially. SWAR within a register, so no architecture dependence.
pub const unsafe fn dirent_name_len(entry: *const dirent64) -> usize {
    const NAME_OFFSET: usize = core::mem::offset_of!(dirent64, d_name);
    let word_start = (*entry).d_reclen as usize - 8;
    let word = ((entry as *const u8).add(word_start) as *const u64).read_unaligned();
    // mask any header bytes spilling into this word so they can't read as NUL
    let masked = word | ((1u64 << (NAME_OFFSET.saturating_sub(word_start) * 8)) - 1);
    // classic SWAR zero-byte test: sets the high bit of every zero byte
    let zeros = masked.wrapping_sub(0x0101010101010101) & !masked & 0x8080808080808080;
    word_start + (zeros.trailing_zeros() as usize >> 3) - NAME_OFFSET
}
```
Why?
I started this project because I found find slow and wanted to learn how to interface directly with the kernel. What began as a random experiment turned out to be a genuinely useful tool - one I'll probably use for the rest of my life, which is much more interesting than a project I'd just create and forget about.
At the core, this is about learning.
When I began, I had barely used Linux for a few months and didn't even know C, so there are some rough ABI edges. But along the way I've picked up plenty of low-level skills, and this project has been really useful for that!
Performance Motivation
Even though fdf is already faster than fd in all measured cases, I'm planning to experiment with filtering before allocation (I don't stop at "good enough"!). Rust's std::fs has some inefficiencies: too much heap allocation, needless file-descriptor manipulation, constant strlen calculations, and its use of readdir (not optimal because it implicitly stat-calls every file it sees!). Rewriting all of it on top of libc was the ideal way to bypass that and learn in the process.
Notably, the standard library keeps file descriptors open (UNIX-specific) until the last reference to the inner ReadDir disappears, which fundamentally causes a lot more IO. It also tends to make 'stat'-style calls heavily, which seemed inefficient.
(I have a shell script documenting the syscall differences; it's crude but it works well.) Available here
Development Philosophy
- **Feature stability before breakage**: I won't push breaking changes or advertise this anywhere until I've established a good baseline.
- **Open to contributions**: Once the codebase stabilises, I welcome others to add features if they're so inclined!
- **Pragmatic focus**: Some areas, such as datetime filtering, are especially complex and need a lot of investigation!
In short, this project is a personal exploration into performance, low-level programming, and building practical tools - with the side benefit that it's actually good at what it does.
Acknowledgements/Disclaimers
I've directly taken code from fnmatch-regex (found at the link) and modified it so I could convert globs to regex patterns trivially; this simplifies the string-filtering model by delegating it to Rust's extremely fast regex crate. Notably, I modified it because it's quite old and had dependencies I was able to remove.
(I have emailed and received approval from the author above.)
I've also borrowed some SWAR tricks from the standard library (see link); through this I found a much more rigorous way of doing some bit tricks.
I additionally emailed the author of memchr and got some nice tips; a great guy, and someone I respect wholeheartedly!
I believe referencing similar work helps validate complex algorithms!
Future Plans
Modularisation
While avoiding excessive fragmentation, I plan to extract reusable components (like platform-specific FFI utilities) into separate crates. This will improve maintainability without sacrificing the project's cohesive design.
Feature Enhancements (Planned)
- **API cleanup**: Currently the CLI is the main focus, but I'd like to fix that eventually!
- **DateTime filtering**: Fast, attribute-based file filtering by time (high priority despite my own infrequent use; I have a lot of test cases for this, though admittedly I've been focusing on tidying up the API)
- **POSIX compliance**: Mostly done; I don't expect to extend this beyond Linux/BSD/macOS/Illumos/Solaris (the remaining POSIX platforms are mostly embedded, correct me if I'm wrong!)
Platform Expansion
Windows Support: Acknowledged as a significant undertaking, requiring an almost entirely separate codebase (portability isn't fun), but valuable for both usability and learning Windows internals.
Enhance shell completions
Core Philosophy
The CLI will remain simple (avoiding overwhelming help menus(looking at you, ripgrep!)) and efficient (prioritising performance in both design and implementation).
Installation and Usage
# Clone & build
# Optional system install
# Find all JPG files in the home directory (excluding hidden files)
# Find all Python files in /usr/local (including hidden files)
# Generate shell completions for Zsh/bash (also supports powershell/fish!)
# For Zsh
# For Bash
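For a standard cargo workflow, building and installing would look roughly like this (sketch only: the repository URL is an assumption, and fdf's search and completion flags should be checked against its --help output):

```sh
# sketch: assumes the repository lives at this URL and a standard cargo layout
git clone https://github.com/alexcu2718/fdf
cd fdf
cargo build --release    # clone & build
cargo install --path .   # optional system install
```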
## Options
Potential Future Enhancements
1. Custom Arena Allocator
-- Investigate implementing from scratch
-- Reference implementation: Microsoft's Edit Arena
-- (Caveat: see the comments at https://github.com/microsoft/edit/blob/main/src/arena/release.rs,
-- which essentially say that the allocator I already use (MiMalloc) is just as fast; I think it'd be interesting to test this!)
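For context, the essence of a bump arena fits in a few lines. This is a purely illustrative sketch, not fdf's planned design:

```rust
// Toy bump arena: allocation is a pointer bump, deallocation is wholesale reset.
struct Arena {
    buf: Vec<u8>,
    used: usize,
}

impl Arena {
    fn new(capacity: usize) -> Self {
        Arena { buf: vec![0; capacity], used: 0 }
    }
    // Hand out `len` bytes; no per-allocation bookkeeping or free lists.
    fn alloc(&mut self, len: usize) -> Option<&mut [u8]> {
        let aligned = (self.used + 7) & !7; // keep 8-byte alignment
        if aligned + len > self.buf.len() {
            return None; // out of space; a real arena would grow or chain blocks
        }
        self.used = aligned + len;
        Some(&mut self.buf[aligned..aligned + len])
    }
    // Free everything at once: the arena's whole advantage.
    fn reset(&mut self) {
        self.used = 0;
    }
}
```

The appeal for a file finder is that per-directory scratch buffers can be handed out and thrown away in bulk, though (as the Microsoft comments note) a good general-purpose allocator may already be just as fast.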
2. io_uring System Call Batching
-- Explore batched stat operations (and others as appropriate)
-- Significant challenges:
-- Current lack of getdents support in io_uring
-- Necessitates async runtime integration (potential Tokio dependency)
-- Conflicts with minimal-dependency philosophy
-- Linux-only, which isn't too appealing for such a difficult addition (I'll probably not do it)
3. Native Threading Implementation
-- Replace Rayon dependency
-- Develop custom work-distribution algorithm
-- Current status: Experimental approaches underway
4. Allocation-Optimised Iterator Adaptor
-- Design filter mechanism avoiding:
-- Unnecessary directory allocations(via a closure with a function called on readdir/getdent)
5. macOS/BSD-Specific Optimisations (potentially)
-- Implement an iterator using getattrlistbulk (this may be possible for BSD too, or perhaps just linking getdirentries on BSD systems)
-- Test repo found at https://github.com/alexcu2718/mac_os_getattrlistbulk_ls
-- This allows for much more efficient syscalls to get filesystem entries
-- (Admittedly I've been a bit hesitant about this, because the API is quite complex and unwieldy!)