fdf - High-Performance POSIX File Finder
fdf is a high-performance POSIX file finder written in Rust with extensive C FFI.
It serves as a lightweight alternative to tools such as fd and find, with a focus on speed, efficiency, and cross-platform compatibility. Benchmarks demonstrate fdf running up to 2x faster than comparable tools, achieved through low-level optimisation, SIMD techniques, and direct kernel interfacing.
PLEASE NOTE: This project is due to be renamed before the 1.0 release.
Quick Installation:
Project Status
This is primarily a learning and performance exploration project. Whilst already useful and performant, it remains under active development towards a stable 1.0 release. The name 'fdf' is a temporary placeholder.
The implemented subset performs exceptionally well, surpassing fd in equivalent feature sets, though fd offers broader functionality. This project focuses on exploring hardware-specific code optimisation rather than replicating fd's complete feature set.
While the CLI is usable, the internal library is not stable yet. Alas!
Platform Support (64-bit only)
Fully Supported and CI Tested
- Linux (x86_64, s390x (big-endian), Alpine (musl libc))
- macOS (Intel and Apple Silicon)
- FreeBSD (x86_64)
Compiles with Limited Testing
Note: GitHub Actions does not yet provide Rust 2024 support for most of these platforms. Additional checks will be added when available.
- OpenBSD, NetBSD, DragonflyBSD (tested occasionally in QEMU; minor fixes expected if issues arise)
- Android (tested on my phone)
- Illumos and Solaris (x86_64, verified with QEMU)
- aarch64 Linux and riscv Linux (removed from GitHub Actions because the runners were very unreliable)
Not Yet Supported
- Windows: Requires a significant rewrite due to architectural differences with libc. Windows already has highly effective tools such as Everything, so the plan is to work on this after 1.0, once the POSIX feature set is stable.
Non supported filesystems
This tool does not support reiserfs in any form because of its extremely long maximum filename length. Every other file system is supported. It is not worth sacrificing the performance improvements to support an extremely niche file system used by 0.001% of people (if that...).
The build script deliberately refuses to build on reiserfs.
Testing
The project includes comprehensive testing with 90+ Rust tests and 15+ correctness benchmarks comparing against fd.
Note: Miri validation (Rust's undefined behaviour detector) cannot be used due to the extensive libc calls. Intensive testing and valgrind validation are used instead. See the valgrind script here
- Rust tests: Available here
- Shell scripts clone the LLVM repository to provide an accurate testing environment
- Tests run via GitHub Actions on all supported platforms
Running the Full Test Suite:
TMP_DIR=""
# If on Android, ensure the script is executable
if ; then
fi
This executes a comprehensive suite of internal library tests, CLI tests, and benchmarks.
Performance Benchmarks
The benchmarks are fully repeatable using the testing code above and cover file type filtering, extension matching, file sizes, and many other scenarios. The following results were obtained on a local system using the LLVM repo to provide realistic usage examples. (The tests were run with hyperfine and are summarised here to save space.)
(*Tested on Linux; results on other operating systems will probably be lower due to Linux-specific optimisations.)
(I cannot test accurately in QEMU due to virtualisation overhead, and I do not have a Mac.)
Rough tests indicate a significant 50%+ speedup on the BSDs, Illumos, and Solaris, but macOS has fewer optimisations. Testing in QEMU is perhaps not ideal for macOS!
Average speedup: 1.8× faster
Distinctions from fd/find
Symlink resolution in my method differs from fd and find. Although I generally advise against following symlinks, the option exists for completeness.
When following symlinks, behaviour varies slightly. For example, fd can enter infinite loops with recursive symlinks (see recursive_symlink_fs_test.sh), whereas my implementation prevents hangs. It may, however, return more results than expected.
To avoid issues, use --same-file-system when traversing symlinks. Both fd and find also handle them poorly without such flags. My approach ensures the program always terminates safely, even in complex directories like ~/.steam, ~/.wine, /sys, and /proc.
The -I flag currently includes directories in the output (rather than controlling ignore files, as -I does in fd). I will change this in future.
Technical Highlights
Key Optimisations
- getdents64: Optimised the Linux/Android-specific directory reading by significantly reducing the number of getdents system calls.
- find_char_in_word/find_last_char_in_word: Locates the first/last occurrence of a byte in a 64-bit word using SWAR (SIMD within a register), implemented as a const function.
- Compile-time colour mapping: A compile-time perfect hashmap for colouring file paths, defined in a separate repository.
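To make the SWAR idea in the list above concrete, here is a minimal sketch of searching an 8-byte word for a byte without a per-byte loop. It is not the code from this repository: the names `first_byte_index` and `last_byte_index` and their signatures are hypothetical, chosen only for illustration, and the bit tricks are the well-known zero-byte tests from Hacker's Delight.

```rust
const LO: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // 0x80 in every byte lane
const LOW7: u64 = 0x7F7F_7F7F_7F7F_7F7F; // low 7 bits of every byte lane

/// Hypothetical helper: index of the first byte equal to `needle`, if any.
pub const fn first_byte_index(needle: u8, bytes: [u8; 8]) -> Option<usize> {
    // XOR with the needle broadcast to all lanes: matching bytes become 0x00.
    let x = u64::from_le_bytes(bytes) ^ (LO * needle as u64);
    // Classic zero-byte test. Borrows can set spurious flags above the first
    // zero byte, but the lowest flagged lane is always a genuine match.
    let mask = x.wrapping_sub(LO) & !x & HI;
    if mask == 0 {
        None
    } else {
        Some((mask.trailing_zeros() / 8) as usize)
    }
}

/// Hypothetical helper: index of the last byte equal to `needle`, if any.
pub const fn last_byte_index(needle: u8, bytes: [u8; 8]) -> Option<usize> {
    let x = u64::from_le_bytes(bytes) ^ (LO * needle as u64);
    // Carry-free variant: flags exactly the lanes that are zero, so the
    // highest flagged lane can be trusted as well.
    let mask = !(((x & LOW7) + LOW7) | x | LOW7);
    if mask == 0 {
        None
    } else {
        Some(7 - (mask.leading_zeros() / 8) as usize)
    }
}
```

Both helpers are `const fn`, so they can also be evaluated at compile time. The bytes are read in little-endian order so that lane 0 always corresponds to index 0, regardless of the host's endianness.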
Constant-Time Directory Entry Processing
The following function, explained further in utils.rs in the source code, offers an elegant way to avoid both branch mispredictions and SIMD instructions during directory entry parsing (a performance-critical loop):
// Computational complexity: O(1) - truly constant time
// Used mostly on Linux type systems
// SIMD within a register, so no architecture dependence
//http://www.icodeguru.com/Embedded/Hacker%27s-Delight/043.htm
pub const unsafe
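The function body is not reproduced above, so what follows is a separate, simplified sketch of the general idea rather than the project's implementation. It assumes a directory record whose total length is a multiple of 8 and whose name is nul-terminated, so the terminator always falls within the final 8 bytes. The helper `name_len_from_record` and its slice-based signature are hypothetical; the real code works on raw `dirent64` pointers.

```rust
const LO: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // 0x80 in every byte lane

/// Hypothetical helper: length of the nul-terminated name that starts at
/// `name_offset` inside `record`, found by inspecting only the last 8 bytes.
pub fn name_len_from_record(record: &[u8], name_offset: usize) -> usize {
    let reclen = record.len();
    assert!(reclen >= 8 && reclen % 8 == 0 && name_offset < reclen);

    // Load the final 8 bytes of the record as one little-endian word.
    let tail_start = reclen - 8;
    let mut tail = [0u8; 8];
    tail.copy_from_slice(&record[tail_start..]);
    let mut word = u64::from_le_bytes(tail);

    // For short names, header bytes (which may legitimately be zero) share
    // the last word with the name. Force those lanes to 0xFF so they can
    // never be mistaken for the terminator.
    let header_bytes = name_offset.saturating_sub(tail_start);
    if header_bytes > 0 {
        word |= u64::MAX >> (64 - 8 * header_bytes);
    }

    // SWAR zero-byte test: the lowest flagged lane is the first nul byte.
    let zero_mask = word.wrapping_sub(LO) & !word & HI;
    let first_nul = (zero_mask.trailing_zeros() / 8) as usize;

    tail_start + first_nul - name_offset
}
```

On Linux x86_64 the `d_name` field of `dirent64` begins at byte 19, so a caller would pass 19 as `name_offset`. The work is the same whatever the name length, which is the property the section title refers to. This sketch keeps one easily predicted branch for the header mask for readability.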
Why?
I started this project because I found find slow and wanted to learn how to interface directly with the kernel. What began as a random experiment turned out to be a genuinely useful tool - one I'll probably use for the rest of my life, which is much more interesting than a project I'd just create and forget about.
At the core, this is about learning.
When I began, I had barely used Linux or Rust for a few months and didn't even know C, so there are some rough ABI edges. But along the way, I've picked up low-level skills and this project has been really useful for that!
Performance Motivation
Even though fdf is already faster than fd in every case I have benchmarked, I'm planning to experiment with filtering before allocation (I don't stop at good enough!). Rust's std::fs has some inefficiencies: too much heap allocation, file descriptor manipulation, constant strlen calculations, and usage of readdir (not optimal because it implicitly stat calls every file it sees!). Rewriting all of it using libc was the ideal way to bypass that and learn in the process.
Notably, on UNIX the standard library keeps file descriptors open until the last reference to the inner ReadDir disappears. Because UNIX limits the number of open file descriptors, this can cause a form of 'rate limiting', which is not ideal.
It also tends to make 'stat'-style calls heavily, which is very inefficient.
I have a shell script documenting the syscall differences, available here (it's crude but it works well).
Development Philosophy
- Feature stability before breakage: I won't push breaking changes or advertise this anywhere until I've got a good baseline.
- Open to contributions: Once the codebase stabilises, I welcome others to add features, if they're so inclined!
In short, this project is a personal exploration into performance, low-level programming, and building practical tools - with the side benefit of making a useful tool and learning a crazy amount!
Acknowledgements/Disclaimers
I have taken code directly from fnmatch-regex (found at the link) and modified it so that I could convert globs to regex patterns trivially. This simplifies the string filtering model by delegating it to Rust's extremely fast regex crate. I modified it because it is quite old and had dependencies I was able to remove.
(I have emailed and received approval from the author above)
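As a rough illustration of what converting a glob to a regex pattern involves, here is a deliberately simplified sketch. It is not the fnmatch-regex code or the modified version used in this project: the function `glob_to_regex` is hypothetical, handles only `*` and `?`, and escapes bracket expressions instead of supporting them.

```rust
/// Hypothetical helper: translate a simple glob into an anchored regex
/// pattern string. Only `*` and `?` are treated as wildcards.
pub fn glob_to_regex(glob: &str) -> String {
    let mut out = String::with_capacity(glob.len() + 2);
    out.push('^');
    for ch in glob.chars() {
        match ch {
            // `*` matches any run of characters, `?` exactly one.
            '*' => out.push_str(".*"),
            '?' => out.push('.'),
            // Escape characters that are special in regex syntax.
            '.' | '+' | '(' | ')' | '|' | '^' | '$' | '[' | ']' | '{' | '}' | '\\' => {
                out.push('\\');
                out.push(ch);
            }
            _ => out.push(ch),
        }
    }
    out.push('$');
    out
}
```

The resulting string could then be handed to a regex engine for matching. A full fnmatch translation also needs character classes, escapes inside the glob, and path separator rules, which this sketch leaves out.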
I have also adapted some SWAR tricks from the standard library (see link); through this I found a much more rigorous way of doing some bit tricks.
I also emailed the author of memchr and got some nice tips. Great guy, and someone I respect wholeheartedly!
Future Plans
Feature Enhancements (Planned)
More elaborate improvements/fixes are discussed at this link.
API cleanup: currently the CLI is the main focus, but I'd like to fix that eventually!
POSIX Compliance: Mostly done. I don't expect to extend this beyond Linux/BSD/macOS/Illumos/Solaris/Android (the others are mostly embedded; correct me if I'm wrong!). I have tentative work for other operating systems, such as l4re and horizon, but ultimately it is hard even to emulate these. Some operating systems are plainly not supported, such as vita/nuttx (due to lacking inodes) and hurd (due to unbounded filenames).
Ultimately, these are extremely fringe use cases and I think it is pointless to focus on them.
Platform Expansion
Windows Support: Acknowledged as a significant undertaking requiring an almost entirely separate codebase (portability ain't fun), but valuable for both usability and learning Windows internals.
Installation and Usage
# Clone & build
# Optional system install
# Find all JPG files in the home directory (excluding hidden files)
# Find all Python files in /usr/local (including hidden files)
# Null-terminate all output instead of using newlines, mainly for piping to other commands
# Generate shell completions for Zsh/bash (also supports powershell/fish!)
# For Zsh
# For Bash
## Options
Potential Future Enhancements
1. io_uring System Call Batching
- Investigate batching of stat and similar operations.
- Key challenges:
  - No native getdents support in io_uring.
  - Would require async runtime integration (e.g. Tokio).
  - Conflicts with the project's minimal-dependency design.
- Linux-only feature, making it a low-priority and high-effort addition. I will likely NOT do this.
2. Native Threading Implementation
- Replace the Rayon dependency with a custom threading model. Honestly probably impossible for me to outperform it.
3. Allocation-Optimised Iterator Adaptor
- Implement a filtering mechanism that avoids unnecessary directory allocations.
- Achieved via a closure-based approach triggered during readdir or getdents calls.
- Although the cost of allocations doesn't seem too bad, I will look at this again at some point.
- Maybe achieved via a lending iterator type approach? See link for reference.
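The closure-based idea in the list above can be sketched as follows. This is only an illustration of "filter first, allocate second" under simplifying assumptions: the borrowed names stand in for entries still sitting in a readdir/getdents buffer, and the helper `collect_matching` is hypothetical rather than part of fdf's API.

```rust
/// Hypothetical helper: walk borrowed entry names and allocate owned copies
/// only for the names accepted by `keep`.
pub fn collect_matching<'a, I, F>(names: I, mut keep: F) -> Vec<Vec<u8>>
where
    I: IntoIterator<Item = &'a [u8]>,
    F: FnMut(&[u8]) -> bool,
{
    let mut out = Vec::new();
    for name in names {
        // The predicate inspects the borrowed bytes; nothing is allocated yet.
        if keep(name) {
            // Only accepted names are copied into owned storage.
            out.push(name.to_vec());
        }
    }
    out
}
```

For example, calling `collect_matching(names, |n| n.ends_with(b".rs"))` would copy only the names ending in `.rs`, so rejected entries never cost a heap allocation. A lending iterator would express the same borrow-then-decide pattern without collecting into a `Vec`.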