fdf
(Jeremy Clarkson voice) 'Probably the fastest finder you'll find on POSIX for regex/glob file matching (see the benchmarks against fd below)'
COMPATIBILITY STATE
1. Working on 64-bit Linux. Tested on various versions of Debian/Ubuntu/Arch/Fedora.
2. Working on AArch64 Linux/Android Debian (basically, it works on my phone via Termux, and I didn't need to change anything!).
3. macOS 64-bit (tested on Sonoma).
4. Free/Open/Net/Dragonfly BSD 64-bit (it compiles on all of these platforms, but is only tested on FreeBSD).
5. Tested on 64-bit PPC Linux (Ubuntu).
6. Alpine/musl.
INTRO
NOT IN A STATE FOR USE/CONTRIBUTION, YE HAVE BEEN WARNED!
I have to change the name first and make the API actually coherent (I haven't tried using it as a crate yet).
As I fix and improve certain features, I will make it open to contributions.
Honestly, this is still a hobby project that needs a lot of work. It works perfectly for the subset I've implemented, but it's far from complete.
The CLI is basically an afterthought because I'm focusing on the lower levels first and working up in functionality, like ascending out of Plato's cave (increasing abstraction). Essentially, I add the CLI features at the end (make the foundations strong so you can do crazy stuff).
It has better performance than fd on equivalent feature sets, but fd has an immense feature set that I'm not going to replicate.
Rather, I'm working on this project for myself because I really wanted to know what happens when you write optimal, hardware-specific code (and how to write it!).
WHY?
Well, I found `find` slow, I didn't know fd existed, and I didn't expect some random test project to actually be good.
Then finally, the reward is a tool I can use for the rest of my life to find stuff.
Mostly though, I just enjoy learning.
Future plans?
I'd probably just keep the CLI stuff simple.
Add some extra metadata filters (because I get a lot of metadata cheaply via specialisation!).
Add POSIX compatibility in general (running illumos/Solaris under QEMU isn't straightforward; it's quite esoteric).
Add Windows... Well, this would take a fundamental rewrite because of architectural differences, but I might do it. (Who uses the terminal on Windows?)
Fundamentally, I want to develop something that's simple to use (running --help shouldn't give you the bible) and exceedingly efficient.
Cool bits
Speed! In every benchmark tested so far, it's between roughly 1.2x and 2x as fast as fd (really approximating here) on regex/glob feature sets. Check the benchmarks!
dirent_const_strlen const fn: get the strlen of a dirent64 name in constant time with no branches (benchmarks below).
cstr! macro: use a byte slice as a pointer (automatically initialises memory and adds a null terminator for FFI use), or alternatively cstr_n (MEANT FOR FILEPATHS!).
Below is a compile-time hash map of file extensions to their corresponding ANSI colour codes, based on the LS_COLORS environment variable, defined as
pub static LS_COLOURS_HASHMAP:
(It's defined in another GitHub repo of mine at https://github.com/alexcu2718/compile_time_ls_colours)
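To make the idea behind cstr! a bit more concrete, here is a minimal sketch of the same technique: copy a byte slice into a zero-initialised stack buffer so a null terminator always follows it, then hand it to FFI. This is not the crate's macro. `stack_cstr` and `as_c_str` are hypothetical names invented for this example, and the real macro may well handle buffer sizing and uninitialised memory differently.

```rust
use std::ffi::CStr;

/// Hypothetical helper (not the crate's `cstr!` macro): copy `bytes` into a
/// zeroed stack buffer of `N` bytes, so at least one NUL byte follows the data.
/// Returns `None` if there is no room for the terminator or if `bytes`
/// already contains an interior NUL.
fn stack_cstr<const N: usize>(bytes: &[u8]) -> Option<[u8; N]> {
    if bytes.len() >= N || bytes.contains(&0) {
        return None;
    }
    let mut buf = [0u8; N];
    buf[..bytes.len()].copy_from_slice(bytes);
    Some(buf)
}

/// View the buffer as a C string, stopping at the first NUL byte.
/// `CStr::as_ptr` then yields the `*const c_char` that FFI calls expect.
fn as_c_str(buf: &[u8]) -> Option<&CStr> {
    CStr::from_bytes_until_nul(buf).ok()
}
```

Returning the array by value (rather than a pointer into a block-local buffer) keeps the storage alive for as long as the caller holds it, which is the main design choice in this sketch.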
SEE BENCHMARKS IN const_str_benchmark.txt for better details (it covers short strings under 8 chars and max-length strings of 255), and ideally read my benches/dirent_bench.rs
//The code is explained better in the true function definition (this is crate agnostic)
//This is the little-endian implementation, see crate for modified version for big-endian
// Only used on Linux systems, OpenBSD/macos systems store the name length trivially.
pub const unsafe
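As a rough illustration of the constant-time idea, here is a minimal sketch that works on a plain byte slice instead of a raw dirent64 pointer. It assumes the usual Linux layout (name starting at byte 19, record length a multiple of 8 and at least 24) and finds the terminator by scanning only the last 8-byte word with a SWAR zero-byte test. `name_len`, `make_record` and `NAME_OFFSET` are hypothetical names for this example, not the crate's API, and the safe slice indexing adds bounds checks that the real unsafe version would not have.

```rust
/// Assumed offset of the name in a Linux `dirent64`:
/// d_ino (8) + d_off (8) + d_reclen (2) + d_type (1).
const NAME_OFFSET: usize = 19;

/// Length of the NUL-terminated name inside `record`.
/// `record.len()` stands in for `d_reclen` (a multiple of 8, at least 24).
const fn name_len(record: &[u8]) -> usize {
    let reclen = record.len();
    let s = reclen - 8; // the terminator always lies in the last 8 bytes
    // Interpret the last 8 bytes as a little-endian word, on any host.
    let word = u64::from_le_bytes([
        record[s], record[s + 1], record[s + 2], record[s + 3],
        record[s + 4], record[s + 5], record[s + 6], record[s + 7],
    ]);
    // For reclen == 24 the word starts at byte 16, so its low 3 bytes are
    // header bytes (d_reclen, d_type). Force them non-zero without branching.
    let word = word | (0x00FF_FFFF_u64 * ((reclen == 24) as u64));
    // SWAR zero-byte test: the lowest set bit marks the first zero byte.
    const LO: u64 = 0x0101_0101_0101_0101;
    const HI: u64 = 0x8080_8080_8080_8080;
    let zeros = word.wrapping_sub(LO) & !word & HI;
    let idx = (zeros.trailing_zeros() >> 3) as usize;
    s + idx - NAME_OFFSET
}

/// Hypothetical test helper: build a record with zero padding after the name
/// and the record length stored at bytes 16..18, as in the assumed layout.
fn make_record(name: &[u8]) -> Vec<u8> {
    let reclen = (NAME_OFFSET + name.len() + 1 + 7) / 8 * 8;
    let mut rec = vec![0u8; reclen];
    rec[16..18].copy_from_slice(&(reclen as u16).to_le_bytes());
    rec[NAME_OFFSET..NAME_OFFSET + name.len()].copy_from_slice(name);
    rec
}
```

The work done is the same whether the name is 1 byte or 255 bytes long, which is where the "constant time" claim comes from.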
| Command | Mean | Min | Max | Relative |
|:---|---:|---:|---:|---:|
| `fdf . '/home/alexc' -HI --type l` | 259.2 ± 5.0 | 252.7 | 267.5 | 1.00 |
| `fd -HI '' '/home/alexc' --type l` | 418.2 ± 12.8 | 402.2 | 442.6 | 1.61 ± 0.06 |

| Command | Mean | Min | Max | Relative |
|:---|---:|---:|---:|---:|
| `fdf -HI --extension 'jpg' '' '/home/alexc'` | 292.6 ± 2.0 | 289.5 | 295.8 | 1.00 |
| `fd -HI --extension 'jpg' '' '/home/alexc'` | 516.3 ± 5.8 | 509.1 | 524.1 | 1.76 ± 0.02 |
Requirements
- Linux/macOS/BSD only: relies on specific POSIX syscalls.
- Only tested on 64-bit targets (including big-endian 64-bit PPC).
Installation
# Clone & build
# Optional system install
# Find all files containing "config" in the current directory and subdirectories (case-insensitive and excluding directories+hidden files)
# Find all JPG files in the home directory (excluding hidden files)
# Find all Python files in /usr/local (including hidden files)
## Options (T)
TODO LIST (Maybe):
-- Arena Allocator (potentially): Written from scratch. See Microsoft's edit for a nice example: https://github.com/microsoft/edit/tree/main/src/arena
-- io_uring for Batched Syscalls: e.g., batched open/read operations. This will be extremely challenging.
-- String Interning: Trivial for ASCII, but efficient Unicode handling is an entirely different beast.
-- Threading Without Rayon: My attempts have come close, but aren’t quite there yet. I'll rely on Rayon for now until I can come up with a smart way to implement an appropriate work-distributing algorithm. TODO!
-- Iterator Adaptor + Filter: Some kind of adaptor that avoids a lot of unnecessary allocations on non-directories.
-- Syscall Limits: I think there’s ultimately a hard limit on syscalls.
I've experimented with an early Zig io_uring + getdents implementation — but it's well outside my comfort zone (A LOT).
I’ll probably give it a go anyway (if possible).
**** THIS IS NOT FINISHED. I have no idea what the long-term plans are — I'm just trying to make stuff go fast and learn, OK?