pub struct FileIndex {
pub entries: Vec<FileEntry>,
/* private fields */
}
The indexed result of one filesystem walk. All rules share this index —
the walk happens once per alint check invocation.
path_set is a lazy HashSet<Arc<Path>> over file entries.
Built once on first call to FileIndex::contains_file /
FileIndex::file_path_set and re-used across all subsequent
lookups. Cross-file rules that ask “does this exact path
exist?” (most importantly file_exists instantiated by
for_each_dir) hit the set instead of doing an O(N) linear
scan over every entry. At 1M files in a 5,000-package
monorepo, this turns the fan-out shape from O(D × N) =
5 × 10⁹ ops to O(D) = 5,000 lookups.
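The lazy set described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `Entry`, `SketchIndex` and the `entry` helper are hypothetical stand-ins, and the real `FileEntry` may carry different fields.

```rust
use std::collections::HashSet;
use std::path::Path;
use std::sync::{Arc, OnceLock};

// Hypothetical stand-in for the crate's FileEntry; the real field set may differ.
pub struct Entry {
    pub path: Arc<Path>,
    pub is_dir: bool,
}

// Hypothetical stand-in for FileIndex, reduced to the lazy path set.
pub struct SketchIndex {
    pub entries: Vec<Entry>,
    path_set: OnceLock<HashSet<Arc<Path>>>,
}

impl SketchIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        Self { entries, path_set: OnceLock::new() }
    }

    // Builds the set on the first call; later calls return the cached set.
    pub fn file_path_set(&self) -> &HashSet<Arc<Path>> {
        self.path_set.get_or_init(|| {
            self.entries
                .iter()
                .filter(|e| !e.is_dir)
                .map(|e| Arc::clone(&e.path))
                .collect()
        })
    }

    // Expected O(1) hash probe instead of a scan over every entry.
    pub fn contains_file(&self, rel: &Path) -> bool {
        self.file_path_set().contains(rel)
    }
}

// Small fixture helper for trying the sketch out.
pub fn entry(path: &str, is_dir: bool) -> Entry {
    Entry { path: Arc::from(Path::new(path)), is_dir }
}
```

With D directories each asking one "does this path exist?" question, the total cost in this sketch is one O(N) build plus D hash probes, rather than D scans of N entries.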
parent_to_children (v0.9.8) is a second lazy index: for
each directory, the indices of its DIRECT children in
entries. Cross-file rules that previously scanned all
entries per matched dir (dir_only_contains, dir_contains)
now look up children_of(dir) (O(1)) instead of doing a
per-dir O(N) scan. This closes the v0.9.5 → v0.9.8 cliff: at 1M
files / 5K dirs, dir_only_contains drops from 5 billion
path-parent comparisons to ~1 million.
Fields
entries: Vec<FileEntry>
Implementations
impl FileIndex
pub fn from_entries(entries: Vec<FileEntry>) -> Self
Construct a FileIndex from raw entries. Equivalent to
FileIndex { entries, ..Default::default() } but spelled
out so test/bench fixtures don’t have to know about the
internal lazy path_set field.
pub fn files(&self) -> impl Iterator<Item = &FileEntry>
pub fn dirs(&self) -> impl Iterator<Item = &FileEntry>
pub fn total_size(&self) -> u64
pub fn file_path_set(&self) -> &HashSet<Arc<Path>>
Get (lazily building on first call) the hash-indexed set
of all file (non-dir) paths in this index. Subsequent
calls return the cached set. Concurrent first calls are
safe (OnceLock ensures a single initialiser wins).
pub fn contains_file(&self, rel: &Path) -> bool
O(1) “does this exact relative path exist as a file?”
query. Triggers the lazy build of the path set on first
call. Use this instead of iterating files() whenever a
rule needs to check a fully-qualified path — at scale,
the hash lookup is several orders of magnitude faster.
pub fn find_file(&self, rel: &Path) -> Option<&FileEntry>
Find a file entry by its exact relative path. Uses the
lazy path set for the existence check, then re-scans
entries linearly to return the matching &FileEntry
(entries are pinned, but the set stores Arc<Path> keys
not direct entry references). Most callers want the
boolean answer — prefer FileIndex::contains_file.
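A rough sketch of the two-step lookup described above: a hash probe to rule out misses, then a linear scan to recover the entry on a hit. `Entry`, `SketchIndex` and the `file` helper are hypothetical stand-ins, and the `size` field is an assumption added only so the returned entry carries something a boolean would not.

```rust
use std::collections::HashSet;
use std::path::Path;
use std::sync::{Arc, OnceLock};

// Hypothetical stand-in for FileEntry; `size` is an illustrative field.
pub struct Entry {
    pub path: Arc<Path>,
    pub is_dir: bool,
    pub size: u64,
}

// Hypothetical stand-in for FileIndex.
pub struct SketchIndex {
    pub entries: Vec<Entry>,
    path_set: OnceLock<HashSet<Arc<Path>>>,
}

impl SketchIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        Self { entries, path_set: OnceLock::new() }
    }

    fn file_path_set(&self) -> &HashSet<Arc<Path>> {
        self.path_set.get_or_init(|| {
            self.entries
                .iter()
                .filter(|e| !e.is_dir)
                .map(|e| Arc::clone(&e.path))
                .collect()
        })
    }

    pub fn find_file(&self, rel: &Path) -> Option<&Entry> {
        // Misses are answered by the hash set alone.
        if !self.file_path_set().contains(rel) {
            return None;
        }
        // Hits still pay an O(N) scan, because the set holds paths, not entries.
        self.entries.iter().find(|e| !e.is_dir && &*e.path == rel)
    }
}

// Small fixture helper for trying the sketch out.
pub fn file(path: &str, size: u64) -> Entry {
    Entry { path: Arc::from(Path::new(path)), is_dir: false, size }
}
```

In this shape only a miss is cheap, which is why the boolean query is the better default when the entry itself is not needed.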
pub fn children_of(&self, dir: &Path) -> &[usize]
Direct children of dir, as indices into Self::entries.
Triggers the lazy build of the parent → children map on
first call across any directory.
Returns an empty slice when dir has no children or isn’t
in the index. Indices are stable across the lifetime of
&self — use them via &self.entries[i] at the call site
to dereference.
Build cost: O(N) (one pass over entries, one HashMap
insert per entry). Lookup cost: O(1) HashMap probe.
Replaces the O(D × N) for dir in dirs() { for file in files() { is_direct_child(file, dir) ... } } shape that
dir_only_contains and dir_contains previously used.
At 1M files × 5K matched dirs, that’s a 5,000× reduction
in total comparison count.
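The build and lookup costs above can be illustrated with a minimal sketch. It assumes a `HashMap<PathBuf, Vec<usize>>` keyed by parent path; the crate's real key type and layout are not shown in this page. `Entry`, `SketchIndex`, `entry` and `dir_only_contains_ext` are hypothetical names, and the rule helper is only a guess at the shape such a rule might take.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: PathBuf,
    pub is_dir: bool,
}

// Hypothetical stand-in for FileIndex, reduced to the parent -> children map.
pub struct SketchIndex {
    pub entries: Vec<Entry>,
    parent_to_children: OnceLock<HashMap<PathBuf, Vec<usize>>>,
}

impl SketchIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        Self { entries, parent_to_children: OnceLock::new() }
    }

    pub fn children_of(&self, dir: &Path) -> &[usize] {
        // One pass over entries, one map insert per entry, on first call only.
        let map = self.parent_to_children.get_or_init(|| {
            let mut map: HashMap<PathBuf, Vec<usize>> = HashMap::new();
            for (i, e) in self.entries.iter().enumerate() {
                if let Some(parent) = e.path.parent() {
                    map.entry(parent.to_path_buf()).or_default().push(i);
                }
            }
            map
        });
        // Unknown or childless directories yield an empty slice.
        map.get(dir).map(Vec::as_slice).unwrap_or(&[])
    }
}

// Hypothetical rule helper: true when every direct file child of `dir`
// has the extension `ext`. Touches only the children of `dir`.
pub fn dir_only_contains_ext(index: &SketchIndex, dir: &Path, ext: &str) -> bool {
    index
        .children_of(dir)
        .iter()
        .map(|&i| &index.entries[i])
        .filter(|e| !e.is_dir)
        .all(|e| e.path.extension().and_then(|s| s.to_str()) == Some(ext))
}

// Small fixture helper for trying the sketch out.
pub fn entry(path: &str, is_dir: bool) -> Entry {
    Entry { path: PathBuf::from(path), is_dir }
}
```

Summed over all directories, the children slices visit each entry at most once, which is where the drop from D × N comparisons to roughly N comes from.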
pub fn file_basenames_of<'a>(&'a self, dir: &Path) -> impl Iterator<Item = &'a str> + 'a
Direct file children’s basenames under dir. Filters out
subdirectories — pure file basenames only. Returns an
iterator borrowing into entries[i].path for each match;
no allocation per call (the underlying Path::file_name()
returns a borrow into the Arc<Path>).
Built on top of Self::children_of. Cross-file rules
like dir_contains whose hot path is “does this dir have
any file matching this basename matcher?” use this to skip
the per-call path.file_name().and_then(|s| s.to_str())
extraction and the entries.iter().any(...) scan in one
shot.
Files whose basename isn’t valid UTF-8 are silently dropped from the iterator — same shape as the existing path-string consumers.
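One way the basename iterator could be layered on a children lookup is sketched below. The names `Entry`, `SketchIndex` and `entry` are hypothetical, and the children map is built eagerly in the constructor purely to keep the example short; the index documented here builds it lazily.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: PathBuf,
    pub is_dir: bool,
}

// Hypothetical stand-in for FileIndex (eager map, for brevity only).
pub struct SketchIndex {
    pub entries: Vec<Entry>,
    children: HashMap<PathBuf, Vec<usize>>,
}

impl SketchIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        let mut children: HashMap<PathBuf, Vec<usize>> = HashMap::new();
        for (i, e) in entries.iter().enumerate() {
            if let Some(parent) = e.path.parent() {
                children.entry(parent.to_path_buf()).or_default().push(i);
            }
        }
        Self { entries, children }
    }

    pub fn children_of(&self, dir: &Path) -> &[usize] {
        self.children.get(dir).map(Vec::as_slice).unwrap_or(&[])
    }

    pub fn file_basenames_of<'a>(&'a self, dir: &Path) -> impl Iterator<Item = &'a str> + 'a {
        self.children_of(dir)
            .iter()
            .map(move |&i| &self.entries[i])
            // Subdirectories are skipped; only files contribute a basename.
            .filter(|e| !e.is_dir)
            // Non-UTF-8 basenames fall out here because to_str() returns None.
            .filter_map(|e| e.path.file_name().and_then(|s| s.to_str()))
    }
}

// Small fixture helper for trying the sketch out.
pub fn entry(path: &str, is_dir: bool) -> Entry {
    Entry { path: PathBuf::from(path), is_dir }
}
```

A basename check such as "does this directory have a README.md?" then reduces to `index.file_basenames_of(dir).any(|name| name == "README.md")`, with each yielded `&str` borrowed from the stored path.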
pub fn descendants_of<'a>(&'a self, dir: &'a Path) -> impl Iterator<Item = &'a FileEntry> + 'a
All descendants under dir (files + subdirs), recursive,
depth-first. Built on top of Self::children_of; does
NOT materialise the full subtree as a Vec (the descendants
of the root are every entry, so that would cost O(N) memory
and defeat the lazy design). Yields entries one at a time so
callers can short-circuit cleanly via take_while / find / etc.
Cycle defense: a stack-based walk with no per-iteration
cycle check. The walker (crate::walk) calls
WalkBuilder::follow_links(true) to traverse through
symlinks, and the underlying ignore crate carries
cycle detection — an ancestor-self symlink emits an error
and the walker continues without recursing. The entries
vec is therefore acyclic by construction; adding a per-
step cycle check would cost ~10 ns per yielded entry for
a guarantee that’s already established at walker time.
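A stack-based, lazily yielding walk of this kind might look like the sketch below. It is an illustration under assumptions, not the crate's code: `Entry`, `SketchIndex` and `entry` are hypothetical, the children map is built eagerly for brevity, and like the documented method it performs no cycle check, so it relies on the entries being acyclic.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: PathBuf,
    pub is_dir: bool,
}

// Hypothetical stand-in for FileIndex (eager map, for brevity only).
pub struct SketchIndex {
    pub entries: Vec<Entry>,
    children: HashMap<PathBuf, Vec<usize>>,
}

impl SketchIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        let mut children: HashMap<PathBuf, Vec<usize>> = HashMap::new();
        for (i, e) in entries.iter().enumerate() {
            if let Some(parent) = e.path.parent() {
                children.entry(parent.to_path_buf()).or_default().push(i);
            }
        }
        Self { entries, children }
    }

    pub fn children_of(&self, dir: &Path) -> &[usize] {
        self.children.get(dir).map(Vec::as_slice).unwrap_or(&[])
    }

    pub fn descendants_of<'a>(&'a self, dir: &'a Path) -> impl Iterator<Item = &'a Entry> + 'a {
        // Seed with the direct children, reversed so they pop in entry order.
        let mut stack: Vec<usize> = self.children_of(dir).iter().rev().copied().collect();
        std::iter::from_fn(move || {
            let i = stack.pop()?;
            let entry = &self.entries[i];
            if entry.is_dir {
                // Descend: this directory's children are visited before its siblings.
                stack.extend(self.children_of(&entry.path).iter().rev().copied());
            }
            Some(entry)
        })
    }
}

// Small fixture helper for trying the sketch out.
pub fn entry(path: &str, is_dir: bool) -> Entry {
    Entry { path: PathBuf::from(path), is_dir }
}
```

Only the indices still waiting to be visited sit on the stack, and a caller that stops early (for example with `find`) never expands the directories it did not reach.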
Auto Trait Implementations
impl !Freeze for FileIndex
impl RefUnwindSafe for FileIndex
impl Send for FileIndex
impl Sync for FileIndex
impl Unpin for FileIndex
impl UnsafeUnpin for FileIndex
impl UnwindSafe for FileIndex
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.