Struct FileIndex 

pub struct FileIndex {
    pub entries: Vec<FileEntry>,
    /* private fields */
}
The indexed result of one filesystem walk. All rules share this index — the walk happens once per alint check invocation.

path_set is a lazy HashSet<Arc<Path>> over file entries. It is built once, on the first call to FileIndex::contains_file or FileIndex::file_path_set, and reused across all subsequent lookups. Cross-file rules that ask “does this exact path exist?” (most importantly file_exists instantiated by for_each_dir) hit the set instead of doing an O(N) linear scan over every entry. With N = 1M files and D = 5,000 packages in a monorepo, this turns the fan-out shape from O(D × N) = 5 × 10⁹ operations into O(D) = 5,000 lookups.

parent_to_children (v0.9.8) is a second lazy index: for each directory, the indices of its direct children in entries. Cross-file rules that previously scanned all entries per matched dir (dir_only_contains, dir_contains) now look up children_of(dir) in O(1) instead of doing a per-dir O(N) scan. This closes the v0.9.5 → v0.9.8 cliff: at 1M files / 5K dirs, dir_only_contains drops from 5 billion path-parent comparisons to ~1 million.

Fields§

§entries: Vec<FileEntry>

Implementations§

impl FileIndex

pub fn from_entries(entries: Vec<FileEntry>) -> Self

Construct a FileIndex from raw entries. Equivalent to FileIndex { entries, ..Default::default() } but spelled out so test/bench fixtures don’t have to know about the internal lazy fields (such as path_set).

pub fn files(&self) -> impl Iterator<Item = &FileEntry>

pub fn dirs(&self) -> impl Iterator<Item = &FileEntry>

pub fn total_size(&self) -> u64

pub fn file_path_set(&self) -> &HashSet<Arc<Path>>

Get (lazily building on first call) the hash-indexed set of all file (non-dir) paths in this index. Subsequent calls return the cached set. Concurrent first calls are safe (OnceLock ensures a single initialiser wins).
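A minimal sketch of the lazy-set pattern described here, not the crate's actual code. It assumes a simplified stand-in for FileEntry with only path and is_dir fields (the real field set is not shown on this page); LazyIndex and Entry are hypothetical names. Only OnceLock, HashSet and Arc are real standard-library types.

```rust
use std::collections::HashSet;
use std::path::Path;
use std::sync::{Arc, OnceLock};

// Hypothetical stand-in for FileEntry; the real struct may carry more fields.
pub struct Entry {
    pub path: Arc<Path>,
    pub is_dir: bool,
}

#[derive(Default)]
pub struct LazyIndex {
    pub entries: Vec<Entry>,
    // Empty until the first lookup. OnceLock guarantees that exactly one
    // initialiser runs, even if several threads race on the first call.
    path_set: OnceLock<HashSet<Arc<Path>>>,
}

impl LazyIndex {
    pub fn from_entries(entries: Vec<Entry>) -> Self {
        Self { entries, ..Default::default() }
    }

    // Builds the set on the first call, then returns the cached set.
    pub fn file_path_set(&self) -> &HashSet<Arc<Path>> {
        self.path_set.get_or_init(|| {
            self.entries
                .iter()
                .filter(|e| !e.is_dir)
                .map(|e| Arc::clone(&e.path)) // clones the pointer, not the path bytes
                .collect()
        })
    }

    // Hash lookup; Arc<Path> borrows as Path, so no allocation is needed.
    pub fn contains_file(&self, rel: &Path) -> bool {
        self.file_path_set().contains(rel)
    }
}
```

Storing Arc<Path> keys lets the set share path storage with the entries, which is one plausible reason for that key type.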

pub fn contains_file(&self, rel: &Path) -> bool

O(1) “does this exact relative path exist as a file?” query. Triggers the lazy build of the path set on first call. Use this instead of iterating files() whenever a rule needs to check a fully-qualified path — at scale, the hash lookup is several orders of magnitude faster.

pub fn find_file(&self, rel: &Path) -> Option<&FileEntry>

Find a file entry by its exact relative path. Uses the lazy path set for the existence check, then re-scans entries linearly to return the matching &FileEntry (entries are pinned, but the set stores Arc<Path> keys, not direct entry references). A miss therefore costs O(1), while a hit still costs O(N). Most callers want the boolean answer, so prefer FileIndex::contains_file.
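The two-step lookup can be sketched as a free function. This is an illustration under assumptions, not the crate's implementation: Entry is a hypothetical stand-in for FileEntry, and the set is passed in explicitly rather than built lazily.

```rust
use std::collections::HashSet;
use std::path::Path;
use std::sync::Arc;

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: Arc<Path>,
    pub is_dir: bool,
}

pub fn find_file<'a>(
    entries: &'a [Entry],
    file_paths: &HashSet<Arc<Path>>,
    rel: &Path,
) -> Option<&'a Entry> {
    // Step 1: hash probe. A missing path returns here without any scan.
    if !file_paths.contains(rel) {
        return None;
    }
    // Step 2: the set only stores keys, so a hit still needs a linear
    // scan to recover the entry itself.
    entries.iter().find(|e| !e.is_dir && &*e.path == rel)
}
```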

pub fn children_of(&self, dir: &Path) -> &[usize]

Direct children of dir, as indices into Self::entries. Triggers the lazy build of the parent → children map on the first call for any directory.

Returns an empty slice when dir has no children or isn’t in the index. Indices are stable across the lifetime of &self; dereference them at the call site via &self.entries[i].

Build cost: O(N) (one pass over entries, one HashMap insert per entry). Lookup cost: O(1) HashMap probe. Replaces the O(D × N) for dir in dirs() { for file in files() { is_direct_child(file, dir) ... } } shape that dir_only_contains and dir_contains previously used. At 1M files × 5K matched dirs, that’s a 5,000× reduction in total comparison count.
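The one-pass build and the O(1) probe might look roughly like the sketch below. It works on plain paths instead of FileEntry values, and build_parent_index and children_of are hypothetical free functions named for illustration only.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// One pass over all paths: each path is pushed into its parent's bucket.
// A top-level relative path such as "README.md" has the empty path ""
// as its parent, so root children are found under Path::new("").
pub fn build_parent_index(paths: &[PathBuf]) -> HashMap<&Path, Vec<usize>> {
    let mut map: HashMap<&Path, Vec<usize>> = HashMap::new();
    for (i, p) in paths.iter().enumerate() {
        if let Some(parent) = p.parent() {
            map.entry(parent).or_default().push(i);
        }
    }
    map
}

// Single hash probe; unknown or childless directories give an empty slice.
pub fn children_of<'m>(map: &'m HashMap<&Path, Vec<usize>>, dir: &Path) -> &'m [usize] {
    match map.get(dir) {
        Some(v) => v.as_slice(),
        None => &[],
    }
}
```

With this shape, a rule that visits D directories performs D probes plus one visit per direct child, rather than D full scans of all N entries.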

pub fn file_basenames_of<'a>(&'a self, dir: &Path) -> impl Iterator<Item = &'a str> + 'a

Direct file children’s basenames under dir. Filters out subdirectories — pure file basenames only. Returns an iterator borrowing into entries[i].path for each match; no allocation per call (the underlying Path::file_name() returns a borrow into the Arc<Path>).

Built on top of Self::children_of. Cross-file rules like dir_contains whose hot path is “does this dir have any file matching this basename matcher?” use this to skip the per-call path.file_name().and_then(|s| s.to_str()) extraction and the entries.iter().any(...) scan in one shot.

Files whose basename isn’t valid UTF-8 are silently dropped from the iterator — same shape as the existing path-string consumers.
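A hedged sketch of how such an iterator could be layered on a children slice. Entry is a hypothetical stand-in for FileEntry, and file_basenames takes the child indices as a parameter instead of calling children_of itself.

```rust
use std::path::Path;
use std::sync::Arc;

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: Arc<Path>,
    pub is_dir: bool,
}

pub fn file_basenames<'a>(
    entries: &'a [Entry],
    children: &'a [usize],
) -> impl Iterator<Item = &'a str> + 'a {
    children
        .iter()
        .map(move |&i| &entries[i])
        // Subdirectories are skipped: file basenames only.
        .filter(|e| !e.is_dir)
        // file_name() borrows from the stored path; to_str() returns None
        // for non-UTF-8 names, which filter_map silently drops.
        .filter_map(|e| e.path.file_name()?.to_str())
}
```

A caller checking for a specific file can then write something like file_basenames(entries, children).any(|n| n == "Cargo.toml") and stop at the first match.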

pub fn descendants_of<'a>(&'a self, dir: &'a Path) -> impl Iterator<Item = &'a FileEntry> + 'a

All descendants under dir (files + subdirs), recursive, depth-first. Built on top of Self::children_of; does NOT materialise the full subtree as a Vec (for the root, the descendants are every entry, so a Vec would cost O(N) memory and defeat the lazy design). Yields entries one at a time so callers can short-circuit cleanly via take_while / find / etc.

Cycle defense: a stack-based walk with no per-iteration cycle check. The walker (crate::walk) calls WalkBuilder::follow_links(true) to traverse through symlinks, and the underlying ignore crate carries cycle detection: an ancestor-self symlink emits an error and the walker continues without recursing. The entries vec is therefore acyclic by construction; adding a per-step cycle check would cost ~10 ns per yielded entry for a guarantee that’s already established at walker time.
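A stack-based, lazily yielding walk of this kind can be sketched as follows. Tree and Entry are hypothetical simplifications (the map is built eagerly here and keyed by PathBuf), and the traversal order shown, pre-order with children in index order, is an assumption of the sketch rather than a documented guarantee.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for FileEntry.
pub struct Entry {
    pub path: PathBuf,
    pub is_dir: bool,
}

pub struct Tree {
    pub entries: Vec<Entry>,
    children: HashMap<PathBuf, Vec<usize>>,
}

impl Tree {
    pub fn new(entries: Vec<Entry>) -> Self {
        let mut children: HashMap<PathBuf, Vec<usize>> = HashMap::new();
        for (i, e) in entries.iter().enumerate() {
            if let Some(parent) = e.path.parent() {
                children.entry(parent.to_path_buf()).or_default().push(i);
            }
        }
        Self { entries, children }
    }

    fn children_of(&self, dir: &Path) -> &[usize] {
        match self.children.get(dir) {
            Some(v) => v.as_slice(),
            None => &[],
        }
    }

    // Depth-first walk with an explicit stack of entry indices. Nothing is
    // collected up front, so callers can stop early. There is no cycle
    // check: the sketch assumes the entries form a tree.
    pub fn descendants_of<'a>(&'a self, dir: &Path) -> impl Iterator<Item = &'a Entry> + 'a {
        // Reversed so that pop() visits children in index order.
        let mut stack: Vec<usize> = self.children_of(dir).iter().rev().copied().collect();
        std::iter::from_fn(move || {
            let i = stack.pop()?;
            let entry = &self.entries[i];
            if entry.is_dir {
                stack.extend(self.children_of(&entry.path).iter().rev().copied());
            }
            Some(entry)
        })
    }
}
```

Because the iterator is lazy, a call such as tree.descendants_of(dir).find(|e| !e.is_dir) touches only the entries visited before the first file.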

Trait Implementations§

impl Debug for FileIndex

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for FileIndex

fn default() -> FileIndex

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.