Struct ScanState 

pub struct ScanState {
    pub matches: BinaryHeap<Reverse<RawMatch>>,
    pub credential_interner: HashSet<Arc<str>>,
    pub metadata_interner: HashSet<Arc<str>>,
    pub static_intern: Option<Arc<StaticInterner>>,
    pub ml_score_cache: HashMap<(String, String), f64>,
    pub ml_cache_order: VecDeque<(String, String)>,
    pub ml_cache_bytes: usize,
    pub ml_pending: Vec<MlPendingMatch>,
}

Internal state for a single scan operation (tracks matches and ML cache).

Fields§

§matches: BinaryHeap<Reverse<RawMatch>>

Matches collected for this chunk, prioritized by confidence. Each entry is wrapped in Reverse so the BinaryHeap behaves as a min-heap, which makes the LOWEST-confidence match the cheapest one to pop.

§credential_interner: HashSet<Arc<str>>

Interner for credentials found in this chunk to save memory on duplicates.

§metadata_interner: HashSet<Arc<str>>

String cache for detector metadata. Uses HashSet<Arc<str>> rather than HashMap<String, Arc<str>> so that a cache miss allocates ONLY the Arc<str>. The prior shape also allocated a String to serve as the HashMap key, paying twice for a single dedup slot. HashSet::get(s) accepts a plain &str because Arc<str>: Borrow<str>, so hits allocate nothing.

Only dynamic strings reach this set now: the scanner-wide StaticInterner (vyre CHD perfect hash) handles every (detector_id, detector_name, service, source_type) lookup without per-scan allocation.
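
The allocation argument above can be illustrated with a minimal sketch. The `Interner` type and its `intern` method are hypothetical names for this example and are not the crate's implementation; only the `HashSet<Arc<str>>` shape and the `Borrow<str>` lookup come from the description.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-alone interner, for illustration only.
#[derive(Default)]
pub struct Interner {
    set: HashSet<Arc<str>>,
}

impl Interner {
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        // Hit: `Arc<str>: Borrow<str>` lets the set be probed with a plain
        // `&str`, so no allocation happens on this path.
        if let Some(existing) = self.set.get(s) {
            return Arc::clone(existing);
        }
        // Miss: a single allocation for the `Arc<str>`, which serves as the
        // stored element (there is no separate `String` key).
        let fresh: Arc<str> = Arc::from(s);
        self.set.insert(Arc::clone(&fresh));
        fresh
    }

    /// Number of distinct strings currently stored.
    pub fn len(&self) -> usize {
        self.set.len()
    }
}
```

Interning the same text twice returns two handles to the same allocation, which is where the memory saving on duplicates comes from.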

§static_intern: Option<Arc<StaticInterner>>

Optional reference to the scanner’s frozen static-string interner. When Some, intern_metadata checks here first before falling through to the per-scan metadata_interner. Lock-free on read so concurrent rayon workers share one instance without contention.

§ml_score_cache: HashMap<(String, String), f64>

§ml_cache_order: VecDeque<(String, String)>

§ml_cache_bytes: usize

§ml_pending: Vec<MlPendingMatch>

Detector matches queued for batch ML scoring at the end of the scan.
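
The ml_score_cache, ml_cache_order, and ml_cache_bytes fields carry no description on this page. Their types are consistent with a score cache that records insertion order and a running byte count, so one plausible reading is a size-bounded cache with oldest-first eviction. The sketch below shows that pattern only as an assumption: `ScoreCache`, the `budget` parameter, and the cost model (key bytes only) are all hypothetical and may not match the crate's behavior.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical byte-bounded cache with oldest-first eviction.
pub struct ScoreCache {
    scores: HashMap<(String, String), f64>,
    order: VecDeque<(String, String)>,
    bytes: usize,
    budget: usize, // assumed limit; not a field of ScanState
}

impl ScoreCache {
    pub fn new(budget: usize) -> Self {
        Self { scores: HashMap::new(), order: VecDeque::new(), bytes: 0, budget }
    }

    /// Assumed cost model: the byte length of both key strings.
    fn cost(key: &(String, String)) -> usize {
        key.0.len() + key.1.len()
    }

    pub fn get(&self, key: &(String, String)) -> Option<f64> {
        self.scores.get(key).copied()
    }

    pub fn insert(&mut self, key: (String, String), score: f64) {
        // Updating an existing key changes neither the order nor the size.
        if let Some(slot) = self.scores.get_mut(&key) {
            *slot = score;
            return;
        }
        self.bytes += Self::cost(&key);
        self.order.push_back(key.clone());
        self.scores.insert(key, score);
        // Evict the oldest entries until the cache fits the budget again.
        while self.bytes > self.budget {
            let Some(oldest) = self.order.pop_front() else { break };
            self.bytes -= Self::cost(&oldest);
            self.scores.remove(&oldest);
        }
    }

    pub fn len(&self) -> usize {
        self.scores.len()
    }

    pub fn bytes(&self) -> usize {
        self.bytes
    }
}
```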

Implementations§

impl ScanState

pub fn intern_credential(&mut self, s: &str) -> Arc<str>

Intern a credential string, returning an Arc<str>.

pub fn intern_metadata(&mut self, s: &str) -> Arc<str>

Intern a metadata string (detector_id, name, service, source_type, …).

Lookup order:

  1. Scanner-wide StaticInterner (vyre CHD perfect hash) for detector metadata that’s frozen at scanner construction - O(1), no allocation, no lock contention.
  2. Per-scan metadata_interner HashSet for dynamic strings (file paths, commit SHAs, author names, dates).
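
The two-step lookup order above can be sketched as follows. `FrozenTable` is a hypothetical stand-in for StaticInterner that uses an ordinary `HashMap` instead of a perfect hash, and `PerScan` is a hypothetical stand-in for ScanState; only the order of the two lookups is taken from the description.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

/// Stand-in for the frozen scanner-wide interner (illustration only).
pub struct FrozenTable {
    entries: HashMap<&'static str, Arc<str>>,
}

impl FrozenTable {
    pub fn new(names: &[&'static str]) -> Self {
        let entries = names.iter().map(|&n| (n, Arc::<str>::from(n))).collect();
        Self { entries }
    }

    /// Read-only lookup; takes `&self`, so it can be shared through an `Arc`.
    pub fn get(&self, s: &str) -> Option<Arc<str>> {
        self.entries.get(s).cloned()
    }
}

/// Stand-in for the per-scan state (illustration only).
#[derive(Default)]
pub struct PerScan {
    static_intern: Option<Arc<FrozenTable>>,
    dynamic: HashSet<Arc<str>>,
}

impl PerScan {
    pub fn with_static(table: Arc<FrozenTable>) -> Self {
        Self { static_intern: Some(table), ..Self::default() }
    }

    pub fn intern_metadata(&mut self, s: &str) -> Arc<str> {
        // Step 1: consult the frozen table, if one was supplied.
        if let Some(hit) = self.static_intern.as_ref().and_then(|t| t.get(s)) {
            return hit;
        }
        // Step 2: fall through to the per-scan set for dynamic strings.
        if let Some(existing) = self.dynamic.get(s) {
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        self.dynamic.insert(Arc::clone(&fresh));
        fresh
    }

    /// Number of strings that ended up in the per-scan set.
    pub fn dynamic_len(&self) -> usize {
        self.dynamic.len()
    }
}
```

In this sketch a string known to the frozen table never enters the per-scan set, so the set grows only with dynamic values such as file paths.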

pub fn with_static_intern(intern: Arc<StaticInterner>) -> Self

Construct a ScanState that consults the scanner-wide static interner first. Use this from any code path that has a &CompiledScanner in scope; stand-alone unit tests can fall back to default().

pub fn push_match(&mut self, m: RawMatch, limit: usize)

Push a match to the state, maintaining priority and capacity. High-confidence secrets will displace lower-confidence findings.
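
A capacity-bounded push over a `BinaryHeap<Reverse<_>>` might look like the sketch below. `Finding` is a hypothetical stand-in for RawMatch, with an integer `confidence` compared first so that `Ord` can be derived; the push-then-pop strategy is one common way to keep the top `limit` entries and is not confirmed to be what push_match does.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for `RawMatch`; derived ordering compares
/// `confidence` first.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Finding {
    pub confidence: u32, // assumed integer score
    pub detector: &'static str,
}

/// Keep at most `limit` findings, preferring higher confidence.
pub fn push_bounded(heap: &mut BinaryHeap<Reverse<Finding>>, m: Finding, limit: usize) {
    heap.push(Reverse(m));
    // With `Reverse`, `pop` removes the LOWEST-confidence entry, so an
    // over-capacity heap sheds its weakest finding (possibly the new one).
    if heap.len() > limit {
        heap.pop();
    }
}
```

Because the weakest entry always sits at the top of the heap, a new high-confidence finding displaces it in O(log n) without scanning the other entries.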

pub fn into_matches(self) -> Vec<RawMatch>

Drain all matches into a sorted vector, deduplicating identical findings (same detector, same credential, same offset). Two engines can produce the same finding for the same pattern: for example, ac_map’s literal hit and the homoglyph fallback variant both fire on plain ASCII, because the homoglyph char-class includes the original char. The caller wants only one of them in the result set.
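
The drain-and-dedup step can be sketched as below, again with a hypothetical `Finding` in place of RawMatch. The dedup key (detector, credential, offset) comes from the description; the descending-confidence sort order and the choice to keep the higher-confidence duplicate are assumptions of this example.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashSet};

/// Hypothetical stand-in for `RawMatch`; `confidence` is compared first.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Finding {
    pub confidence: u32, // assumed integer score
    pub detector: String,
    pub credential: String,
    pub offset: usize,
}

/// Drain the heap into a vector sorted by descending confidence, keeping
/// the first finding seen for each (detector, credential, offset) key.
pub fn into_sorted_unique(heap: BinaryHeap<Reverse<Finding>>) -> Vec<Finding> {
    let mut seen = HashSet::new();
    // `into_sorted_vec` is ascending in `Reverse<Finding>`, which means
    // descending in `Finding`, so the strongest findings come first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse(f)| f)
        // `insert` returns false for a key that is already present.
        .filter(|f| seen.insert((f.detector.clone(), f.credential.clone(), f.offset)))
        .collect()
}
```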

Trait Implementations§

impl Default for ScanState

fn default() -> ScanState

Returns the “default value” for a type.
