Struct ScannerConfig 

pub struct ScannerConfig {
    pub max_decode_depth: usize,
    pub validate_decode: bool,
    pub entropy_enabled: bool,
    pub entropy_threshold: f64,
    pub entropy_in_source_files: bool,
    pub entropy_ml_authoritative: bool,
    pub generic_keyword_low_entropy: bool,
    pub ml_enabled: bool,
    pub ml_weight: f64,
    pub min_confidence: f64,
    pub unicode_normalization: bool,
    pub max_decode_bytes: usize,
    pub max_matches_per_chunk: usize,
    pub scan_comments: bool,
    pub multiline: MultilineConfig,
    pub known_prefixes: Vec<String>,
    pub secret_keywords: Vec<String>,
    pub test_keywords: Vec<String>,
    pub placeholder_keywords: Vec<String>,
    pub penalize_test_paths: bool,
}

Configuration for the scanner’s decoding and processing heuristics.

Fields§

§max_decode_depth: usize

Maximum recursion depth for decode-through (base64, hex, etc.)

§validate_decode: bool

Validate decoded strings (e.g. check if decoded base64 is UTF-8)
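
As a rough illustration of how a depth limit and decode validation can interact, the sketch below peels nested hex layers off a string. It is not the crate's decoder: decode_hex and decode_through are hypothetical helpers, only hex is handled, and the exact effect of validate_decode in keyhog is an assumption.

```rust
/// Hypothetical helper: decode an even-length ASCII hex string into bytes.
fn decode_hex(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    if bytes.is_empty() || bytes.len() % 2 != 0 || !bytes.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            let text = std::str::from_utf8(pair).ok()?;
            u8::from_str_radix(text, 16).ok()
        })
        .collect()
}

/// Hypothetical sketch: peel nested hex layers, at most `max_depth` times.
/// Returns every successfully decoded layer, outermost first.
fn decode_through(input: &str, max_depth: usize, validate: bool) -> Vec<String> {
    let mut layers = Vec::new();
    let mut current = input.to_string();
    for _ in 0..max_depth {
        // Stop as soon as the current layer is not decodable hex.
        let Some(bytes) = decode_hex(&current) else { break };
        let decoded = match String::from_utf8(bytes) {
            Ok(text) => text,
            // With validation on, a non-UTF-8 payload ends the chain.
            Err(_) if validate => break,
            // With validation off, keep a lossy rendering of the payload.
            Err(err) => String::from_utf8_lossy(err.as_bytes()).into_owned(),
        };
        layers.push(decoded.clone());
        current = decoded;
    }
    layers
}
```

With max_depth = 1 only the outermost layer is decoded, which is the behaviour a shallow-decode preset relies on.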

§entropy_enabled: bool

Enable entropy-based detection

§entropy_threshold: f64

Threshold for entropy-based detection
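
The field does not say which entropy measure the threshold applies to. A minimal sketch, assuming Shannon entropy in bits per byte (an assumption, as are the helper names shannon_entropy and exceeds_threshold):

```rust
use std::collections::HashMap;

/// Shannon entropy of a byte string, in bits per byte.
/// Assumed measure for illustration; the crate may compute entropy differently.
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for &byte in data {
        *counts.entry(byte).or_insert(0) += 1;
    }
    let len = data.len() as f64;
    counts
        .values()
        .map(|&count| {
            let p = count as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical check: does a candidate reach the configured threshold?
fn exceeds_threshold(candidate: &str, threshold: f64) -> bool {
    shannon_entropy(candidate.as_bytes()) >= threshold
}
```

Under this measure a repeated character scores 0 bits and four distinct characters score 2 bits, so a higher threshold admits only more random-looking strings.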

§entropy_in_source_files: bool

Enable entropy-based detection in source code files

§entropy_ml_authoritative: bool

Route entropy-fallback candidates through the MoE with the model AUTHORITATIVE (no entropy-magnitude floor) instead of the bare entropy heuristic. Mirrors keyhog_core::config::ScanConfig::entropy_ml_authoritative and the CLI --no-entropy-ml-scoring opt-out. No-op unless both entropy_enabled and ml_enabled are set. See apply_ml_batch_scores and scan_entropy_fallback.

§generic_keyword_low_entropy: bool

Admit generic keyword-bridge values (PASSWORD=, *_PASS=, secret:, api_key= …) on the relaxed generic-keyword-secret entropy floor instead of the high generic-secret floor. Mirrors keyhog_core::config::ScanConfig::generic_keyword_low_entropy and the CLI --no-keyword-low-entropy opt-out. The keyword key is the evidence; precision is carried by the MoE + shape filters. See scan_generic_assignments.

§ml_enabled: bool

Enable ML-based confidence scoring

§ml_weight: f64

ML weight for confidence scoring, 0.0-1.0

§min_confidence: f64

Minimum confidence threshold for matches

§unicode_normalization: bool

Enable Unicode normalization

§max_decode_bytes: usize

Maximum bytes for decode-through processing

§max_matches_per_chunk: usize

Maximum matches to collect per chunk before stopping. Prevents OOM on extremely noisy files.

§scan_comments: bool

When true, credentials inside source-code comments are treated as first-class findings (no confidence downgrade, no comment-context multiplier). Mirrors keyhog_core::config::ScanConfig::scan_comments and the CLI’s --scan-comments flag. See that field’s doc for why the default is off.

§multiline: MultilineConfig

Configuration for multiline concatenation

§known_prefixes: Vec<String>

Known secret prefixes used to boost confidence.

§secret_keywords: Vec<String>

Keywords indicating a secret context (e.g. “api_key”, “token”).

§test_keywords: Vec<String>

Keywords indicating a test/mock context (e.g. “test”, “fake”).

§placeholder_keywords: Vec<String>

Keywords indicating a placeholder value (e.g. “change_me”, “todo”).

§penalize_test_paths: bool

Apply test/example path confidence and hard-suppression heuristics. The CLI disables this for --no-suppress-test-fixtures.

Implementations§

impl ScannerConfig

pub const HIGH_PRECISION_MIN_CONFIDENCE: f64 = 0.85

Confidence floor for ScannerConfig::high_precision. Distinct from the canonical ScanConfig::default() floor (0.40) on purpose: precision mode trades recall for a near-zero false-positive rate at mass-scan scale.

pub fn fast() -> Self

pub fn thorough() -> Self

pub fn high_precision() -> Self

High-precision mass-scan preset: minimise false positives at the cost of some recall, for scanning huge corpora where every FP is expensive to triage. Fully offline, with no entropy sweep and only shallow decoding. ML scoring stays enabled, because this preset optimises for precision rather than speed:

  • entropy_enabled = false: generic high-entropy matching is the single largest FP source; precision mode drops it entirely.
  • ml_enabled = true (inherited): ML is the confidence discriminator that lifts genuine secrets over the high floor while leaving FP-shaped tokens below it. Disabling it would crater the scores the 0.85 bar relies on, so precision KEEPS ML (this mode trades recall for precision, not for speed — use --fast when speed is the goal).
  • min_confidence = HIGH_PRECISION_MIN_CONFIDENCE (0.85): combined with the engine’s checksum policy (valid token → floored 0.9, invalid → capped 0.1) and clamped over every detector’s self-declared floor, this bar admits checksum-validated tokens and strong ML-scored findings while dropping checksum-failures and weak-signal matches.
  • max_decode_depth = 1: deep-decoded payloads are a FP source at scale.
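
The interaction between the checksum policy and the 0.85 bar can be sketched as follows. This is an illustration of the numbers quoted above, not the engine's code: the Checksum enum, apply_checksum_policy and admitted are hypothetical names, and the per-detector floor clamping is left out.

```rust
/// Value of ScannerConfig::HIGH_PRECISION_MIN_CONFIDENCE, as documented.
const HIGH_PRECISION_MIN_CONFIDENCE: f64 = 0.85;

/// Hypothetical outcome of a detector's checksum validation.
#[derive(Clone, Copy)]
enum Checksum {
    Valid,
    Invalid,
    NotApplicable,
}

/// Apply the documented policy: valid tokens are floored at 0.9,
/// invalid ones are capped at 0.1, others keep their score.
fn apply_checksum_policy(score: f64, checksum: Checksum) -> f64 {
    match checksum {
        Checksum::Valid => score.max(0.9),
        Checksum::Invalid => score.min(0.1),
        Checksum::NotApplicable => score,
    }
}

/// A finding survives precision mode only if its adjusted score meets the bar.
fn admitted(score: f64, checksum: Checksum) -> bool {
    apply_checksum_policy(score, checksum) >= HIGH_PRECISION_MIN_CONFIDENCE
}
```

A checksum-valid token with a raw score of 0.3 is lifted to 0.9 and admitted, a checksum-failing token scored 0.99 is capped at 0.1 and dropped, and a token without a checksum has to reach 0.85 on its ML-derived score alone.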

penalize_test_paths stays on (the default) to suppress fixture-shaped hits. A --min-confidence override still layers on top of this preset.
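
A minimal sketch of how such a preset and a later override can compose, using a stand-in type rather than the real struct. SketchConfig is hypothetical and carries only five of the fields; the default values for max_decode_depth and entropy_enabled are assumptions, and 0.40 is taken from the ScanConfig::default() floor mentioned above.

```rust
/// Hypothetical stand-in for ScannerConfig with only the fields the preset touches.
#[derive(Debug, Clone, PartialEq)]
struct SketchConfig {
    max_decode_depth: usize,
    entropy_enabled: bool,
    ml_enabled: bool,
    min_confidence: f64,
    penalize_test_paths: bool,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self {
            max_decode_depth: 4,   // assumed default, not from the docs
            entropy_enabled: true, // assumed default, not from the docs
            ml_enabled: true,      // documented as inherited by the preset
            min_confidence: 0.40,  // canonical floor quoted in the docs
            penalize_test_paths: true, // documented as on by default
        }
    }
}

impl SketchConfig {
    const HIGH_PRECISION_MIN_CONFIDENCE: f64 = 0.85;

    /// Preset: override three fields and inherit the rest from the defaults.
    fn high_precision() -> Self {
        Self {
            entropy_enabled: false,
            min_confidence: Self::HIGH_PRECISION_MIN_CONFIDENCE,
            max_decode_depth: 1,
            ..Self::default()
        }
    }

    /// Consuming setter, so an override can be chained onto a preset.
    fn min_confidence(mut self, min_confidence: f64) -> Self {
        self.min_confidence = min_confidence;
        self
    }
}
```

Chaining SketchConfig::high_precision().min_confidence(0.95) keeps the preset's other choices and replaces only the confidence floor, which matches the layering described for --min-confidence.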

pub fn min_confidence(self, min_confidence: f64) -> Self

pub fn sanitise(&mut self)

Clamp every float field into its valid range and replace any NaN with a safe default. Without this, an out-of-range user value such as --min-confidence=-5.0, or a corrupt config TOML supplying min_confidence = nan, would reach the confidence-comparison path unchecked. NaN is the dangerous case: every ordered comparison against NaN is false, so both conf < min_confidence and conf >= min_confidence evaluate to false, and whether a finding is kept or silently dropped then depends on how each call site phrases the check.

Idempotent - sanitising an already-sane config is a no-op. Called inside From<ScanConfig> so any path that constructs a ScannerConfig from a user-influenced source pays this once at config-build time.
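
The clamp-and-replace step can be sketched like this. FloatKnobs and clamp_or_default are hypothetical, the fallback values (0.5 and 0.40) are assumptions, and only two of the float fields are shown; the 0.0-1.0 range for ml_weight comes from its field doc, while the same range for min_confidence is assumed.

```rust
/// Hypothetical stand-in holding two of the float fields.
#[derive(Debug, Clone, Copy)]
struct FloatKnobs {
    ml_weight: f64,
    min_confidence: f64,
}

/// Replace NaN with `default`, otherwise clamp into `[lo, hi]`.
fn clamp_or_default(value: f64, lo: f64, hi: f64, default: f64) -> f64 {
    if value.is_nan() {
        default
    } else {
        value.clamp(lo, hi)
    }
}

impl FloatKnobs {
    /// Sketch of a sanitise pass; ranges and defaults are assumptions.
    fn sanitise(&mut self) {
        self.ml_weight = clamp_or_default(self.ml_weight, 0.0, 1.0, 0.5);
        self.min_confidence = clamp_or_default(self.min_confidence, 0.0, 1.0, 0.40);
    }
}
```

Because every output of clamp_or_default is already a non-NaN value inside the range, running sanitise a second time changes nothing, which is the idempotence property noted above.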

Trait Implementations§

impl Clone for ScannerConfig

fn clone(&self) -> ScannerConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ScannerConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ScannerConfig

fn default() -> Self

Returns the “default value” for a type.

impl From<ScanConfig> for ScannerConfig

fn from(config: ScanConfig) -> Self

Converts to this type from the input type.
