pub struct ScanConfig {
    pub min_confidence: f64,
    pub max_decode_depth: usize,
    pub entropy_enabled: bool,
    pub entropy_in_source_files: bool,
    pub entropy_ml_authoritative: bool,
    pub generic_keyword_low_entropy: bool,
    pub entropy_threshold: f64,
    pub min_secret_len: usize,
    pub max_file_size: u64,
    pub dedup: DedupScope,
    pub ml_enabled: bool,
    pub ml_weight: f64,
    pub unicode_normalization: bool,
    pub validate_decode: bool,
    pub max_decode_bytes: usize,
    pub max_matches_per_chunk: usize,
    pub scan_comments: bool,
    pub known_prefixes: Vec<String>,
    pub secret_keywords: Vec<String>,
    pub test_keywords: Vec<String>,
    pub placeholder_keywords: Vec<String>,
}
Configuration for a scan run.
Fields
min_confidence: f64
Minimum confidence (0.0 to 1.0) required to report a finding.
max_decode_depth: usize
Maximum recursive decoding depth (e.g. Base64(Hex(URL(secret)))).
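As an illustration of what a depth budget on nested decoding means, the following is a minimal sketch and not keyhog's decoder. It peels only hex layers, using the standard library alone; the names decode_hex and peel_layers are hypothetical, and the UTF-8 check loosely mirrors what validate_decode describes.

```rust
/// Decode one hex layer. Returns None unless the input is non-empty,
/// even-length, all hex digits, and the decoded bytes are valid UTF-8.
/// (Hypothetical helper for illustration.)
fn decode_hex(s: &str) -> Option<String> {
    if s.is_empty() || s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Slicing by byte index is safe here: every byte is an ASCII hex digit.
    let bytes = (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16))
        .collect::<Result<Vec<u8>, _>>()
        .ok()?;
    String::from_utf8(bytes).ok()
}

/// Peel encoding layers until nothing decodes or the depth budget is spent.
/// Returns each successfully decoded layer, outermost first.
fn peel_layers(input: &str, max_decode_depth: usize) -> Vec<String> {
    let mut layers = Vec::new();
    let mut current = input.to_string();
    for _ in 0..max_decode_depth {
        match decode_hex(&current) {
            Some(next) => {
                layers.push(next.clone());
                current = next;
            }
            None => break,
        }
    }
    layers
}
```

With a budget of 1, a doubly hex-encoded value is unwrapped only once; a larger budget reaches the innermost text and then stops as soon as a layer no longer decodes.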
entropy_enabled: bool
Whether to enable Shannon entropy analysis for unknown high-entropy strings.
entropy_in_source_files: bool
Whether to enable entropy analysis even in standard source code files.
entropy_ml_authoritative: bool
When the entropy fallback fires, score its candidates through the MoE
with the model AUTHORITATIVE (the entropy magnitude is NOT a confidence
floor) instead of emitting the bare entropy heuristic. Default on: with the
real-distribution-trained model this is a recall-safe precision win. The
model scores real high-entropy secrets high and structured non-secrets
(FQDNs, git SHAs, base64 blobs) low, so false positives fall below the
report floor while genuine recall is preserved. Opt out with
--no-entropy-ml-scoring. No-op when entropy_enabled or ml_enabled is false.
generic_keyword_low_entropy: bool
When the generic keyword bridge (PASSWORD=, *_PASS=, secret:,
api_key= …) extracts a value, admit it on a far lower entropy floor
(the generic-keyword-secret base, ~1.5 bits) than the bare
generic-secret path (2.8/3.2/3.5). The credential KEYWORD in the key is
the evidence; precision is carried by the MoE and shape filters, not by
entropy. Default on: this is what lets keyhog surface the real-world
low-entropy credentials (config passwords, *_PASS= values) that pin
CredData recall near zero when detection is gated on entropy alone. Opt
out with --no-keyword-low-entropy to restore the high-entropy-only
generic gate. No-op unless the keyword bridge fires.
entropy_threshold: f64
Shannon entropy threshold (typical secrets are 4.5+).
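For readers unfamiliar with the quantity being thresholded, here is a minimal sketch of Shannon entropy over a candidate string. It assumes entropy is measured in bits per byte over the byte distribution; whether keyhog counts bytes or characters is not stated here, and the names shannon_entropy and passes_entropy_gate are hypothetical.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s` in bits per byte: H = -sum(p_i * log2(p_i)),
/// where p_i is the relative frequency of each distinct byte.
fn shannon_entropy(s: &str) -> f64 {
    if s.is_empty() {
        return 0.0;
    }
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for b in s.bytes() {
        *counts.entry(b).or_insert(0) += 1;
    }
    let len = s.len() as f64;
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical gate: a candidate passes when its entropy reaches the threshold.
fn passes_entropy_gate(candidate: &str, entropy_threshold: f64) -> bool {
    shannon_entropy(candidate) >= entropy_threshold
}
```

A string of one repeated byte has entropy 0, and a string of 16 distinct bytes used once each has entropy 4, which is why hex-only material sits below a 4.5 threshold under this measure.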
min_secret_len: usize
Minimum length for entropy-based secret detection.
NOTE: not yet read by the live scan. From<ScanConfig> for ScannerConfig
does not carry this field; the entropy length gate currently uses the
engine’s own length constants. Setting it in a deserialized config is a
no-op until a reader is wired in. See the From impl on ScannerConfig for
the canonical list of carried vs uncarried fields.
max_file_size: u64
Maximum file size to scan (bytes). Large files are skipped or sampled.
NOTE: not read here on the live path. The effective cap is set
at the source walker (FilesystemSource::with_max_file_size,
fed from ScanArgs.max_file_size); this field is retained for
the canonical config surface but is not carried into
ScannerConfig.
dedup: DedupScope
Deduplication strategy.
NOTE: not read here on the live path. The effective scope comes
from ScanArgs.dedup and is applied by the verifier via
DedupScope; this field is not carried into ScannerConfig.
ml_enabled: bool
Whether to enable ML-based probabilistic gating.
ml_weight: f64
Weight given to the ML score (0.0 to 1.0).
unicode_normalization: bool
Whether to normalize Unicode characters before scanning.
validate_decode: bool
Whether to validate decoded strings (e.g. that decoded base64 is UTF-8) before recursing into them.
max_decode_bytes: usize
Maximum bytes allowed from recursive decoding. Same field name on
ScannerConfig so From<ScanConfig> is a 1:1 carry, not a rename.
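The carried-versus-uncarried distinction in these notes can be pictured with a reduced sketch. The two structs below are hypothetical stand-ins with only a handful of fields, not the real definitions: max_decode_bytes is carried under the same name and min_secret_len and max_file_size are dropped, as documented, while carrying min_confidence is an assumption made for illustration.

```rust
/// Reduced stand-in for the user-facing config (hypothetical subset).
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct ScanConfig {
    min_confidence: f64,     // assumed carried (illustration only)
    max_decode_bytes: usize, // documented as a 1:1 carry
    min_secret_len: usize,   // documented as NOT carried
    max_file_size: u64,      // documented as NOT carried
}

/// Reduced stand-in for the engine-side config (hypothetical subset).
#[derive(Debug, Clone, PartialEq)]
struct ScannerConfig {
    min_confidence: f64,
    max_decode_bytes: usize,
}

impl From<ScanConfig> for ScannerConfig {
    fn from(cfg: ScanConfig) -> Self {
        // Uncarried fields are simply dropped here, so setting them on
        // ScanConfig has no effect on the resulting ScannerConfig.
        ScannerConfig {
            min_confidence: cfg.min_confidence,
            max_decode_bytes: cfg.max_decode_bytes,
        }
    }
}
```

In this sketch, two ScanConfig values that differ only in min_secret_len or max_file_size convert to equal ScannerConfig values, which is the practical meaning of "no-op" in the notes above.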
max_matches_per_chunk: usize
Maximum matches allowed per chunk to prevent OOM.
scan_comments: bool
When true, credentials inside source-code comments
(//, #, /* */, and similar) get the same confidence treatment as
credentials in regular code. Default false: comment context
downgrades confidence on the theory that examples are the
common case. The CLI exposes this as --scan-comments; it is opt-in
because the rate of example secrets pasted into doc comments
vastly outweighs the rate of real ones.
known_prefixes: Vec<String>
List of common secret prefixes to prioritize.
secret_keywords: Vec<String>
List of keywords that strongly indicate a secret.
test_keywords: Vec<String>
Keywords used in test environments.
placeholder_keywords: Vec<String>
Keywords for placeholders and documentation.
Implementations
impl ScanConfig
pub fn validate(&self) -> Result<(), ConfigError>
Validate the configuration parameters.
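The checks performed by validate are not listed on this page. As a hedged sketch of the kind of check such a method might make, the stand-in below rejects values outside the 0.0 to 1.0 ranges documented for min_confidence and ml_weight. The reduced ScanConfig, the ConfigError variant, and the rule itself are assumptions for illustration, not the crate's actual definitions.

```rust
/// Hypothetical error type; the real ConfigError variants are not shown here.
#[derive(Debug, PartialEq)]
enum ConfigError {
    OutOfRange { field: &'static str, value: f64 },
}

/// Reduced stand-in with only the two range-documented fields.
struct ScanConfig {
    min_confidence: f64,
    ml_weight: f64,
}

impl ScanConfig {
    /// Assumed rule: both fields must lie in 0.0..=1.0. NaN is rejected
    /// because it is not contained in any range.
    fn validate(&self) -> Result<(), ConfigError> {
        let checks = [
            ("min_confidence", self.min_confidence),
            ("ml_weight", self.ml_weight),
        ];
        for (field, value) in checks {
            if !(0.0..=1.0).contains(&value) {
                return Err(ConfigError::OutOfRange { field, value });
            }
        }
        Ok(())
    }
}
```

Calling validate right after deserializing a config, and before handing it to a scan, would surface an out-of-range value as an error instead of letting it silently skew scoring.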
Trait Implementations
impl Clone for ScanConfig
fn clone(&self) -> ScanConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ScanConfig
impl Default for ScanConfig
impl<'de> Deserialize<'de> for ScanConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations
impl Freeze for ScanConfig
impl RefUnwindSafe for ScanConfig
impl Send for ScanConfig
impl Sync for ScanConfig
impl Unpin for ScanConfig
impl UnsafeUnpin for ScanConfig
impl UnwindSafe for ScanConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> DeserializeOwned for T
where
    T: for<'de> Deserialize<'de>,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.