Module types


Shared internal types and constants for the scanning engine.

Re-exports

pub use crate::scanner_config::ScanState;
pub use crate::scanner_config::ScannerConfig;
pub use crate::scanner_config::MlPendingMatch;

Structs

CompiledCompanion
An optional compiled companion pattern for a detector.
CompiledPattern
A compiled entry: one pattern from one detector. The regex is compiled lazily on first use; see LazyRegex.
LazyRegex
A detector pattern whose Regex is compiled on first use, not at load.
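
The compile-on-first-use behavior described for LazyRegex can be sketched with `std::sync::OnceLock`. This is a minimal illustration, not the crate's implementation: `LazyPattern`, `Compiled`, `compile`, and `compile_calls` are hypothetical names, and `compile` stands in for a real `Regex::new` call so the sketch needs no external crates.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a compiled regex (a real scanner would hold a `Regex`).
#[derive(Debug)]
pub struct Compiled {
    pub source: String,
}

/// Counts compile invocations so the laziness is observable.
static COMPILE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Number of times `compile` has run in this process.
pub fn compile_calls() -> usize {
    COMPILE_CALLS.load(Ordering::SeqCst)
}

/// Hypothetical compile step; assumed to be the expensive part.
fn compile(source: &str) -> Compiled {
    COMPILE_CALLS.fetch_add(1, Ordering::SeqCst);
    Compiled { source: source.to_owned() }
}

/// Illustrative lazy wrapper: keeps only the pattern source at load time
/// and compiles it the first time `get` is called.
pub struct LazyPattern {
    source: String,
    cell: OnceLock<Compiled>,
}

impl LazyPattern {
    /// Cheap constructor: no compilation happens here.
    pub fn new(source: &str) -> Self {
        Self { source: source.to_owned(), cell: OnceLock::new() }
    }

    /// Returns the compiled form, compiling on the first call only.
    pub fn get(&self) -> &Compiled {
        self.cell.get_or_init(|| compile(&self.source))
    }

    /// True once `get` has been called at least once.
    pub fn is_compiled(&self) -> bool {
        self.cell.get().is_some()
    }
}
```

With this shape, loading many detectors costs only string copies; a pattern that never becomes a candidate on any input is never compiled.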

Constants

FIRST_CAPTURE_GROUP_INDEX
FIRST_LINE_NUMBER
FULL_MATCH_INDEX
HEX_CONTEXT_RADIUS_CHARS
How many characters around a hex match to inspect for structural context (assignment operators, quotes, keywords).
LARGE_FALLBACK_SCAN_THRESHOLD
MAX_HEX_CONTEXT_SEPARATORS
Maximum non-hex separators (colons, dashes) tolerated within a hex context window before the match is treated as a non-hex string.
MAX_ML_CACHE_BYTES
MAX_ML_CACHE_ENTRIES
MAX_SCAN_CHUNK_BYTES
Maximum bytes scanned in a single chunk. Files larger than this are split into overlapping windows. 1 MiB keeps peak RSS predictable under parallel scanning with rayon (N threads × 1 MiB per chunk = bounded memory).
MAX_WINDOW_DEDUP_ENTRIES
Hard cap on the dedup set to prevent unbounded memory growth when scanning repositories with millions of duplicate credential-like strings.
MIN_FALLBACK_LINE_LENGTH
Minimum line length considered for fallback pattern scanning. Lines shorter than 8 bytes cannot contain a credential prefix plus a meaningful secret.
MIN_HEX_CONTEXT_DIGITS
Minimum hex digits required in the context window around a match to trigger hex-aware false-positive suppression.
MIN_HEX_DIGITS_IN_MATCH
MIN_HEX_MATCH_LEN
Minimum length for a standalone hex string to qualify as a potential secret. Shorter hex runs (e.g., CSS colors like #ff00ff) are too common.
MIN_LITERAL_PREFIX_CHARS
Minimum Aho-Corasick (AC) literal prefix length. Shorter prefixes (e.g., “1”, “x”, “_”) match too many positions and degrade Aho-Corasick throughput.
ML_CONTEXT_RADIUS_LINES
PREVIOUS_LINE_DISTANCE
REGEX_SIZE_LIMIT_BYTES
Default per-regex AST + lazy-DFA-cache size limit. 1 MiB is large enough for complex detectors while preventing pathological patterns from consuming unbounded memory during regex compilation.
WINDOW_OVERLAP_BYTES
Overlap between adjacent scan windows when a file exceeds MAX_SCAN_CHUNK_BYTES. Must be larger than the longest secret the scanner can detect to avoid missing secrets that straddle a chunk boundary. 128 KiB covers PEM-encoded RSA-8192 keys, large JWTs, and multi-line concatenated secrets with generous margin.
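
The interaction between MAX_SCAN_CHUNK_BYTES and WINDOW_OVERLAP_BYTES can be sketched as follows. The constant values (1 MiB and 128 KiB) are taken from the descriptions above; the `scan_windows` function and its exact stepping rule are assumptions made for illustration, not the crate's actual windowing code.

```rust
use std::ops::Range;

/// 1 MiB, as stated in the MAX_SCAN_CHUNK_BYTES description.
pub const MAX_SCAN_CHUNK_BYTES: usize = 1024 * 1024;
/// 128 KiB, as stated in the WINDOW_OVERLAP_BYTES description.
pub const WINDOW_OVERLAP_BYTES: usize = 128 * 1024;

/// Hypothetical helper: splits `len` bytes into windows of at most `chunk`
/// bytes, where each window after the first starts `overlap` bytes before
/// the end of the previous one.
pub fn scan_windows(len: usize, chunk: usize, overlap: usize) -> Vec<Range<usize>> {
    assert!(chunk > 0 && overlap < chunk, "overlap must be smaller than the chunk size");

    // Inputs that fit in one chunk are scanned whole.
    if len <= chunk {
        return vec![0..len];
    }

    // Each window advances by the non-overlapping part of the chunk.
    let step = chunk - overlap;
    let mut windows = Vec::new();
    let mut start = 0;
    loop {
        let end = (start + chunk).min(len);
        windows.push(start..end);
        if end == len {
            break;
        }
        start += step;
    }
    windows
}
```

Under this scheme any match no longer than the overlap lies entirely inside at least one window, which is why the overlap has to exceed the longest detectable secret. Matches that fall in an overlap region are seen twice, so the results need deduplication afterwards.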

Functions

regex_dfa_limit
The effective per-regex DFA size limit: the override if set, else the compiled default REGEX_SIZE_LIMIT_BYTES.
set_regex_dfa_limit
Override the per-regex DFA size limit for this process. Call before scanning. 0 resets to the compiled default. Tier-A config knob (default → TOML → CLI).
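
The override-or-default behavior of these two functions can be sketched with a process-wide atomic. The function names and the "0 resets to the default" rule come from the descriptions above; the `usize` signatures, the `DFA_LIMIT_OVERRIDE` static, and the memory ordering are assumptions, and the 1 MiB default is the value given for REGEX_SIZE_LIMIT_BYTES.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Compiled default: 1 MiB, per the REGEX_SIZE_LIMIT_BYTES description.
pub const REGEX_SIZE_LIMIT_BYTES: usize = 1024 * 1024;

/// Hypothetical process-wide override; 0 means "no override set".
static DFA_LIMIT_OVERRIDE: AtomicUsize = AtomicUsize::new(0);

/// Sets the override for this process. Passing 0 clears it, so later
/// reads fall back to the compiled default.
pub fn set_regex_dfa_limit(bytes: usize) {
    DFA_LIMIT_OVERRIDE.store(bytes, Ordering::Relaxed);
}

/// Returns the override if one is set, otherwise the compiled default.
pub fn regex_dfa_limit() -> usize {
    match DFA_LIMIT_OVERRIDE.load(Ordering::Relaxed) {
        0 => REGEX_SIZE_LIMIT_BYTES,
        n => n,
    }
}
```

Using 0 as the "unset" sentinel keeps the state in a single atomic word. In a design like this, the advice to call the setter before scanning matters because a regex already compiled under the old limit would not pick up a later change.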

Type Aliases

ScannerPreprocessedText