KeyHog Scanner: A high-performance, multi-layered secret detection engine.
This crate implements the core scanning logic, combining SIMD pre-filtering, Aho-Corasick literal matching, regex fallback, and ML-based confidence scoring.
Re-exports§
- pub use engine::CompiledScanner;
- pub use error::Result;
- pub use error::ScanError;
- pub use hw_probe::HardwareCaps;
- pub use hw_probe::ScanBackend;
- pub use hw_probe::probe_hardware;
- pub use hw_probe::select_backend;
- pub use types::ScannerConfig;
Modules§
- alphabet_filter
- Alphabet-based bitmask pre-filtering for ultra-fast chunk skipping.
- checksum
- Checksum-aware credential validation.
- compiler
- Logic for compiling detector specifications into an efficient scanning engine.
- confidence
- Confidence scoring: combines multiple signals into a 0.0–1.0 score. Higher confidence means more likely to be a real secret.
- context
- Structural context analysis: understand WHERE in code a potential secret appears.
- decode
- Decode-through scanning: decode base64 and hex strings before pattern matching.
- engine
- Core scanning engine implementation.
- entropy
- Shannon entropy analysis for distinguishing secrets from ordinary text.
- error
- Specialized error types for the scanner engine.
- gpu
- GPU-accelerated batch inference for the MoE classifier via wgpu compute shaders.
- hw_probe
- Hardware capability probing with once-cached results.
- ml_scorer
- ML-based secret scoring with a tiny mixture-of-experts network.
- multiline
- Multi-line string concatenation preprocessor.
- pipeline
- Scanning pipeline logic for different layers (SIMD, AC, Fallback, Entropy).
- resolution
- Match resolution: when multiple detectors match the same region, keep only the most specific, highest-confidence match. Eliminates duplicates.
- types
- Internal types and constants for the scanning engine.
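The signal behind the entropy module can be illustrated with a minimal sketch of Shannon entropy over a byte slice. This is an assumption-laden illustration, not the crate's implementation: the function name `shannon_entropy` and the choice of bits per byte as the unit are hypothetical.

```rust
/// Shannon entropy of a byte slice, in bits per byte (range 0.0 to 8.0).
/// Hypothetical helper for illustration; not the crate's actual API.
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let len = data.len() as f64;
    // H = -sum(p * log2(p)) over the byte values that actually occur.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}
```

Under this definition a run of one repeated byte scores 0.0 and four distinct bytes score 2.0, which is why randomly generated keys tend to score higher than ordinary identifiers or prose.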
Functions§
- compute_line_offsets
- Compute line offsets for a block of text.
- find_companion
- Search for a companion pattern near a primary match.
- floor_char_boundary
- Find the largest char boundary <= index.
- is_within_hex_context
- Check if a match is within a hex-encoded context.
- match_entropy
- Measure the Shannon entropy of a byte slice.
- match_line_number
- Map a byte offset to a line number using pre-computed offsets.
- normalize_chunk_data
- Normalize scannable text by removing evasion characters and handling homoglyphs.
- normalize_scannable_chunk
- Pre-process a chunk of text for scanning.
- should_suppress_known_example_credential
- Check if a credential should be suppressed because it is a known example.
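The offset helpers listed above can be illustrated with a short sketch. The crate's real signatures are not shown on this page, so the names `line_start_offsets`, `line_number_at`, and `floor_boundary` are hypothetical, and 1-based line numbering is an assumption.

```rust
/// Byte offset at which each line starts; line 1 always starts at offset 0.
/// Hypothetical helper; the crate's real signature is not shown here.
fn line_start_offsets(text: &str) -> Vec<usize> {
    let mut offsets = vec![0];
    for (i, b) in text.bytes().enumerate() {
        if b == b'\n' {
            // The next line begins right after the newline byte.
            offsets.push(i + 1);
        }
    }
    offsets
}

/// Map a byte offset to a 1-based line number by binary search
/// over the pre-computed, sorted line starts.
fn line_number_at(offsets: &[usize], byte_offset: usize) -> usize {
    // Counts the line starts that are <= byte_offset.
    offsets.partition_point(|&start| start <= byte_offset)
}

/// Largest char boundary <= index, clamped to the string length.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Offset 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}
```

Computing the offsets once per chunk keeps each match-to-line lookup logarithmic in the number of lines. Rounding an index down to a char boundary keeps slicing around a match from panicking on multi-byte UTF-8 text.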