Module engine


Core scan execution engine.

Re-exports

boundary: Cross-chunk window-boundary secret reassembly.
gpu_decode_scan: Fused GPU decode→scan: base64 and hex decode + Aho-Corasick match in a single GPU dispatch.
gpu_program_fusion: GPU program fusion - collapses multiple sequential vyre Program dispatches into a single fused program for single-GPU-dispatch execution.
gpu_regex_dfa: GPU RegexDfaPipeline - regex sets compiled through DFA subset construction into O(1)/byte Aho-Corasick scanning.
segment_attribution: Attribute coalesced scanner matches back to logical input segments.
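
The idea behind segment_attribution (mapping a match offset in the coalesced buffer back to the logical segment it came from) can be sketched as below. This is only an illustration under stated assumptions: `Segment`, `coalesce`, and `attribute` are hypothetical names invented for the example, and the module's real types and signatures may differ.

```rust
/// Hypothetical record of where one logical segment landed in the coalesced buffer.
#[derive(Debug, PartialEq)]
struct Segment {
    start: usize, // offset of the segment's first byte in the coalesced buffer
    len: usize,   // segment length in bytes
}

/// Concatenate chunks into one contiguous buffer and remember each chunk's placement.
fn coalesce(chunks: &[&[u8]]) -> (Vec<u8>, Vec<Segment>) {
    let mut buf = Vec::new();
    let mut segs = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        segs.push(Segment { start: buf.len(), len: chunk.len() });
        buf.extend_from_slice(chunk);
    }
    (buf, segs)
}

/// Map a global match offset to (segment index, offset inside that segment).
/// Returns None when the offset lies past the end of the coalesced buffer.
fn attribute(segs: &[Segment], offset: usize) -> Option<(usize, usize)> {
    // Segment starts are non-decreasing, so a binary search finds the last
    // segment whose start is <= offset.
    let idx = segs.partition_point(|s| s.start <= offset).checked_sub(1)?;
    let seg = &segs[idx];
    (offset < seg.start + seg.len).then(|| (idx, offset - seg.start))
}
```

Because the lookup is a binary search over segment starts, attribution cost grows logarithmically with the number of segments rather than linearly.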

AcConstPacks: Cached per-process AC-kernel input constants - pre-packed LE byte streams for the four DFA-shape inputs the AC bounded-ranges kernel reads on every dispatch. Separate from GpuConstPacks because the AC kernel binds different fields (dfa.transitions, dfa.output_offsets, dfa.output_records, pattern_lengths).
CompiledScanner
GpuConstPacks: Cached per-process GPU input constants - pre-packed LE byte streams for the four pattern-shape inputs the GpuLiteralSet kernel reads on every dispatch. Filled on first scan, borrowed thereafter.
LiteralMatch: A byte-range match emitted by vyre scanning engines.
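
The "pre-packed LE byte streams ... filled on first scan, borrowed thereafter" pattern described for GpuConstPacks and AcConstPacks might look roughly like the following. It is a minimal sketch, not the crate's code: `ConstPack`, `pack_le`, and `get_or_pack` are hypothetical names, and the assumption that the inputs are `u32` words is mine.

```rust
use std::sync::OnceLock;

/// Pack u32 words into a little-endian byte stream.
fn pack_le(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

/// Hypothetical stand-in for one cached constant input:
/// packed on first use, borrowed on every later use.
struct ConstPack {
    bytes: OnceLock<Vec<u8>>,
}

impl ConstPack {
    const fn new() -> Self {
        Self { bytes: OnceLock::new() }
    }

    /// Returns the cached bytes, packing `words` only if nothing is cached yet.
    /// Later calls ignore `words` and return the first packing.
    fn get_or_pack(&self, words: &[u32]) -> &[u8] {
        self.bytes.get_or_init(|| pack_le(words))
    }
}
```

The point of the cache is that the packing work happens once per process instead of once per dispatch; every later scan only borrows the already-packed bytes.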

GpuInitPolicy
GpuPhase1Output: Two-phase output of CompiledScanner::scan_coalesced_gpu_phase1.
MlScoreResult

AC_GPU_MAX_MATCHES_PER_DISPATCH: Output buffer cap for the AC GPU kernel, per shard dispatch.
MEGASCAN_INPUT_LEN: Backwards-compatible alias preserved for any external consumer that referenced the old constant by name. New code should call megascan_input_len so the host’s GPU VRAM scales the dispatch.
MEGASCAN_INPUT_LEN_DEFAULT: Maximum input buffer length the MegaScan RulePipeline is pre-compiled for. Chosen to match the orchestrator’s BATCH_BYTES_BUDGET so any normal coalesced batch fits the pre-built pipeline without needing recompile-per-batch. Batches larger than this fall back to the literal-set path.
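
The size-based fallback described for MEGASCAN_INPUT_LEN_DEFAULT can be pictured as a simple length check. The sketch below is illustrative only: `ScanPath` and `choose_path` are hypothetical names, and the engine's actual selection logic may consider more than batch length.

```rust
/// Hypothetical label for the two scan paths mentioned in the docs.
#[derive(Debug, PartialEq)]
enum ScanPath {
    RulePipeline, // pre-compiled MegaScan pipeline
    LiteralSet,   // fallback literal-set path
}

/// Pick the pre-built pipeline when the batch fits the length it was compiled for;
/// otherwise fall back to the literal-set path.
fn choose_path(batch_len: usize, precompiled_len: usize) -> ScanPath {
    if batch_len <= precompiled_len {
        ScanPath::RulePipeline
    } else {
        ScanPath::LiteralSet
    }
}
```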

build_rule_pipeline: Compile a RulePipeline (vyre’s regex multimatch path) for the given detector regex sources, sized for input_len bytes. Uses vyre’s regex_compile::build_rule_pipeline_from_regex so each pattern is parsed via regex_syntax (with unicode(false) / utf8(false) - ASCII byte automaton) and lowered to the same transition + epsilon tables RulePipeline::scan expects.
coalesce_chunks: Build the contiguous GPU input buffer from chunks.
floor_char_boundary
line_number_for_offset
megascan_input_len: VRAM-adaptive megascan input length. Bigger buffers mean fewer device dispatches per multi-TB scan; each kernel launch is a fixed ~50-300 µs cost regardless of payload, so doubling the input halves dispatch overhead. Capped by host VRAM (input + transition tables + match output must fit) and by a 1 GiB upper bound so the pre-compile time stays bounded.
next_window_offset
record_window_match
rule_pipeline_cached: Compile or load a RulePipeline for the given regex set. The first call consults the on-disk cache; a miss recompiles and re-caches. Returns Err when the regex compile itself fails (state-cap overflow or unsupported regex syntax); in that case the caller is expected to log the error and fall back to the literal-set GPU dispatch.
window_chunk
window_end_offset
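
The sizing rule described for megascan_input_len (fit input, transition tables, and match output in VRAM, with a 1 GiB ceiling) and its effect on dispatch count can be worked through with a small sketch. The function names, parameters, and the exact budget formula here are assumptions made for illustration; the real function may compute its budget differently.

```rust
const GIB: u64 = 1 << 30;

/// Hypothetical sizing rule: whatever VRAM is left after the transition tables
/// and the match output buffer, capped at 1 GiB.
fn input_len_for(vram_bytes: u64, table_bytes: u64, match_out_bytes: u64) -> u64 {
    vram_bytes
        .saturating_sub(table_bytes)
        .saturating_sub(match_out_bytes)
        .min(GIB)
}

/// Number of dispatches needed to cover `total_bytes` (ceiling division).
fn dispatches(total_bytes: u64, input_len: u64) -> u64 {
    assert!(input_len > 0, "input_len must be non-zero");
    (total_bytes + input_len - 1) / input_len
}
```

Under these assumptions, a 1 TiB scan needs 2048 dispatches with a 512 MiB input buffer and 1024 with a 1 GiB buffer, so the fixed per-launch cost is paid half as often.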