Module gpu_decode_scan


Fused GPU decode→scan: base64 and hex decode + Aho-Corasick match in a single GPU dispatch.

§Motivation

keyhog’s CPU decode pipeline (decode/pipeline.rs) extracts base64/hex blobs, decodes them on the CPU, and then re-scans the decoded output through the GPU literal-set engine. Each encoded chunk therefore costs a full CPU→GPU round-trip. Vyre’s fused decode builders instead compose decode and AC scan into a single vyre::Program, so decoded bytes never leave VRAM:

encoded bytes (host)
  ↓  upload once
  ↓  base64_decode_then_aho_corasick (one GPU dispatch)
  ↓  readback match triples only
host match offsets

This eliminates roughly 4 GiB of throwaway allocations on a 1 GiB scan split into 512 × 2 MiB shards.

§Architecture

The fused programs are built at scanner compile time alongside the GpuLiteralSet. They share its DFA transition and accept tables (from the literal-set AC automaton) but prepend a decode stage that transforms the encoded input in place before the AC walk.

Two encoding variants are supported:

  • Base64 via vyre_libs::decode::base64_decode_then_aho_corasick
  • Hex via vyre_libs::decode::hex_decode_then_aho_corasick
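
As a rough CPU-side reference for what a fused hex variant computes, the sketch below decodes hex and then scans the decoded bytes for literals. It is not the vyre implementation: the names `hex_decode`, `scan_literals`, and `cpu_hex_decode_then_scan` are hypothetical, a naive scan stands in for the Aho-Corasick walk, and the `(pattern index, start, end)` triple layout is an assumption about what a match triple might contain.

```rust
/// Decode ASCII hex into bytes. Returns None on odd length or a non-hex digit.
fn hex_decode(input: &[u8]) -> Option<Vec<u8>> {
    if input.len() % 2 != 0 {
        return None;
    }
    input
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// Naive multi-literal scan standing in for the Aho-Corasick walk.
/// Returns (pattern index, start, end) triples in decoded-byte offsets
/// (assumed layout, for illustration only).
fn scan_literals(haystack: &[u8], patterns: &[&[u8]]) -> Vec<(usize, usize, usize)> {
    let mut matches = Vec::new();
    for start in 0..haystack.len() {
        for (id, pat) in patterns.iter().enumerate() {
            if !pat.is_empty() && haystack[start..].starts_with(pat) {
                matches.push((id, start, start + pat.len()));
            }
        }
    }
    matches
}

/// Decode, then scan, without handing the decoded buffer back to the caller.
/// On the GPU the decoded bytes would stay in VRAM; here they stay local.
fn cpu_hex_decode_then_scan(
    encoded: &[u8],
    patterns: &[&[u8]],
) -> Option<Vec<(usize, usize, usize)>> {
    let decoded = hex_decode(encoded)?;
    Some(scan_literals(&decoded, patterns))
}
```

Only the match triples cross the function boundary, which mirrors the readback step in the pipeline diagram: the caller never sees the decoded buffer.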

§Fallback

If GPU dispatch fails (no backend, device lost, program compilation error), the caller falls back to the existing CPU decode pipeline. This module never panics on GPU failure.
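
The fallback contract can be pictured with a minimal sketch. Everything here is hypothetical: `GpuError`, `Matches`, and `scan_with_fallback` are illustrative names rather than items from this module, and the GPU and CPU paths are passed in as closures so the example stays self-contained.

```rust
/// Hypothetical failure modes, mirroring the ones listed above.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum GpuError {
    NoBackend,
    DeviceLost,
    CompileFailed,
}

/// Assumed match representation: (pattern index, start, end).
type Matches = Vec<(usize, usize, usize)>;

/// Try the fused GPU path first; on any error, run the CPU pipeline instead.
/// The GPU error is swallowed rather than propagated, so the caller never
/// panics or fails just because the GPU is unavailable.
fn scan_with_fallback<G, C>(encoded: &[u8], gpu: G, cpu: C) -> Matches
where
    G: FnOnce(&[u8]) -> Result<Matches, GpuError>,
    C: FnOnce(&[u8]) -> Matches,
{
    match gpu(encoded) {
        Ok(matches) => matches,
        Err(_err) => cpu(encoded),
    }
}
```

Returning a `Result` from the GPU path, instead of panicking inside it, is what lets the caller make this decision in one place.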

Structs§

FusedDecodeScanPrograms
Compiled fused decode+scan programs, lazily built and cached.

Enums§

FusedEncoding
Supported encoding types for fused GPU decode→scan.

Functions§

build_fused_programs
Build fused decode→scan programs from the same DFA tables the GpuLiteralSet uses.
detect_encoding
Detect likely encoding of a byte slice.
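
The heuristic behind detect_encoding is not documented on this page. One plausible shape for such a detector, offered purely as an assumption, is an alphabet-and-length check. `EncodingGuess` and `guess_encoding` below are hypothetical names, not this module's API.

```rust
/// Hypothetical stand-in for an encoding tag; not the module's FusedEncoding.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum EncodingGuess {
    Hex,
    Base64,
}

/// Guess the encoding from the byte alphabet and length alone.
/// Hex is tested first because every hex digit is also a base64 character.
fn guess_encoding(bytes: &[u8]) -> Option<EncodingGuess> {
    if bytes.is_empty() {
        return None;
    }
    if bytes.len() % 2 == 0 && bytes.iter().all(|b| b.is_ascii_hexdigit()) {
        return Some(EncodingGuess::Hex);
    }
    // Standard base64 allows up to two trailing '=' padding bytes.
    let pad = bytes.iter().rev().take(2).take_while(|&&b| b == b'=').count();
    let body = &bytes[..bytes.len() - pad];
    let is_b64 = |b: &u8| b.is_ascii_alphanumeric() || *b == b'+' || *b == b'/';
    if bytes.len() % 4 == 0 && !body.is_empty() && body.iter().all(is_b64) {
        return Some(EncodingGuess::Base64);
    }
    None
}
```

A check like this is ambiguous by nature: an even-length string such as `deadbeef` is valid in both alphabets, so any real detector has to choose a tie-breaking order or consider both decodings.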