Skip to main content

Module pattern

Module pattern 

Source
Expand description

Curated literal + regex ruleset for prompt-injection detection.

§False-positive tradeoff

This corpus legitimately documents prompt injection (it indexes security material about Midnight and LLM tooling). A naive “contains the word ignore” filter would flag half the security docs. Every rule here therefore requires the imperative verb-object structure of an actual attack, not a mere mention. For example we match ignore all previous instructions but a sentence like “this guide explains how attackers ignore safety rules” lacks the (previous|prior|above)\s+(instructions|...) object and does not hit.

Rules run over super::normalize()d text (already lowercased, homoglyph- and base64-folded), so each regex is written against lowercase ASCII. Every hit’s span is mapped back to the ORIGINAL input bytes via super::normalize::Normalized::original_span.

The compiled ruleset is built once via std::sync::LazyLock.

Structs§

PatternMatch
A single rule hit, with the matched substring and its span in ORIGINAL bytes.
PatternResult
The aggregate result of running the ruleset over one input.

Enums§

Technique
The injection technique a rule detects. Stable wire enum (snake_case).

Functions§

detect
Run the curated ruleset over normalize(input) and map every hit’s span back to the original text.