Crate zero_width_strip

§zero-width-strip

Strip zero-width and bidi-control Unicode characters from text.

Zero-width characters and directional marks (U+200B–U+200F, U+2060, U+FEFF, etc.) and bidi embedding and override controls (U+202A–U+202E) are invisible in most renderers but are preserved by most tokenizers, which makes them a convenient payload channel for prompt-injection attacks (“invisible instructions” hidden inside otherwise plain text).

This crate detects and strips them.
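
To make the idea concrete, here is a minimal standalone sketch of how such a filter could work. It is not the crate's source: the `_sketch` names are hypothetical, and the character set covers only the code points named above, whereas the crate's real table may be larger (note the "etc.").

```rust
/// Hypothetical predicate covering only the code points named in the
/// description above; the real crate may treat more characters as invisible.
fn is_invisible_sketch(c: char) -> bool {
    matches!(
        c,
        '\u{200B}'..='\u{200F}'   // ZWSP, ZWNJ, ZWJ, LRM, RLM
        | '\u{202A}'..='\u{202E}' // LRE, RLE, PDF, LRO, RLO
        | '\u{2060}'              // word joiner
        | '\u{FEFF}'              // zero-width no-break space / BOM
    )
}

/// Assumed stand-in for `has_invisible`: scans until the first match.
fn has_invisible_sketch(s: &str) -> bool {
    s.chars().any(is_invisible_sketch)
}

/// Assumed stand-in for `strip`: builds a new String without the
/// matching characters, leaving everything else untouched.
fn strip_sketch(s: &str) -> String {
    s.chars().filter(|c| !is_invisible_sketch(*c)).collect()
}
```

Filtering by `char` rather than by byte keeps multi-byte UTF-8 sequences intact, so visible non-ASCII text passes through unchanged.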

§Example

use zero_width_strip::{strip, has_invisible};
let dirty = "hello\u{200B}\u{202E}world";
assert!(has_invisible(dirty));
assert_eq!(strip(dirty), "helloworld");

Functions§

has_invisible
True when the input contains any zero-width or bidi-override char.
strip
Return a copy of s with every zero-width / bidi-override char removed.
strip_into
Strip into a caller-provided buffer (avoids an allocation).
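
The signature of `strip_into` is not shown on this page, so the following is only a sketch of the buffer-reuse pattern it describes, under the assumption that the buffer is a `&mut String` that the function appends to. All `_sketch` names are hypothetical, and the character set is limited to the code points named in the crate description.

```rust
/// Hypothetical predicate; the real crate's character set may be larger.
fn is_invisible_sketch(c: char) -> bool {
    matches!(
        c,
        '\u{200B}'..='\u{200F}' | '\u{202A}'..='\u{202E}' | '\u{2060}' | '\u{FEFF}'
    )
}

/// Assumed shape of a buffer-based strip: appends the cleaned text to `out`.
/// The caller decides when to clear the buffer.
fn strip_into_sketch(s: &str, out: &mut String) {
    // The cleaned text is never longer than the input, so one reserve suffices.
    out.reserve(s.len());
    out.extend(s.chars().filter(|c| !is_invisible_sketch(*c)));
}

/// Example caller: cleans many lines while reusing a single allocation,
/// returning the byte length of each cleaned line.
fn clean_lengths_sketch(lines: &[&str]) -> Vec<usize> {
    let mut buf = String::new();
    let mut lengths = Vec::with_capacity(lines.len());
    for line in lines {
        buf.clear(); // keeps capacity, drops the previous contents
        strip_into_sketch(line, &mut buf);
        lengths.push(buf.len());
    }
    lengths
}
```

Reusing one buffer this way pays off in loops over many short inputs, where a fresh `String` per call would otherwise allocate each time.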