§bom-strip
Strip UTF-8/16/32 BOMs and stray U+FEFF code points from text.
A leading byte order mark breaks serde_json::from_str, hash-based
deduplication, and config parsers that don’t allow leading
whitespace. This crate gives you four small functions:
- strip_str: strip a leading U+FEFF from a &str.
- strip_all: strip every U+FEFF in the input, not just a leading one.
- strip_bytes: strip a leading UTF-8 / UTF-16 LE/BE / UTF-32 LE/BE BOM from a &[u8].
- detect_bom: identify which BOM (if any) leads a &[u8].
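To make the behaviour of these four functions concrete, here is a minimal std-only sketch of how such helpers could be written. It is not the crate's actual implementation: the `BomKind` enum and every `sketch_*` name are hypothetical stand-ins, and only the `Utf16Le` variant name is taken from the crate's own example. One detail worth noting is the check order. The UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE), so this sketch tests the 4-byte patterns first. The crate may resolve that ambiguity differently.

```rust
/// Hypothetical stand-in for the crate's `Bom` enum (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BomKind {
    Utf8,
    Utf16Le,
    Utf16Be,
    Utf32Le,
    Utf32Be,
}

impl BomKind {
    /// Number of bytes the BOM occupies at the start of the input.
    fn byte_len(self) -> usize {
        match self {
            BomKind::Utf8 => 3,
            BomKind::Utf16Le | BomKind::Utf16Be => 2,
            BomKind::Utf32Le | BomKind::Utf32Be => 4,
        }
    }
}

/// Sketch of BOM detection. The 4-byte UTF-32 patterns are checked
/// before the 2-byte UTF-16 ones because FF FE 00 00 starts with FF FE.
fn sketch_detect(b: &[u8]) -> Option<BomKind> {
    if b.starts_with(&[0xFF, 0xFE, 0x00, 0x00]) {
        Some(BomKind::Utf32Le)
    } else if b.starts_with(&[0x00, 0x00, 0xFE, 0xFF]) {
        Some(BomKind::Utf32Be)
    } else if b.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some(BomKind::Utf8)
    } else if b.starts_with(&[0xFF, 0xFE]) {
        Some(BomKind::Utf16Le)
    } else if b.starts_with(&[0xFE, 0xFF]) {
        Some(BomKind::Utf16Be)
    } else {
        None
    }
}

/// Sketch of byte-level stripping: drop the detected BOM, or return
/// the input unchanged when no BOM leads it.
fn sketch_strip_bytes(b: &[u8]) -> &[u8] {
    match sketch_detect(b) {
        Some(kind) => &b[kind.byte_len()..],
        None => b,
    }
}

/// Sketch of leading-only stripping on already-decoded text.
fn sketch_strip_str(s: &str) -> &str {
    s.strip_prefix('\u{FEFF}').unwrap_or(s)
}

/// Sketch of stripping every U+FEFF; this allocates a new String.
fn sketch_strip_all(s: &str) -> String {
    s.replace('\u{FEFF}', "")
}
```

In this sketch the leading-only helpers borrow from their input and never allocate, while the strip-everything variant has to build a new `String` because it may remove code points from the middle of the text.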
§Example
use bom_strip::{strip_str, strip_bytes, detect_bom, Bom};
assert_eq!(strip_str("\u{FEFF}hello"), "hello");
assert_eq!(strip_bytes(&[0xEF, 0xBB, 0xBF, b'h', b'i']), &[b'h', b'i']);
assert_eq!(detect_bom(&[0xFF, 0xFE, b'a', 0]), Some(Bom::Utf16Le));
Enums§
- Bom: Identified BOM kind.
Functions§
- detect_bom: Detect which BOM (if any) leads b.
- strip_all: Strip every U+FEFF (BOM and zero-width no-break space) in s.
- strip_bytes: Strip a leading BOM from b. Returns the input unchanged if none.
- strip_str: Strip a leading U+FEFF from s.