mailrs-rfc2047
RFC 2047 MIME encoded-word decoder. Decodes =?charset?(B|Q)?text?=
header values (Subject, From display name, …) into UTF-8.
Supports the full WHATWG Encoding charset set via encoding_rs:
UTF-8, ISO-8859-*, Windows-125*, ISO-2022-JP, Shift_JIS, EUC-JP,
EUC-KR, Big5, GB18030, etc. Unknown charsets fall through to a
lossy UTF-8 pass.
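As an illustration of the token shape described above, here is a minimal std-only sketch that splits one encoded-word into its charset, encoding and text fields. The names `EncodedWord` and `split_encoded_word` are hypothetical and are not part of this crate's API; the crate's own parser may differ.

```rust
/// The three fields of one RFC 2047 encoded-word, borrowed from the input.
/// (Hypothetical type for illustration; not part of mailrs-rfc2047.)
#[derive(Debug, PartialEq)]
struct EncodedWord<'a> {
    charset: &'a str,
    encoding: char, // 'B' or 'Q', upper-cased
    text: &'a str,
}

/// Split a single `=?charset?encoding?text?=` token into its fields.
/// Returns `None` if the token is not shaped like an encoded-word.
fn split_encoded_word(token: &str) -> Option<EncodedWord<'_>> {
    let inner = token.strip_prefix("=?")?.strip_suffix("?=")?;
    let mut parts = inner.splitn(3, '?');
    let charset = parts.next()?;
    let encoding = parts.next()?;
    let text = parts.next()?;

    // The encoding field must be exactly one character: B or Q, any case.
    let mut chars = encoding.chars();
    let encoding = match (chars.next(), chars.next()) {
        (Some(c), None) => c.to_ascii_uppercase(),
        _ => return None,
    };
    if charset.is_empty() || !matches!(encoding, 'B' | 'Q') || text.contains('?') {
        return None;
    }
    Some(EncodedWord { charset, encoding, text })
}
```

Calling `split_encoded_word("=?UTF-8?q?caf=C3=A9?=")` yields the charset `UTF-8`, the encoding `Q` and the still-encoded text `caf=C3=A9`; plain words such as `Hello` yield `None`.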
Quickstart
use mailrs_rfc2047::decode;
// ASCII inputs are returned borrowed — no allocation.
assert_eq!;
// Base64 encoded UTF-8.
assert_eq!;
// Q (quoted-printable) encoded.
assert_eq!;
// ISO-2022-JP (Japanese subject from real-world mail).
assert_eq!;
// Adjacent encoded-words collapse whitespace per RFC 2047 §6.2.
assert_eq!;
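To make the Q encoding and the RFC 2047 §6.2 whitespace rule concrete, the following std-only sketch decodes UTF-8 Q-encoded words and drops the whitespace between two adjacent encoded-words. It is a simplification under stated assumptions, not this crate's implementation: it only recognises the exact prefix `=?UTF-8?Q?`, it normalises every other whitespace run to a single space, and the helper names are hypothetical.

```rust
/// Decode a Q-encoded payload into raw bytes.
/// `_` means space, `=XX` is a hex-escaped byte, everything else is literal.
fn q_decode(text: &str) -> Option<Vec<u8>> {
    let bytes = text.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'_' => {
                out.push(b' ');
                i += 1;
            }
            b'=' => {
                // Need exactly two hex digits after '='.
                let hex = bytes.get(i + 1..i + 3)?;
                if !hex.iter().all(u8::is_ascii_hexdigit) {
                    return None;
                }
                let hex = std::str::from_utf8(hex).ok()?;
                out.push(u8::from_str_radix(hex, 16).ok()?);
                i += 3;
            }
            b => {
                out.push(b);
                i += 1;
            }
        }
    }
    Some(out)
}

/// Hypothetical helper: decode one token if it is a UTF-8 Q encoded-word.
/// Assumption: only the literal prefix `=?UTF-8?Q?` is recognised here.
fn decode_word(token: &str) -> Option<String> {
    let text = token.strip_prefix("=?UTF-8?Q?")?.strip_suffix("?=")?;
    String::from_utf8(q_decode(text)?).ok()
}

/// Decode a header value, omitting the separator between two adjacent
/// encoded-words and emitting a single space between all other tokens.
fn decode_header(header: &str) -> String {
    let mut out = String::with_capacity(header.len());
    let mut prev_encoded = false;
    for (i, token) in header.split_whitespace().enumerate() {
        let decoded = decode_word(token);
        let encoded = decoded.is_some();
        if i > 0 && !(prev_encoded && encoded) {
            out.push(' ');
        }
        match decoded {
            Some(s) => out.push_str(&s),
            None => out.push_str(token),
        }
        prev_encoded = encoded;
    }
    out
}
```

With this sketch, `decode_header("=?UTF-8?Q?caf=C3=A9?= =?UTF-8?Q?_au_lait?=")` produces `café au lait`: the space between the two encoded-words is dropped, and the space that survives comes from the `_` inside the second word.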
Pairing with mailrs-rfc5322
This crate is the typical companion of mailrs-rfc5322.
mailrs_rfc5322::Message::header() returns raw header bytes; pass those
bytes to mailrs_rfc2047::decode() to get the decoded text:
// (this example uses the `mailrs-rfc5322` crate as well; both ship
// independently. add both to your Cargo.toml to compile.)
use mailrs_rfc5322::Message;
use mailrs_rfc2047::decode;
What this crate is not
- Not an RFC 5322 parser. Use mailrs-rfc5322 for that.
- Not a MIME body decoder (multipart, Content-Transfer-Encoding).
  This only decodes encoded-words in headers.
- Not a charset detector. The charset is taken verbatim from the
  encoded-word token; if a message claims =?UTF-8?Q?…?= and the bytes
  are actually Shift_JIS, you get garbage.
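The charset-mismatch case in the last bullet can be reproduced with the standard library alone. The sketch below uses `String::from_utf8_lossy` as a stand-in for a decoder that trusts a claimed UTF-8 label; the function and constant names are hypothetical, and the crate's actual output for such input may differ in detail.

```rust
/// The word "日本" encoded as Shift_JIS (two double-byte characters).
const NIHON_SHIFT_JIS: [u8; 4] = [0x93, 0xFA, 0x96, 0x7B];

/// Decode bytes under the *claimed* charset (UTF-8 here) without checking
/// whether the claim is true. Invalid sequences become U+FFFD.
/// (Stand-in for illustration; not the crate's decoder.)
fn decode_as_claimed_utf8(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Feeding `NIHON_SHIFT_JIS` to `decode_as_claimed_utf8` does not give back `日本`: the bytes that are not valid UTF-8 turn into U+FFFD replacement characters, and the final byte `0x7B` shows up as a stray `{`. Nothing in the output hints at the real charset, which is why detection has to happen elsewhere.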
Performance
Measured numbers are in BUDGETS.md. Reproduce them with
cargo bench -p mailrs-rfc2047 --bench decode.
Headline: plain-ASCII inputs are returned as Cow::Borrowed(&str)
with zero allocations; the only work is a single forward scan for
=?, so the cost is linear in the input length. Encoded inputs go
through one String allocation sized to the input length.
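The borrowed fast path described above follows a common `Cow` pattern, sketched here with the standard library only. `decode_sketch` is a hypothetical name, it takes `&str` for simplicity (an assumption; the crate's real signature may take bytes), and its slow path is a placeholder that copies the input instead of decoding it.

```rust
use std::borrow::Cow;

/// Return the input borrowed when it contains no `=?` marker, otherwise
/// build an owned `String` pre-sized to the input length.
/// (Illustrative sketch; the owned branch does not actually decode.)
fn decode_sketch(input: &str) -> Cow<'_, str> {
    // Fast path: one forward scan, no allocation.
    if !input.contains("=?") {
        return Cow::Borrowed(input);
    }
    // Slow path: a single allocation sized to the input. Decoded text is
    // typically no longer than its encoded form, so this capacity is a
    // reasonable upper bound for the common case.
    let mut out = String::with_capacity(input.len());
    out.push_str(input); // placeholder for real encoded-word decoding
    Cow::Owned(out)
}
```

Callers that only read the result can use it through `Deref<Target = str>` without caring which variant they got; those that need ownership call `into_owned()`, which is free for the `Owned` case and allocates only for borrowed input.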
License
Apache-2.0 OR MIT.