Provides char classification for mail-related grammar parts (charsets), i.e. whether a given char belongs to the characters valid in atext, ctext, dtext, token, etc.
The Charset enum determines which set of characters is used. To check whether a char is in that set, either use Charset::contains(&self, char) or use ch.is(charset), which is provided through the CharMatchExt extension trait.
§Why Ws is merged into CText, DText and QText
Any grammar part in which qtext/ctext/dtext is used has a form which (1) repeats and (2) prepends FWS in the repeating part.
This means any parser has to accept chars which are qtext/ctext/dtext OR ws (with special handling if it hits another character such as "\r", which indicates the start of a soft line break, etc.).
For example, for dtext the grammar is ... *([FWS] dtext) [FWS] ..., which you can validate by parsing chars which are either dtext or ws. If you hit a "\r" (which, by the way, is not in ws), you make sure it's followed by "\n " or "\n\t" and then continue parsing.
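The validation described above can be sketched roughly as follows. This is a self-contained illustration that does not use this crate's API: is_dtext, is_ws and is_valid_dtext_run are hypothetical helpers, and the dtext range is assumed from RFC 5322 (printable US-ASCII except "[", "]" and "\"). It is deliberately lenient and ignores obsolete syntax such as quoted-pairs.

```rust
/// Assumed dtext definition (RFC 5322): %d33-90 / %d94-126,
/// i.e. printable US-ASCII except '[', '\\' and ']'.
fn is_dtext(ch: char) -> bool {
    matches!(ch, '!'..='Z' | '^'..='~')
}

/// ws is a space or a horizontal tab; note that '\r' is NOT part of it.
fn is_ws(ch: char) -> bool {
    ch == ' ' || ch == '\t'
}

/// Hypothetical helper: checks a `*([FWS] dtext) [FWS]` run by accepting
/// chars which are dtext OR ws, with special handling for "\r".
fn is_valid_dtext_run(input: &str) -> bool {
    let mut chars = input.chars();
    while let Some(ch) = chars.next() {
        if is_dtext(ch) || is_ws(ch) {
            continue;
        }
        if ch == '\r' {
            // A soft line break must be "\r\n" followed by a ws char.
            if chars.next() != Some('\n') {
                return false;
            }
            match chars.next() {
                Some(c) if is_ws(c) => continue,
                _ => return false,
            }
        }
        // Anything else is neither dtext, ws nor a soft line break.
        return false;
    }
    true
}
```

With these helpers, is_valid_dtext_run("a\r\n b") is accepted, while is_valid_dtext_run("a\r\nb") is rejected because the line break is not followed by ws. Merging ws into the text charset lets the first two checks collapse into a single table lookup.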
§Alternative interface
All enum variants are re-exported under a module named after the RFC in which they are specified. E.g. Charset::CText is also available as rfc5322::CText.
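As an illustration of the re-export pattern itself, here is a minimal sketch using a stand-in enum. It is not this crate's source: the enum below is a hypothetical, reduced Charset with only three variants, and the module contents are assumptions chosen for the example.

```rust
/// Hypothetical, reduced stand-in for the real `Charset` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Charset {
    AText,
    CText,
    Token,
}

/// Variants grouped under the RFC that specifies them (assumed grouping).
pub mod rfc5322 {
    pub use super::Charset::{AText, CText};
}

pub mod rfc2045 {
    pub use super::Charset::Token;
}
```

Because `pub use` only creates another path to the same variant, rfc2045::Token and Charset::Token compare equal and can be used interchangeably wherever a Charset is expected.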
§Example
extern crate mail_chars;
use mail_chars::{Charset, rfc5322, rfc2045, CharMatchExt};

fn main() {
    assert!(Charset::AText.contains('d'));
    assert!('d'.is(Charset::AText));
    assert!('d'.is(rfc5322::AText));

    // `rfc*::*` are just re-exports grouped by RFC.
    assert_eq!(Charset::Token, rfc2045::Token);

    // If we want to test for more than one charset we can use lookup.
    let res = Charset::lookup('.');
    // This has the benefit that there is an is_ascii method.
    assert!(res.is_ascii());
    assert!(res.is(rfc2045::Token));
    assert!(res.is(rfc5322::CTextWs));
    assert!(!res.is(rfc5322::AText));
}
Modules§
- rfc2045
- Re-export of all charsets (Charset::… variants) from rfc2045.
- rfc5322
- Re-export of all charsets (Charset::… variants) from rfc5322.
- rfc6838
- Re-export of all charsets (Charset::… variants) from rfc6838.
- rfc7230
- Re-export of all charsets (Charset::… variants) from rfc7230.
Structs§
- LookupResult
- Represents the result of a lookup of a char.
Enums§
- Charset
- An enum for the charsets represented through an internal lookup table.