# structured-email-address
RFC 5321/5322/6531 conformant email address parser, validator, and normalizer for Rust.
[CI](https://github.com/structured-world/structured-email-address/actions/workflows/ci.yml)
[crates.io](https://crates.io/crates/structured-email-address)
[docs.rs](https://docs.rs/structured-email-address)
[License](LICENSE)
## What makes this different?
Every Rust email crate stops at RFC validation. This one goes further:
| Feature | Other crate 1 | Other crate 2 | structured-email-address |
|---|---|---|---|
| RFC 5322 grammar | Partial | Full | Full |
| RFC 6531 (UTF-8) | Yes | Yes | Yes |
| Subaddress/+tag extraction | - | - | **Yes** |
| Provider-aware dot-stripping | - | - | **Yes** |
| Configurable case folding | - | - | **Yes** |
| PSL domain validation | - | - | **Yes** |
| Anti-homoglyph detection | - | - | **Yes** |
| IDN domain Unicode accessor | - | - | **Yes** |
| Display name parsing | Yes | - | **Yes** |
| Configurable strictness | Partial | Partial | **Full** |
| Serde support | Yes | - | **Yes** |
| Zero dependencies* | Yes | nom | `idna` + 3 |
\* Dependencies: `idna`, `unicode-normalization`, `unicode-security`. Optional: `structured-public-domains`, `serde`.
## Quick Start
```rust
use structured_email_address::EmailAddress;
// Parse with defaults (RFC 5322 Standard mode)
let email: EmailAddress = "user+tag@example.com".parse()?;
assert_eq!(email.local_part(), "user+tag");
assert_eq!(email.tag(), Some("tag"));
assert_eq!(email.domain(), "example.com");
// International domains: IDNA roundtrip
let email: EmailAddress = "user@münchen.de".parse()?;
assert_eq!(email.domain(), "xn--mnchen-3ya.de");
assert_eq!(email.domain_unicode(), "münchen.de");
```
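To make the `+tag` accessor concrete, here is a minimal, std-only sketch of how a subaddress can be split off a local part. `split_subaddress` is a hypothetical helper written for illustration, not part of this crate's API, and it deliberately ignores quoted local parts and comments, which a real RFC 5322 parser has to handle.

```rust
/// Hypothetical helper (not the crate's API): splits `local@domain` and
/// extracts an optional `+tag` subaddress from the local part.
/// Returns (user, tag, domain), or None if there is no usable '@'.
fn split_subaddress(address: &str) -> Option<(&str, Option<&str>, &str)> {
    // A domain never contains '@', so split on the last occurrence.
    let (local, domain) = address.rsplit_once('@')?;
    if local.is_empty() || domain.is_empty() {
        return None;
    }
    // Everything after the first '+' is treated as the tag.
    match local.split_once('+') {
        Some((user, tag)) if !user.is_empty() && !tag.is_empty() => {
            Some((user, Some(tag), domain))
        }
        // No separator, or an empty side: keep the local part whole.
        _ => Some((local, None, domain)),
    }
}
```

Splitting on the last `@` rather than the first is what keeps unusual but legal local parts from being cut in the wrong place.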
## Configured Parsing
```rust
use structured_email_address::{EmailAddress, Config};
let config = Config::builder()
    .strip_subaddress()   // user+tag → user
    .dots_gmail_only()    // a.l.i.c.e@gmail.com → alice@gmail.com
    .lowercase_all()      // USER → user
    .check_confusables()  // detect Cyrillic lookalikes
    .domain_check_psl()   // verify domain in Public Suffix List
    .build();
let email = EmailAddress::parse_with("A.L.I.C.E+promo@Gmail.COM", &config)?;
assert_eq!(email.canonical(), "alice@gmail.com");
assert_eq!(email.tag(), Some("promo"));
assert!(email.is_freemail());
```
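The three normalization policies above compose in a fixed order. The sketch below shows one plausible ordering (lowercase, strip the tag, then strip dots for Gmail domains only) using only the standard library. `canonicalize` is an illustrative function, not the crate's implementation, and the hard-coded Gmail domain list is an assumption.

```rust
/// Illustrative only: approximates `strip_subaddress` + `dots_gmail_only`
/// + `lowercase_all` on a plain `local@domain` string.
fn canonicalize(address: &str) -> Option<String> {
    let (local, domain) = address.rsplit_once('@')?;
    // lowercase_all: fold both sides (ASCII only in this sketch).
    let domain = domain.to_ascii_lowercase();
    let local = local.to_ascii_lowercase();
    // strip_subaddress: keep everything before the first '+'.
    let user = local.split('+').next().unwrap_or("");
    // dots_gmail_only: dots are insignificant only for Gmail domains.
    let user: String = if domain == "gmail.com" || domain == "googlemail.com" {
        user.chars().filter(|c| *c != '.').collect()
    } else {
        user.to_string()
    };
    if user.is_empty() || domain.is_empty() {
        return None;
    }
    Some(format!("{user}@{domain}"))
}
```

Note that dots survive for non-Gmail domains, which is the point of a provider-scoped rule: `john.doe@example.org` and `johndoe@example.org` may be different mailboxes.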
## Provider-Aware Normalization
Each known provider carries its own rule (dot handling, case folding, subaddress
separator, freemail flag). Enable `provider_aware()` to normalize a matched
address by its provider's rule instead of the global policies, and register your
own providers:
```rust
use structured_email_address::{Config, EmailAddress, ProviderRule};
let config = Config::builder()
    .provider_aware()     // matched provider's rule governs the address
    .strip_subaddress()
    .add_provider(        // extend the built-in registry
        ProviderRule::new(["mail.corp.example"])
            .strip_dots(true)
            .lowercase_local(true)
            .subaddress_separator(Some('-')),
    )
    .build();
// Gmail's built-in rule strips dots and folds case even though no global dot or case policy is set:
let g = EmailAddress::parse_with("A.Li.Ce+promo@Gmail.com", &config)?;
assert_eq!(g.canonical(), "alice@gmail.com");
// Custom provider with a '-' separator:
let c = EmailAddress::parse_with("John.Doe-tag@mail.corp.example", &config)?;
assert_eq!(c.local_part(), "johndoe");
assert_eq!(c.tag(), Some("tag"));
```
Built-in providers: Gmail/Googlemail (dot-stripping), Outlook, Yahoo, ProtonMail,
iCloud, Yandex, Mail.ru, and other common freemail domains. `is_freemail()`
consults the same registry regardless of `provider_aware`.
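As a rough mental model, a provider registry can be pictured as a map from domain to rule, consulted before any global policy. The sketch below is a std-only approximation under that assumption. `Rule`, `Registry`, and their methods are hypothetical names, and the crate's actual `ProviderRule` may be structured differently.

```rust
use std::collections::HashMap;

/// Hypothetical per-provider rule (not the crate's `ProviderRule`).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rule {
    strip_dots: bool,
    lowercase_local: bool,
    separator: Option<char>,
}

/// Hypothetical registry keyed by lowercase domain.
struct Registry {
    rules: HashMap<String, Rule>,
}

impl Registry {
    /// Seeds the registry with a Gmail-style rule only.
    fn with_gmail() -> Self {
        let gmail = Rule { strip_dots: true, lowercase_local: true, separator: Some('+') };
        let mut rules = HashMap::new();
        rules.insert("gmail.com".to_string(), gmail);
        rules.insert("googlemail.com".to_string(), gmail);
        Registry { rules }
    }

    /// Registers (or overrides) the rule for a domain.
    fn add(&mut self, domain: &str, rule: Rule) {
        self.rules.insert(domain.to_ascii_lowercase(), rule);
    }

    /// Returns (normalized local part, tag). Unknown domains pass through untouched.
    fn normalize(&self, address: &str) -> Option<(String, Option<String>)> {
        let (local, domain) = address.rsplit_once('@')?;
        let Some(rule) = self.rules.get(&domain.to_ascii_lowercase()) else {
            return Some((local.to_string(), None));
        };
        // Split on the provider's own separator, if it has one.
        let (user, tag) = match rule.separator.and_then(|sep| local.split_once(sep)) {
            Some((user, tag)) => (user, Some(tag.to_string())),
            None => (local, None),
        };
        let mut user = user.to_string();
        if rule.strip_dots {
            user.retain(|c| c != '.');
        }
        if rule.lowercase_local {
            user = user.to_lowercase();
        }
        Some((user, tag))
    }
}
```

Keying the map by lowercase domain means lookups are case-insensitive without touching the local part unless the matched rule asks for it.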
## Display Names
```rust
use structured_email_address::{EmailAddress, Config};
let config = Config::builder().allow_display_name().build();
let email = EmailAddress::parse_with("John Doe <user@example.com>", &config)?;
assert_eq!(email.display_name(), Some("John Doe"));
```
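For intuition, the `Name <addr>` form can be taken apart with a few string operations. The following is a simplified sketch with a hypothetical `split_display_name` helper. It handles only the plain and double-quoted name forms and does not implement RFC 5322 comments, escapes, or encoded words.

```rust
/// Hypothetical helper: splits `Display Name <addr>` into (name, addr).
/// Bare addresses yield `None` for the name.
fn split_display_name(input: &str) -> Option<(Option<&str>, &str)> {
    let input = input.trim();
    match input.strip_suffix('>') {
        Some(rest) => {
            // The address starts after the last '<'.
            let (name, addr) = rest.rsplit_once('<')?;
            // Drop surrounding whitespace and optional double quotes.
            let name = name.trim().trim_matches('"');
            let name = if name.is_empty() { None } else { Some(name) };
            Some((name, addr.trim()))
        }
        // No angle brackets: treat the whole input as the address.
        None => Some((None, input)),
    }
}
```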
## Batch Parsing
Parse thousands of addresses in one call. The config is shared across all inputs, and results preserve input order:
```rust
use structured_email_address::{EmailAddress, Config};
let config = Config::builder().strip_subaddress().lowercase_all().build();
let results = EmailAddress::parse_batch(
    &["alice@example.com", "invalid", "bob+tag@example.org"],
    &config,
);
assert!(results[0].is_ok());
assert!(results[1].is_err());
assert!(results[2].is_ok());
```
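Because results come back in input order, a failure can be traced to its source row by index. The sketch below illustrates that pattern with a toy validator. `parse_one`, `parse_all`, and `split_results` are hypothetical stand-ins, not this crate's functions, and the validation is intentionally minimal.

```rust
/// Toy stand-in for a real parser: accepts any non-empty `local@domain`.
fn parse_one(input: &str) -> Result<(String, String), String> {
    match input.rsplit_once('@') {
        Some((local, domain)) if !local.is_empty() && !domain.is_empty() => {
            Ok((local.to_string(), domain.to_ascii_lowercase()))
        }
        _ => Err(format!("not an address: {input}")),
    }
}

/// One result per input, in the same order as the inputs.
fn parse_all(inputs: &[&str]) -> Vec<Result<(String, String), String>> {
    inputs.iter().map(|s| parse_one(s)).collect()
}

/// Separates successes from failures, keeping each failure's input index.
fn split_results(
    results: Vec<Result<(String, String), String>>,
) -> (Vec<(String, String)>, Vec<(usize, String)>) {
    let mut valid = Vec::new();
    let mut invalid = Vec::new();
    for (index, result) in results.into_iter().enumerate() {
        match result {
            Ok(address) => valid.push(address),
            Err(error) => invalid.push((index, error)),
        }
    }
    (valid, invalid)
}
```

Keeping the index alongside each error makes it straightforward to report which row of an import file was rejected.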
For large lists (10K+), enable the `rayon` feature for parallel parsing:
```toml
[dependencies]
structured-email-address = { version = "0.0.1", features = ["rayon"] }
```
```rust,ignore
let results = EmailAddress::parse_batch_par(&huge_list, &config);
```
### Batch Benchmarks (baseline)
100K emails (mix of valid + invalid), `strip_subaddress` + `dots_gmail_only` + `lowercase_all` config.
Apple M1 Pro, Rust 1.85, `cargo bench --all-features`.
| Method | Time | Throughput |
|---|---|---|
| `parse_batch` (sequential) | 49.1 ms | ~2.0M emails/sec |
| `parse_batch_par` (rayon) | 9.6 ms | ~10.4M emails/sec |
Rayon gives a ~5x speedup on this workload.
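The throughput and speedup figures follow directly from the measured times. A small sketch of the arithmetic, with hypothetical helper names:

```rust
/// Items processed per second, given a count and elapsed milliseconds.
fn emails_per_sec(count: u64, elapsed_ms: f64) -> f64 {
    count as f64 / (elapsed_ms / 1000.0)
}

/// Ratio of sequential to parallel wall-clock time.
fn speedup(sequential_ms: f64, parallel_ms: f64) -> f64 {
    sequential_ms / parallel_ms
}
```

With 100K emails, 49.1 ms works out to roughly 2.0M emails/sec and 9.6 ms to roughly 10.4M emails/sec, a ratio of about 5.1.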
## Strictness Levels
| Level | Grammar | Use case |
|---|---|---|
| `Strict` | RFC 5321 (envelope) | SMTP validation, reject exotic addresses |
| `Standard` | RFC 5322 (header) | Default: full grammar, no obsolete forms |
| `Lax` | RFC 5322 + obs-* | Legacy system interop |
## Features
| Feature | Default | Description |
|---|---|---|
| `serde` | Yes | Serialize/deserialize as canonical string |
| `psl` | Yes | Domain validation against Public Suffix List |
| `rayon` | No | Parallel batch parsing via `parse_batch_par()` |
```toml
[dependencies]
# Minimal (no serde, no PSL)
structured-email-address = { version = "0.0.1", default-features = false }
```
## Anti-Homoglyph Protection
Detects visually confusable email addresses using Unicode skeleton mapping:
```rust
use structured_email_address::confusable_skeleton;
// Cyrillic 'а' (U+0430) vs Latin 'a' (U+0061)
assert_eq!(
    confusable_skeleton("аlice"), // Cyrillic а
    confusable_skeleton("alice"), // Latin a
);
```
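The idea behind skeleton comparison is to map each character to a representative look-alike and compare the results. The toy version below covers only five Cyrillic letters and is meant purely as an illustration. The real mapping comes from the Unicode confusables data (UTS #39), which is far larger and also involves normalization. `toy_skeleton` and `looks_alike` are hypothetical names.

```rust
/// Toy skeleton: maps a handful of Cyrillic letters to their Latin look-alikes.
/// A real implementation uses the full Unicode confusables table.
fn toy_skeleton(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '\u{0430}' => 'a', // Cyrillic а
            '\u{0435}' => 'e', // Cyrillic е
            '\u{043E}' => 'o', // Cyrillic о
            '\u{0440}' => 'p', // Cyrillic р
            '\u{0441}' => 'c', // Cyrillic с
            other => other,
        })
        .collect()
}

/// Two distinct strings are confusable if their skeletons are equal.
fn looks_alike(a: &str, b: &str) -> bool {
    a != b && toy_skeleton(a) == toy_skeleton(b)
}
```

Comparing skeletons rather than raw strings is what lets a registration check flag a spoofed address as colliding with an existing one even though the two differ byte for byte.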
## Conformance
Validated against the [isEmail](https://github.com/dominicsayers/isemail) test
suite (v3.05, 164 edge cases), the same corpus used by `email-address-parser`.
All 164 cases pass: valid addresses (RFC 5321/5322 quoted strings, IPv4/IPv6
address literals, comments, folding whitespace, obsolete forms) are accepted at
the appropriate strictness level, while malformed inputs (bad IP literals,
over-length parts, bare control characters) are rejected. See
[`tests/conformance.rs`](tests/conformance.rs).
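One category in that list, over-length parts, is easy to illustrate in isolation. The sketch below checks the commonly cited limits of 64 octets for the local part (RFC 5321), 255 octets for the domain (RFC 5321), and 63 octets per DNS label (RFC 1035). `check_lengths` is a hypothetical helper, not the crate's validator, and it assumes a hostname domain rather than an address literal.

```rust
/// Hypothetical length check for a plain `local@hostname` address.
/// Lengths are measured in octets (bytes), as the RFCs specify.
fn check_lengths(address: &str) -> Result<(), &'static str> {
    let (local, domain) = address.rsplit_once('@').ok_or("missing '@'")?;
    if local.is_empty() {
        return Err("empty local part");
    }
    if local.len() > 64 {
        return Err("local part over 64 octets");
    }
    if domain.is_empty() || domain.len() > 255 {
        return Err("domain empty or over 255 octets");
    }
    // Each dot-separated label must be 1..=63 octets.
    if domain.split('.').any(|label| label.is_empty() || label.len() > 63) {
        return Err("domain label empty or over 63 octets");
    }
    Ok(())
}
```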
## Support the Project
<div align="center">

USDT (TRC-20): `TFDsezHa1cBkoeZT5q2T49Wp66K8t2DmdA`
</div>
## License
Apache License 2.0