psl2 0.1.1

A modern alternative to the psl crate: Mozilla's Public Suffix List with built-in IDNA, fast builds, no_std support, and a clean API.
psl2

A modern alternative to the psl crate for working with Mozilla's Public Suffix List.

psl2 tells you, reliably, whether a hostname is a registrable domain (one that can own cookies — e.g. example.co.uk) or a public suffix (an "extension" under which names are registered — e.g. co.uk).

// The public suffix ("effective TLD"):
assert_eq!(psl2::suffix("www.example.co.uk").as_deref(), Some("co.uk"));

// The registrable domain (eTLD + 1) — the cookie domain:
assert_eq!(
    psl2::registrable_domain("www.example.co.uk").as_deref(),
    Some("example.co.uk"),
);

// A bare public suffix has no registrable domain:
assert_eq!(psl2::registrable_domain("co.uk"), None);
assert!(psl2::is_public_suffix("co.uk"));

Internationalized domains work out of the box (inputs and outputs are normalized to ASCII/punycode):

assert_eq!(
    psl2::registrable_domain("食狮.公司.cn").as_deref(),
    Some("xn--85x722f.xn--55qx5d.cn"),
);

no_std / no_alloc

The crate is #![no_std], and its core lookup allocates nothing. Pass an already-lowercased ASCII/punycode hostname to lookup and read back borrowed slices — this works on bare-metal targets with no allocator:

let d = psl2::lookup("www.example.co.uk").unwrap();
assert_eq!(d.suffix(), "co.uk");
assert_eq!(d.registrable_domain(), Some("example.co.uk"));
assert_eq!(d.subdomain(), Some("www"));

The ergonomic analyze/suffix/… functions that normalize arbitrary or Unicode input live behind the (default) alloc / idna features.
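As a rough illustration of why the core path needs no allocator, the sketch below splits a host into borrowed slices once the length of the suffix is known. It is not psl2's implementation: Parts and split_at_suffix are hypothetical names, and the number of suffix labels is passed in by hand instead of being looked up in the list.

```rust
/// Borrowed views into the input host; nothing here owns memory.
#[derive(Debug, PartialEq)]
struct Parts<'a> {
    suffix: &'a str,
    registrable_domain: Option<&'a str>,
    subdomain: Option<&'a str>,
}

/// Split `host` given how many trailing labels form the public suffix.
/// Hypothetical helper: a real lookup would derive `suffix_labels` from the list.
fn split_at_suffix(host: &str, suffix_labels: usize) -> Option<Parts<'_>> {
    if host.is_empty() || suffix_labels == 0 {
        return None;
    }
    // Byte offsets where each label starts, enumerated from the right;
    // offset 0 (the leftmost label) comes last.
    let mut starts = host
        .rmatch_indices('.')
        .map(|(i, _)| i + 1)
        .chain(core::iter::once(0));

    // The n-th start from the right is where the last n labels begin.
    let suffix_start = starts.nth(suffix_labels - 1)?;
    // One more label to the left gives the registrable domain, if present.
    let registrable_start = starts.next();
    // Everything left of that, minus the separating dot, is the subdomain.
    let subdomain = match registrable_start {
        Some(i) if i > 0 => Some(&host[..i - 1]),
        _ => None,
    };

    Some(Parts {
        suffix: &host[suffix_start..],
        registrable_domain: registrable_start.map(|i| &host[i..]),
        subdomain,
    })
}
```

With a two-label suffix, split_at_suffix("www.example.co.uk", 2) yields the same three pieces as the lookup example above, and split_at_suffix("co.uk", 2) has neither a registrable domain nor a subdomain. Every field is a slice of the caller's string, so only integer offsets are computed.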

Why another crate?

The existing psl and publicsuffix crates work, but have rough edges. psl2 is designed around a few principles:

  • Fast builds, fast lookups. The list is normalized to ASCII at publish time and embedded as a flattened reversed-label trie via include_bytes!, walked from the TLD inward with no allocation. There is no build.rs, no procedural-macro codegen, and no per-build list processing, so adding psl2 costs almost nothing in compile time. A typical lookup is a few tens of nanoseconds and does not slow down for deep hostnames.
  • no_std + no_alloc core, usable on embedded targets.
  • Built-in IDNA. You pass a &str hostname — Unicode or not — and psl2 normalizes it for you. No need to punycode-encode input yourself.
  • Clean, explicit API over &str, with ICANN / private / unknown classification exposed.
  • Always current. A scheduled GitHub Action republishes the crate whenever the upstream list changes. The bundled list version is available at runtime via psl2::psl_version().
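The reversed-label walk from the first bullet can be sketched roughly as follows. This is a simplified stand-in, not the crate's actual byte encoding: the Node layout, the NODES table, and longest_suffix are all hypothetical, the rule set is a three-entry toy, and wildcard rules, exception rules, and unknown TLDs are ignored.

```rust
/// One trie node. Children are indices into the same static table,
/// so the whole structure can live in read-only memory.
struct Node {
    label: &'static str,
    /// True if the path from the root to this node is a public suffix rule.
    is_rule: bool,
    children: &'static [usize],
}

// Toy rule set (an assumption, not the real list): "uk", "co.uk", "com".
static NODES: [Node; 4] = [
    Node { label: "", is_rule: false, children: &[1, 3] }, // root
    Node { label: "uk", is_rule: true, children: &[2] },
    Node { label: "co", is_rule: true, children: &[] }, // reached via uk -> co
    Node { label: "com", is_rule: true, children: &[] },
];

/// Walk labels from the TLD inward and return the longest matching
/// suffix as a slice borrowed from `host`.
fn longest_suffix(host: &str) -> Option<&str> {
    let mut node = &NODES[0];
    let mut start = host.len(); // byte offset of the label being examined
    let mut best = None;

    for label in host.rsplit('.') {
        start -= label.len();
        // Follow the child edge for this label, or stop at the first miss.
        match node.children.iter().find(|&&i| NODES[i].label == label) {
            Some(&next) => node = &NODES[next],
            None => break,
        }
        if node.is_rule {
            best = Some(&host[start..]);
        }
        // Step over the dot that separates this label from the next one.
        start = start.saturating_sub(1);
    }
    best
}
```

Because the loop stops at the first label with no matching child, the work in this sketch depends on how deep the matching rule is, not on how many labels the hostname has; labels left of the match are never inspected.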

API

Allocation-free core (always available):

Function                                     Returns
lookup(host) -> Option<Domain>               Borrowing analysis of a pre-normalized host
psl_version() -> &'static str                The bundled PSL version

Domain exposes suffix(), registrable_domain(), subdomain(), is_public_suffix(), typ(), is_icann(), is_private(), is_known(), and as_str(). The string accessors return slices borrowed from the input rather than owned strings.

Allocating convenience (requires alloc, on by default; normalizes any input):

Function                                     Returns
analyze(host) -> Option<Info>                Full analysis with owned normalized form
suffix(host) -> Option<String>               The public suffix
registrable_domain(host) -> Option<String>   The eTLD+1 (cookie domain)
subdomain(host) -> Option<String>            The labels left of the registrable domain
is_public_suffix(host) -> bool               Whether host is itself a public suffix

lookup is the zero-allocation path; prefer it on hot paths when your input is already lowercase ASCII/punycode.

Features

  • std (default) — currently just implies alloc.
  • alloc (default, via std) — the allocating convenience API above.
  • idna (default) — accept Unicode/IDN input via the idna crate (implies alloc). Without it, alloc input must be ASCII/punycode (lowercased for you); non-ASCII returns None.

With no features, only the allocation-free core (lookup, Domain, Type, psl_version) is compiled.
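To make the alloc-without-idna behavior concrete, here is a minimal sketch of an ASCII-only normalization step. The function name normalize_ascii is hypothetical and the crate's real internals may differ; the sketch only mirrors the two documented properties, that ASCII input is lowercased and non-ASCII input yields None.

```rust
/// Hypothetical stand-in for ASCII-only normalization:
/// lowercase ASCII input, reject anything that would need IDNA processing.
fn normalize_ascii(host: &str) -> Option<String> {
    if !host.is_ascii() {
        // Unicode labels would have to be punycode-encoded first.
        return None;
    }
    Some(host.to_ascii_lowercase())
}
```

Under these assumptions, "WWW.Example.CO.UK" normalizes to "www.example.co.uk", an already-encoded name such as "xn--85x722f.xn--55qx5d.cn" passes through unchanged, and "食狮.公司.cn" is rejected.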

ICANN vs. PRIVATE

The list has two sections: ICANN (real registry suffixes) and PRIVATE (suffixes delegated by organizations, e.g. github.io, s3.amazonaws.com, blogspot.com). Both are honored by default, matching browser cookie behavior.

This means some names you might not expect are public suffixes — e.g. registrable_domain("blogspot.com") is None, and the registrable domain of foo.blogspot.com is foo.blogspot.com itself. This is intentional (diverging from the PSL would be worse); use is_icann() / is_private() on Domain or Info to tell the sections apart when it matters.
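The effect of honoring the PRIVATE section can be seen in a small self-contained sketch. Everything here is illustrative: the RULES table is a four-entry toy, Section, suffix_of, and registrable are hypothetical names, and the include_private flag exists only to show the contrast (the text above describes psl2 as honoring both sections by default).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Section {
    Icann,
    Private,
}

// Toy excerpt in the spirit of the list (an assumption, not real data).
const RULES: &[(&str, Section)] = &[
    ("com", Section::Icann),
    ("blogspot.com", Section::Private),
    ("io", Section::Icann),
    ("github.io", Section::Private),
];

/// Longest rule matching `host` on a label boundary,
/// optionally ignoring PRIVATE rules.
fn suffix_of(host: &str, include_private: bool) -> Option<(&'static str, Section)> {
    RULES
        .iter()
        .copied()
        .filter(|&(_, section)| include_private || section == Section::Icann)
        .filter(|&(rule, _)| {
            host.strip_suffix(rule)
                .map_or(false, |rest| rest.is_empty() || rest.ends_with('.'))
        })
        .max_by_key(|&(rule, _)| rule.len())
}

/// Registrable domain: the suffix plus one more label to the left, if any.
fn registrable(host: &str, include_private: bool) -> Option<&str> {
    let (suffix, _) = suffix_of(host, include_private)?;
    if host.len() == suffix.len() {
        return None; // the host is itself a public suffix
    }
    // Strip ".suffix", then keep only the last remaining label.
    let left = &host[..host.len() - suffix.len() - 1];
    let start = left.rfind('.').map_or(0, |i| i + 1);
    Some(&host[start..])
}
```

With private rules included, registrable("blogspot.com", true) is None and registrable("foo.blogspot.com", true) is foo.blogspot.com, matching the behavior described above. With include_private set to false, the same toy table treats blogspot.com as an ordinary registrable domain under com.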

MSRV

Rust 1.86, set by the idna dependency tree (the icu crates). The allocation-free core (default-features = false) builds on much older Rust.

License

The psl2 source code is dual-licensed under either of

at your option.

The bundled Public Suffix List data (src/rules.txt, src/wildcards.txt, src/exceptions.txt, derived from public_suffix_list.dat) is © the Mozilla Foundation and distributed under the Mozilla Public License v2.0. It is included unmodified in substance, only re-encoded for efficient lookup.