psl2
A modern alternative to the psl crate for working with Mozilla's
Public Suffix List.
psl2 tells you, reliably, whether a hostname is a registrable domain (one
that can own cookies — e.g. example.co.uk) or a public suffix (an
"extension" under which names are registered — e.g. co.uk).
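    // The public suffix ("effective TLD"):
    assert_eq!(psl2::suffix("www.example.co.uk").as_deref(), Some("co.uk"));
    // The registrable domain (eTLD + 1), i.e. the cookie domain:
    assert_eq!(psl2::registrable_domain("www.example.co.uk").as_deref(), Some("example.co.uk"));
    // A bare public suffix has no registrable domain:
    assert_eq!(psl2::registrable_domain("co.uk"), None);
    assert!(psl2::is_public_suffix("co.uk"));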
Internationalized domains work out of the box (inputs and outputs are normalized to ASCII/punycode):
    assert_eq!(psl2::registrable_domain("www.münchen.de").as_deref(), Some("xn--mnchen-3ya.de"));
no_std / no_alloc
The crate is #![no_std], and its core lookup allocates nothing. Pass an
already-lowercased ASCII/punycode hostname to lookup and read back borrowed
slices — this works on bare-metal targets with no allocator:
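    let d = psl2::lookup("www.example.co.uk").unwrap();
    assert_eq!(d.suffix(), "co.uk");
    assert_eq!(d.registrable_domain(), Some("example.co.uk"));
    assert_eq!(d.subdomain(), Some("www"));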
For embedded / allocator-free builds, turn the default features off:
    [dependencies]
    psl2 = { version = "0.1", default-features = false }
With no features, you still get a borrowing Domain object via lookup —
plus the compat module's domain/suffix
helpers over &[u8]. This is the same allocator-free, zero-copy capability the
psl crate offers by default (and a lighter compile, since psl2 embeds
the list as a data trie rather than ~100k lines of generated match code), so
the bare core is a near drop-in for psl on no_std targets. The only
alloc-gated items are the ones that must own or normalize: the Info struct
and the String-returning analyze/suffix/… convenience functions.
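To make the borrowing, allocator-free style concrete, here is a minimal sketch of splitting a host into borrowed parts once the suffix length is known. `Parts` and `split_at_suffix` are hypothetical names used only for illustration; they are not psl2's types, and the suffix length is supplied by hand here instead of coming from a list lookup.

```rust
/// Borrowed view of a hostname; every field points into the input string.
/// (Hypothetical type for illustration, not psl2's `Domain`.)
#[derive(Debug, PartialEq)]
struct Parts<'a> {
    subdomain: Option<&'a str>,
    registrable: Option<&'a str>,
    suffix: &'a str,
}

/// Split `host` given the byte length of its public suffix.
/// Uses only slicing, so it needs neither `std` nor an allocator.
fn split_at_suffix(host: &str, suffix_len: usize) -> Option<Parts<'_>> {
    if suffix_len == 0 || suffix_len > host.len() {
        return None;
    }
    let cut = host.len() - suffix_len;
    if !host.is_char_boundary(cut) {
        return None;
    }
    let suffix = &host[cut..];
    if cut == 0 {
        // The host is itself the suffix: nothing is registrable.
        return Some(Parts { subdomain: None, registrable: None, suffix });
    }
    // The suffix must start on a label boundary.
    if host.as_bytes()[cut - 1] != b'.' {
        return None;
    }
    let left = &host[..cut - 1];
    if left.is_empty() {
        return None;
    }
    // The registrable domain is the suffix plus one label to its left.
    let (subdomain, start) = match left.rfind('.') {
        Some(i) => (Some(&host[..i]), i + 1),
        None => (None, 0),
    };
    Some(Parts { subdomain, registrable: Some(&host[start..]), suffix })
}
```

Because every field is a slice of the caller's string, the result cannot outlive the input, which is the trade-off a borrowing API makes in exchange for doing no copying.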
Why another crate?
The existing psl and publicsuffix crates work, but have rough edges. psl2
is designed around a few principles:
- Fast builds. The list is normalized to ASCII at publish time and
  embedded as a flattened reversed-label trie via include_bytes!, walked
  from the TLD inward with no allocation. There is no build.rs, no
  procedural-macro codegen, and no per-build list processing, so adding
  psl2 costs almost nothing in compile time: a few hundred lines of code
  plus an opaque data blob, versus the ~100k lines of generated match code
  the psl crate makes every consumer compile.
- Lookups in the ~20–50 ns range (with the default fast-lookup feature),
  allocation-free and stable across hostname depth. The embedded blobs are
  decoded into native node/edge arrays at compile time, so a lookup reads
  native fields with no byte assembly or endian conversion; performance is
  identical on little- and big-endian targets. fast-lookup adds a small
  per-edge prefix index (~42 KB of rodata) that roughly halves lookup time;
  disable it (default-features = false) to trade that speed back for the
  memory on size-constrained targets. psl2 does not aim to beat psl on raw
  latency (psl compiles the list straight to branch code); it trades some
  speed for far cheaper compiles, built-in IDNA, and a clean API. (See
  compare-psl/ for the head-to-head benchmark.)
- no_std + no_alloc core, usable on embedded targets.
- Built-in IDNA. You pass a &str hostname, Unicode or not, and psl2
  normalizes it for you. No need to punycode-encode input yourself.
- Clean, explicit API over &str, with ICANN / private / unknown
  classification exposed.
- Always current. A scheduled GitHub Action republishes the crate whenever
  the upstream list changes. The bundled list version is available at
  runtime via psl2::psl_version().
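The "reversed-label trie walked from the TLD inward" idea can be sketched in a few lines. This is a toy under stated assumptions: the `Node`/`Edge` layout, the three-rule table, and the function names are all hypothetical and do not describe psl2's actual blob format, and wildcard and exception rules are left out.

```rust
/// A trie node: a window into the shared edge table plus a rule flag.
struct Node {
    first_edge: usize,
    edge_count: usize,
    is_rule: bool,
}

/// A labelled edge pointing at a child node.
struct Edge {
    label: &'static str,
    target: usize,
}

// Toy rule set: "com", "uk", "co.uk". Node 0 is the root.
static NODES: [Node; 4] = [
    Node { first_edge: 0, edge_count: 2, is_rule: false }, // root
    Node { first_edge: 2, edge_count: 0, is_rule: true },  // com
    Node { first_edge: 2, edge_count: 1, is_rule: true },  // uk
    Node { first_edge: 3, edge_count: 0, is_rule: true },  // co.uk
];
static EDGES: [Edge; 3] = [
    Edge { label: "com", target: 1 },
    Edge { label: "uk", target: 2 },
    Edge { label: "co", target: 3 },
];

/// Walk labels right to left and return the longest matching suffix,
/// borrowed from `host`. Unknown TLDs fall back to the last label.
fn toy_public_suffix(host: &str) -> Option<&str> {
    let mut node = 0; // start at the root
    let mut walked = 0; // bytes of `host` covered so far, from the right
    let mut best = 0; // byte length of the longest rule matched so far
    let mut last_label_len = 0;

    for (i, label) in host.rsplit('.').enumerate() {
        // Reject empty labels met during the walk (e.g. a trailing dot).
        if label.is_empty() {
            return None;
        }
        if i == 0 {
            last_label_len = label.len();
        }
        let n = &NODES[node];
        let edges = &EDGES[n.first_edge..n.first_edge + n.edge_count];
        let Some(edge) = edges.iter().find(|e| e.label == label) else {
            break; // no deeper rule can match
        };
        walked += label.len() + usize::from(i > 0); // +1 for the dot
        node = edge.target;
        if NODES[node].is_rule {
            best = walked;
        }
    }
    let len = if best == 0 { last_label_len } else { best };
    Some(&host[host.len() - len..])
}

/// The suffix plus one label to its left, or None for a bare suffix.
fn toy_registrable_domain(host: &str) -> Option<&str> {
    let suffix = toy_public_suffix(host)?;
    if suffix.len() == host.len() {
        return None;
    }
    let left = &host[..host.len() - suffix.len() - 1];
    let start = left.rfind('.').map_or(0, |i| i + 1);
    Some(&host[start..])
}
```

The walk stops at the first label with no matching edge, so its cost depends on how deep the rules go, not on how many labels the hostname has.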
API
Allocation-free core (always available):
| Function | Returns |
|---|---|
| lookup(host) -> Option<Domain> | Borrowing analysis of a pre-normalized host |
| psl_version() -> &'static str | The bundled PSL version |
Domain exposes
suffix(), registrable_domain(), subdomain(), is_public_suffix(),
typ(), is_icann(), is_private(), is_known(), and as_str() — all
returning borrowed &str.
Allocating convenience (requires alloc, on by default; normalizes any input):
| Function | Returns |
|---|---|
| analyze(host) -> Option<Info> | Full analysis with owned normalized form |
| suffix(host) -> Option<String> | The public suffix |
| registrable_domain(host) -> Option<String> | The eTLD+1 (cookie domain) |
| subdomain(host) -> Option<String> | The labels left of the registrable domain |
| is_public_suffix(host) -> bool | Whether host is itself a public suffix |
lookup is the zero-allocation path; prefer it on hot paths when your input is
already lowercase ASCII/punycode.
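One way to decide which path a given input can take is a cheap pre-check. The helper below is a hypothetical sketch, not part of psl2: it only tests for "ASCII with no uppercase letters" and does not validate that the string is a well-formed hostname.

```rust
/// Returns true if `host` is non-empty, pure ASCII, and has no uppercase
/// letters, i.e. it looks already normalized.
/// (Hypothetical helper for illustration; not a psl2 function.)
fn is_prenormalized(host: &str) -> bool {
    !host.is_empty()
        && host
            .bytes()
            .all(|b| b.is_ascii() && !b.is_ascii_uppercase())
}
```

A caller could use the zero-allocation lookup when this returns true and fall back to the normalizing functions otherwise.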
Features
- std (default): currently just implies alloc.
- alloc (default, via std): the allocating convenience API above.
- idna (default): accept Unicode/IDN input via the idna crate (implies
  alloc). Without it, input to the alloc API must be ASCII/punycode
  (lowercased for you); non-ASCII returns None.
- fast-lookup (default): embed a per-edge prefix index (~42 KB of rodata)
  that makes lookups ~2× faster. Disable it when binary size matters more
  than lookup speed; results are identical either way.
With no features, only the allocation-free core (lookup, Domain,
Type, psl_version) is compiled — in its compact, smaller-but-slower form
(add fast-lookup back if you want the speed without alloc/idna).
Migrating from the psl crate
The compat module mirrors the psl crate's shape (operates on &[u8],
returns a Domain that is the registrable domain with a nested Suffix):
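    let d = psl2::compat::domain_str("www.example.co.uk").unwrap();
    assert_eq!(d.as_bytes(), b"example.co.uk");
    assert_eq!(d.suffix().as_bytes(), b"co.uk");
    assert!(d.suffix().is_known());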
| psl | psl2::compat |
|---|---|
| psl::domain_str(s) | psl2::compat::domain_str(s) |
| psl::suffix_str(s) | psl2::compat::suffix_str(s) |
| psl::domain(b) / psl::suffix(b) | psl2::compat::domain(b) / suffix(b) |
Like psl, this path is allocation-free, borrows from its input, and is
case-sensitive (expects lowercased ASCII/punycode). For Unicode/auto-normalized
input, use the main analyze/registrable_domain API.
ICANN vs. PRIVATE
The list has two sections: ICANN (real registry suffixes) and PRIVATE
(suffixes delegated by organizations, e.g. github.io, s3.amazonaws.com,
blogspot.com). Both are honored by default, matching browser cookie behavior.
This means some names you might not expect are public suffixes — e.g.
registrable_domain("blogspot.com") is None, and the registrable domain of
foo.blogspot.com is foo.blogspot.com itself. This is intentional (diverging
from the PSL would be worse); use is_icann() / is_private() on Domain or
Info to tell the sections apart when it matters.
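The effect of honoring PRIVATE rules can be seen with a toy longest-match over a handful of rules. Everything here is an assumption for illustration: the `Section` enum, the four-entry rule table, and the function names are hypothetical and stand in for the real list and for psl2's own types.

```rust
/// Which section of the list a rule comes from (toy stand-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Section {
    Icann,
    Private,
}

// A tiny, hand-written rule table; the real list is far larger.
const RULES: &[(&str, Section)] = &[
    ("com", Section::Icann),
    ("io", Section::Icann),
    ("github.io", Section::Private),
    ("blogspot.com", Section::Private),
];

/// Longest rule that equals `host` or matches it on a label boundary.
fn longest_rule(host: &str) -> Option<(&'static str, Section)> {
    RULES
        .iter()
        .copied()
        .filter(|(rule, _)| {
            host == *rule
                || host
                    .strip_suffix(*rule)
                    .map_or(false, |rest| rest.ends_with('.'))
        })
        .max_by_key(|(rule, _)| rule.len())
}

/// Suffix plus one label, or None when `host` is itself a suffix.
fn toy_registrable(host: &str) -> Option<&str> {
    let (rule, _) = longest_rule(host)?;
    if rule.len() == host.len() {
        return None;
    }
    let left = &host[..host.len() - rule.len() - 1];
    let start = left.rfind('.').map_or(0, |i| i + 1);
    Some(&host[start..])
}
```

With both sections honored, the PRIVATE rule wins for foo.blogspot.com because it is the longer match, and the section tag is what lets a caller tell that case apart from an ICANN suffix.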
MSRV
Rust 1.86, set by the idna dependency tree (the icu crates). The
allocation-free core (default-features = false) builds on much older Rust.
License
The psl2 source code is dual-licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.
The bundled Public Suffix List data (src/rules.txt, src/wildcards.txt,
src/exceptions.txt, derived from public_suffix_list.dat) is © the Mozilla
Foundation and distributed under the Mozilla Public License v2.0. It is
included unmodified in substance, only re-encoded for efficient lookup.