PublicSuffix2
A native Rust library for parsing and using Mozilla's Public Suffix List.
This library is a fork of rushmorem/publicsuffix and aims to reach feature parity with the popular Python library python-publicsuffix2.
The Public Suffix List is a collection of all TLDs (Top-Level Domains) and other domains under which Internet users can directly register names. This library allows you to determine the "public suffix" part of a domain name.
Features
- Find the Public Suffix (TLD/eTLD): Extract the effective Top-Level Domain from any hostname.
- Find the Second-Level Domain (SLD/eTLD + 1): Extract the main, registrable part of a domain.
- Configurable Normalization: Control over lowercasing, handling of trailing dots, and IDNA (Punycode) conversion.
- ICANN and Private Rules: Filter matches to include only ICANN-managed TLDs or also include privately-managed domains (e.g., github.io).
- Wildcard and Exception Rule Support: Correctly handles complex rules like *.ck and !www.ck.
- IDN and Punycode: Works seamlessly with both Unicode (e.g., 食狮.中国) and Punycode (xn--fiqs8s.xn--fiq228c) domain names.
- High Performance: Uses a trie data structure for fast lookups.
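To make the wildcard and exception rules concrete, here is a minimal, std-only sketch of the matching procedure described by the Public Suffix List format. It is not this crate's implementation (which uses a trie). The functions `demo_rules`, `public_suffix`, and `registrable_domain` and the five-rule set are hypothetical and exist only for illustration. Hostnames are assumed to be lowercased ASCII without a trailing dot.

```rust
use std::collections::HashSet;

/// Hypothetical five-rule list; the real Public Suffix List has thousands of rules.
fn demo_rules() -> HashSet<&'static str> {
    ["com", "uk", "co.uk", "*.ck", "!www.ck"].iter().copied().collect()
}

/// Joins the labels from index `start` to the end back into a dotted name.
fn join_from(labels: &[&str], start: usize) -> String {
    labels[start..].join(".")
}

/// Returns the public suffix of `host` under `rules` (illustrative sketch only).
fn public_suffix(host: &str, rules: &HashSet<&str>) -> String {
    let labels: Vec<&str> = host.split('.').collect();

    // 1. Exception rules ("!www.ck") take priority over every other rule.
    //    The public suffix is the exception rule minus its leftmost label.
    for start in 0..labels.len() {
        let exception = format!("!{}", join_from(&labels, start));
        if rules.contains(exception.as_str()) {
            return join_from(&labels, start + 1);
        }
    }

    // 2. Otherwise the longest matching rule wins, so try the longest
    //    candidate suffix first. A wildcard rule "*.ck" matches any single
    //    label placed in front of "ck".
    for start in 0..labels.len() {
        let candidate = join_from(&labels, start);
        if rules.contains(candidate.as_str()) {
            return candidate;
        }
        if start + 1 < labels.len() {
            let wildcard = format!("*.{}", join_from(&labels, start + 1));
            if rules.contains(wildcard.as_str()) {
                return candidate;
            }
        }
    }

    // 3. No rule matched: fall back to the implicit "*" rule (the last label).
    labels.last().map(|label| label.to_string()).unwrap_or_default()
}

/// Returns the public suffix plus one more label (eTLD + 1), if the host has one.
fn registrable_domain(host: &str, rules: &HashSet<&str>) -> Option<String> {
    let suffix_len = public_suffix(host, rules).split('.').count();
    let labels: Vec<&str> = host.split('.').collect();
    if labels.len() <= suffix_len {
        return None; // the host is itself a public suffix in this sketch
    }
    Some(join_from(&labels, labels.len() - suffix_len - 1))
}

fn main() {
    let rules = demo_rules();
    for host in ["www.example.co.uk", "foo.bar.ck", "www.ck"] {
        println!(
            "{host}: suffix = {}, registrable = {:?}",
            public_suffix(host, &rules),
            registrable_domain(host, &rules)
        );
    }
}
```

In this sketch a bare suffix such as co.uk has no registrable domain. That is a choice made for the sketch, not a statement about how the library behaves.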
Installation
Add this to your Cargo.toml:
[dependencies]
publicsuffix2 = "0.5.2"
To fetch the list from a URL, enable the fetch feature:
[dependencies]
publicsuffix2 = { version = "0.5.2", features = ["fetch"] }
Usage
Getting Started
The easiest way to use the library is to create a List using List::default(). This uses a built-in copy of the Public Suffix List, so no file loading is required.
use ;
// Create a list from the built-in PSL data.
// It's recommended to do this once and reuse the `List` object.
let list = List::default();
let domain = "www.example.co.uk";
// Get the public suffix, also known as the TLD or eTLD.
let tld = list.tld;
assert_eq!;
// Get the registrable domain (the part you can register).
let sld = list.sld;
assert_eq!;
// `tld` and `sld` also work on hostnames that are already a public suffix.
let domain2 = "co.uk";
let tld2 = list.tld;
assert_eq!;
let sld2 = list.sld;
assert_eq!;
Splitting a Domain
The split method deconstructs a domain into all its parts.
use ;
let list = List::default();
let parts = list.split.unwrap;
assert_eq!;
assert_eq!;
assert_eq!;
assert_eq!;
Loading a Custom List
If you need to use a custom or updated Public Suffix List, you can create a List instance from a string, file, or URL.
use ;
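For orientation when preparing a custom list, the Public Suffix List file is plain text: one rule per line, `//` comment lines, and marker comments that delimit the ICANN and private sections. The following std-only sketch shows how such text can be split into the two rule groups. It does not use this crate's loading API; `ParsedRules`, `parse_psl`, and `SAMPLE` are hypothetical names chosen for the example.

```rust
/// Hypothetical container for the two sections of a list file.
#[derive(Debug, Default, PartialEq)]
struct ParsedRules {
    icann: Vec<String>,
    private: Vec<String>,
}

/// A tiny hand-written excerpt in the Public Suffix List file format.
const SAMPLE: &str = "\
// ===BEGIN ICANN DOMAINS===
// Comment lines and blank lines are skipped.
com
co.uk

*.ck
!www.ck
// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
github.io
// ===END PRIVATE DOMAINS===
";

/// Splits list text into ICANN and private rules (illustrative sketch only).
/// Simplification: any rule outside the private section is counted as ICANN.
fn parse_psl(text: &str) -> ParsedRules {
    let mut parsed = ParsedRules::default();
    let mut in_private = false;

    for line in text.lines() {
        let line = line.trim();

        // Section markers are themselves comment lines, so check them first.
        if line.contains("===BEGIN PRIVATE DOMAINS===") {
            in_private = true;
            continue;
        }
        if line.contains("===END PRIVATE DOMAINS===") {
            in_private = false;
            continue;
        }
        if line.is_empty() || line.starts_with("//") {
            continue;
        }

        // A rule is the text up to the first whitespace on the line.
        let rule = match line.split_whitespace().next() {
            Some(rule) => rule.to_lowercase(),
            None => continue,
        };
        if in_private {
            parsed.private.push(rule);
        } else {
            parsed.icann.push(rule);
        }
    }
    parsed
}

fn main() {
    let parsed = parse_psl(SAMPLE);
    println!("ICANN rules:   {:?}", parsed.icann);
    println!("private rules: {:?}", parsed.private);
}
```

Keeping the two groups separate is what makes an ICANN-only match possible: the private rules can simply be left out of the lookup.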
Advanced Options
You can customize matching behavior using MatchOpts and Normalizer.
use ;
let list = List::default();
// --- Example 1: Handling trailing dots ---
let norm_strip_dot = Normalizer ;
let opts_strip_dot = MatchOpts ;
let sld1 = list.sld;
assert_eq!;
// --- Example 2: Filtering for ICANN rules only ---
// By default, private domains like `blogspot.com` are treated as TLDs.
let sld_default = list.sld;
assert_eq!;
// You can filter to only use ICANN section rules.
let opts_icann_only = MatchOpts ;
let sld_icann = list.sld;
assert_eq!;
// --- Example 3: Handling Internationalized Domain Names (IDN) ---
// The default normalizer converts to ASCII Punycode.
let sld_punycode = list.sld;
assert_eq!;
// You can disable IDNA conversion to keep Unicode characters.
let norm_no_idna = Normalizer ;
let opts_no_idna = MatchOpts ;
let sld_unicode = list.sld;
assert_eq!;
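As a rough illustration of what hostname normalization involves before a lookup, the std-only sketch below applies two of the steps mentioned above: lowercasing and trailing-dot stripping. `DemoNormalizer` and its fields are hypothetical and are not the crate's `Normalizer`. IDNA (Punycode) conversion is left out because it needs more than the standard library.

```rust
/// Hypothetical stand-in used only for this example; not the crate's `Normalizer`.
struct DemoNormalizer {
    lowercase: bool,
    strip_trailing_dot: bool,
}

impl DemoNormalizer {
    /// Applies the enabled steps to `host` and returns the normalized name.
    fn normalize(&self, host: &str) -> String {
        let mut out = host.trim();
        if self.strip_trailing_dot {
            // "example.com." (fully qualified form) becomes "example.com".
            out = out.strip_suffix('.').unwrap_or(out);
        }
        if self.lowercase {
            out.to_lowercase()
        } else {
            out.to_string()
        }
    }
}

fn main() {
    let strict = DemoNormalizer { lowercase: true, strip_trailing_dot: true };
    let keep_dot = DemoNormalizer { lowercase: true, strip_trailing_dot: false };

    assert_eq!(strict.normalize("WWW.Example.COM."), "www.example.com");
    assert_eq!(keep_dot.normalize("WWW.Example.COM."), "www.example.com.");
    println!("normalization sketch ok");
}
```

Normalizing before matching means that differently written forms of the same hostname reach the rule lookup in one canonical form.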
License
This project is licensed under the MIT License and the Apache License. See the LICENSE and LICENSE-APACHE files for details.