url-normalize
Normalize a URL — faithful Rust port of sindresorhus/normalize-url v9.0.0.
Useful when you need to display, store, deduplicate, sort, compare, etc., URLs.
Usage
use ;
// Basic usage with default options
let result = normalize_url.unwrap;
assert_eq!;
// Custom options
let opts = Options ;
let result = normalize_url.unwrap;
assert_eq!;
Add to your Cargo.toml:
[dependencies]
url-normalize = "0.1"
What it does
- Lowercases the protocol and hostname
- Removes default ports (80 for HTTP, 443 for HTTPS)
- Resolves relative paths (`/../`, `/./`)
- Removes duplicate slashes in paths
- Removes `www.` from the hostname
- Removes trailing slashes
- Removes URL fragments (optional)
- Removes known tracking parameters (e.g., `utm_*`)
- Sorts query parameters alphabetically
- Decodes unnecessarily encoded URI octets
- Encodes query values like `URLSearchParams`
- Handles international domain names (IDNA/Punycode)
- Handles data URLs
- Handles custom protocol schemes
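To make two of the steps above concrete, here is a minimal sketch of path cleanup and default-port removal. The helper names `clean_path` and `strip_default_port` are hypothetical and written only for illustration; they are not this crate's API, and the crate's own implementation may treat edge cases differently.

```rust
/// Collapse duplicate slashes and resolve `.` / `..` segments in a URL path.
/// Hypothetical helper for illustration; not part of the crate's API.
fn clean_path(path: &str) -> String {
    let mut segments: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            // Empty segments come from duplicate (or trailing) slashes;
            // "." means "stay in the current directory".
            "" | "." => {}
            // ".." drops the previous segment, if there is one.
            ".." => {
                segments.pop();
            }
            other => segments.push(other),
        }
    }
    format!("/{}", segments.join("/"))
}

/// Drop the port when it is the default for the scheme
/// (80 for http, 443 for https). Assumes a lowercase scheme.
fn strip_default_port<'a>(scheme: &str, authority: &'a str) -> &'a str {
    let default = match scheme {
        "http" => ":80",
        "https" => ":443",
        _ => return authority,
    };
    authority.strip_suffix(default).unwrap_or(authority)
}
```

In this sketch, `clean_path("/a/./b/../c//d")` yields `/a/c/d`, and a trailing slash is dropped as a side effect of skipping empty segments. `strip_default_port("http", "example.com:8080")` leaves the authority untouched because 8080 is not the default port.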
Options
| Option | Type | Default | Description |
|---|---|---|---|
| `default_protocol` | `Protocol` | `Http` | Protocol to use for protocol-relative URLs |
| `custom_protocols` | `Vec<String>` | `[]` | Additional protocols to normalize |
| `normalize_protocol` | `bool` | `true` | Prepend default protocol to `//` URLs |
| `force_http` | `bool` | `false` | Convert HTTPS → HTTP |
| `force_https` | `bool` | `false` | Convert HTTP → HTTPS |
| `strip_authentication` | `bool` | `true` | Remove `user:password@` |
| `strip_hash` | `bool` | `false` | Remove `#fragment` |
| `strip_protocol` | `bool` | `false` | Remove `https://` prefix |
| `strip_text_fragment` | `bool` | `true` | Remove `#:~:text=` fragments |
| `strip_www` | `bool` | `true` | Remove `www.` from hostname |
| `remove_query_parameters` | `RemoveQueryParameters` | UTM filter | Remove matching query params |
| `keep_query_parameters` | `Option<Vec<QueryFilter>>` | `None` | Keep only matching query params |
| `remove_trailing_slash` | `bool` | `true` | Remove trailing `/` from path |
| `remove_single_slash` | `bool` | `true` | Remove sole `/` path |
| `remove_directory_index` | `RemoveDirectoryIndex` | `None` | Remove `index.html` etc. |
| `remove_explicit_port` | `bool` | `false` | Remove all port numbers |
| `sort_query_parameters` | `bool` | `true` | Sort query params by key |
| `empty_query_value` | `EmptyQueryValue` | `Preserve` | How to handle `?key` vs `?key=` |
| `remove_path` | `bool` | `false` | Remove the entire URL path |
| `transform_path` | `Option<Box<dyn Fn(...)>>` | `None` | Custom path transformation |
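As a rough illustration of what `strip_authentication` and `strip_www` describe, the following sketch operates on a bare authority or hostname string. Both function names are hypothetical and the matching rules are simplified assumptions; the crate may apply stricter hostname checks.

```rust
/// Remove `user:password@` from an authority such as `user:pw@example.com`.
/// Hypothetical helper for illustration; not part of the crate's API.
fn strip_userinfo(authority: &str) -> &str {
    // The userinfo ends at the last '@'; what follows is host[:port].
    match authority.rsplit_once('@') {
        Some((_, host)) => host,
        None => authority,
    }
}

/// Remove one leading `www.` label, but only if a dotted name remains,
/// so that a host like `www.com` is left alone (a simplifying assumption).
fn strip_www(host: &str) -> &str {
    match host.strip_prefix("www.") {
        Some(rest) if rest.contains('.') => rest,
        _ => host,
    }
}
```

Both helpers return a slice of the input, so no allocation is needed. Under these assumptions, `strip_www("www.example.com")` gives `example.com`, while `strip_www("www.com")` is returned unchanged.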
Examples
Remove tracking parameters
use ;
let url = "https://example.com/page?utm_source=google&utm_medium=cpc&id=123";
let result = normalize_url.unwrap;
assert_eq!;
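For intuition about what tracking-parameter removal combined with key sorting involves, here is a standalone sketch that works on a raw query string. `filter_and_sort_query` is a hypothetical name, the `utm_` prefix check stands in for whatever filter the crate really uses by default, and no percent-decoding or re-encoding is attempted.

```rust
/// Remove `utm_*` parameters from a raw query string and sort the rest by key.
/// Hypothetical helper for illustration; not part of the crate's API.
fn filter_and_sort_query(query: &str) -> String {
    let mut pairs: Vec<(&str, &str)> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        // A bare `key` is treated like `key=` here; the crate exposes
        // `empty_query_value` to control that distinction.
        .map(|pair| pair.split_once('=').unwrap_or((pair, "")))
        .filter(|(key, _)| !key.starts_with("utm_"))
        .collect();
    // `sort_by_key` is stable, so repeated keys keep their relative order.
    pairs.sort_by_key(|&(key, _)| key);
    pairs
        .iter()
        .map(|(key, value)| format!("{key}={value}"))
        .collect::<Vec<_>>()
        .join("&")
}
```

Applied to the query string of the URL above, `utm_source=google&utm_medium=cpc&id=123`, this sketch returns `id=123`.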
Remove all query parameters
use ;
let opts = Options ;
let result = normalize_url.unwrap;
assert_eq!;
Custom protocols
use ;
let opts = Options ;
let result = normalize_url.unwrap;
assert_eq!;
Directory index removal
use ;
let opts = Options ;
let result = normalize_url.unwrap;
assert_eq!;
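The idea behind directory index removal can be sketched as follows, assuming the rule is "drop a final path component of the form `index.<lowercase extension>`". That rule and both function names are assumptions made for this example; the patterns accepted by `RemoveDirectoryIndex` may be broader or configurable.

```rust
/// Return true for names like `index.html` or `index.php`:
/// the literal prefix `index.` followed by lowercase ASCII letters only.
fn is_index_file(name: &str) -> bool {
    match name.strip_prefix("index.") {
        Some(ext) => !ext.is_empty() && ext.bytes().all(|b| b.is_ascii_lowercase()),
        None => false,
    }
}

/// Remove a trailing directory index file from a path, keeping the
/// directory's trailing slash. Hypothetical helper; not the crate's API.
fn strip_directory_index(path: &str) -> &str {
    match path.rsplit_once('/') {
        // `dir` excludes the final '/', so add 1 to keep it in the slice.
        Some((dir, last)) if is_index_file(last) => &path[..dir.len() + 1],
        _ => path,
    }
}
```

With this sketch, `/foo/index.html` becomes `/foo/`, and a name such as `index.html.bak` is left alone because its extension contains a dot. A separate trailing-slash step would be needed to reduce `/foo/` further.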
Dependencies
Minimal by design:
`idna`: International domain name encoding (IDNA/Punycode)
Everything else (URL parsing, percent-encoding, pattern matching) is hand-rolled.
Testing
The implementation is verified against the actual JavaScript test suite from sindresorhus/normalize-url v9.0.0:
# Rust tests (175 tests)
# Run the actual JS test.js against our implementation
&& &&
Attribution
This is a Rust port of normalize-url by Sindre Sorhus, licensed under the MIT License.
License
MIT