Expand description
SXURL - Fixed-length, sliceable URL identifier system
“Sixerl” is a fixed-length, sliceable URL identifier system for efficient database storage and querying.
This crate provides a way to convert URLs into fixed 256-bit identifiers where each URL component occupies a fixed position in the resulting hex string.
§Features
- Fixed-length: All SXURL identifiers are exactly 256 bits (64 hex characters)
- Sliceable: Each URL component has a fixed position for substring filtering
- Deterministic: Same input always produces the same output
- Collision-resistant: Uses SHA-256 hashing for component fingerprinting
- Standards-compliant: Supports IDNA, Public Suffix List, and standard URL schemes
§Quick Start
use sxurl::{encode_url_to_hex, decode_hex, matches_component, split_url, parse_query};
// Encode a URL to SXURL
let sxurl_hex = encode_url_to_hex("https://docs.rs/sxurl")?;
println!("SXURL: {}", sxurl_hex); // 64 hex characters
// Decode and inspect
let decoded = decode_hex(&sxurl_hex)?;
println!("Scheme: {}", decoded.header.scheme);
// Filter by component
let is_rs_tld = matches_component(&sxurl_hex, "tld", "rs")?;
assert!(is_rs_tld);
// Parse URL into components
let parts = split_url("https://api.github.com/repos?page=1#readme")?;
println!("Domain: {}, Subdomain: {:?}", parts.domain, parts.subdomain);
println!("Anchor: {:?}", parts.anchor);
// Work with query parameters
let params = parse_query("https://example.com?foo=bar&page=2")?;
println!("Page: {:?}", params.get("page"));§SXURL Format
The 256-bit SXURL has this fixed layout:
| Component | Hex Range | Bits | Description |
|---|---|---|---|
| header | [0..3) | 12 | Version, scheme, flags |
| tld_hash | [3..7) | 16 | Top-level domain hash |
| domain_hash | [7..22) | 60 | Domain name hash |
| sub_hash | [22..30) | 32 | Subdomain hash |
| port | [30..34) | 16 | Port number |
| path_hash | [34..49) | 60 | Path hash |
| params_hash | [49..58) | 36 | Query parameters hash |
| frag_hash | [58..64) | 24 | Fragment hash |
§Supported URL Schemes
https(scheme code 0)http(scheme code 1)ftp(scheme code 2)
§Error Handling
All functions return Result<T, SxurlError>. Common error cases:
- Unsupported URL schemes (only https, http, ftp supported)
- Invalid hostnames or IP addresses
- Malformed URLs or SXURL hex strings
Re-exports§
pub use core::encode_url;pub use core::encode_url_to_hex;pub use core::encode_host_to_hex;pub use core::extract_host_from_full_sxurl;pub use core::extract_host_from_full_bytes;pub use core::SxurlEncoder;pub use core::decode_hex;pub use core::decode_bytes;pub use core::matches_component;pub use core::DecodedSxurl;pub use url::split_url;pub use url::split_domain;pub use url::get_path_segments;pub use url::get_filename;pub use url::parse_query;pub use url::get_query_value;pub use url::get_anchor;pub use url::strip_anchor;pub use url::join_url_path;pub use url::is_https;pub use url::has_query;pub use url::has_anchor;pub use url::UrlParts;pub use url::get_url_component;pub use url::strip_url_component;pub use error::SxurlError;pub use types::SxurlHeader;pub use types::UrlComponents;pub use types::UrlComponentType;pub use core::hash_component;pub use core::ComponentHasher;pub use core::extract_lower_bits;pub use url::normalize_url;pub use url::normalize_host;pub use url::validate_host;pub use url::split_host_with_psl;pub use url::extract_url_components;pub use core::pack_sxurl;pub use core::sxurl_to_hex;pub use core::hex_to_sxurl;pub use core::sxurl_contains;pub use core::sxurl_has_domain;pub use core::sxurl_has_subdomain;pub use core::sxurl_has_tld;