SXURL - Fixed-length, sliceable URL identifier system
SXURL (pronounced “Sixerl”) is a fixed-length, sliceable URL identifier system for efficient database storage and querying.
This crate provides a way to convert URLs into fixed 256-bit identifiers where each URL component occupies a fixed position in the resulting hex string.
§Features
- Fixed-length: All SXURL identifiers are exactly 256 bits (64 hex characters)
- Sliceable: Each URL component has a fixed position for substring filtering
- Deterministic: Same input always produces the same output
- Collision-resistant: Uses SHA-256 hashing for component fingerprinting
- Standards-compliant: Supports IDNA, Public Suffix List, and standard URL schemes
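To make the "sliceable" property concrete, here is a minimal sketch of substring filtering over stored identifiers. It uses only the standard library, not this crate's API. The names `TLD_HASH`, `filter_same_tld`, and `fake_id` are hypothetical, the range `[3..7)` is taken from the format table in this documentation, and the IDs are synthetic placeholders rather than encodings of real URLs.

```rust
use std::ops::Range;

/// Hex range of the TLD hash, taken from the SXURL format table.
const TLD_HASH: Range<usize> = 3..7;

/// Keeps the IDs whose TLD-hash slice equals that of `reference`.
/// Returns an empty list if `reference` is too short to slice.
fn filter_same_tld<'a>(ids: &'a [String], reference: &str) -> Vec<&'a str> {
    let Some(wanted) = reference.get(TLD_HASH) else {
        return Vec::new();
    };
    ids.iter()
        .map(String::as_str)
        .filter(|id| id.get(TLD_HASH) == Some(wanted))
        .collect()
}

/// Builds a synthetic 64-character ID with a chosen TLD-hash slice
/// (hypothetical helper, for illustration only).
fn fake_id(tld_slice: &str, fill: char) -> String {
    let mut id = fill.to_string().repeat(3);
    id.push_str(tld_slice);
    id.push_str(&fill.to_string().repeat(57));
    id
}

fn main() {
    let ids = vec![fake_id("a1b2", '0'), fake_id("ffff", '1'), fake_id("a1b2", '2')];
    let hits = filter_same_tld(&ids, &ids[0]);
    println!("{} of {} IDs share the reference TLD-hash slice", hits.len(), ids.len());
}
```

Because the slice holds only 16 bits, equal slices should be treated as candidates for the same TLD rather than proof of it. In a database, the same idea would be a substring comparison on the stored hex column.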
§Quick Start
use sxurl::{encode_url_to_hex, decode_hex, matches_component, split_url, parse_query};
// Encode a URL to SXURL
let sxurl_hex = encode_url_to_hex("https://docs.rs/sxurl")?;
println!("SXURL: {}", sxurl_hex); // 64 hex characters
// Decode and inspect
let decoded = decode_hex(&sxurl_hex)?;
println!("Scheme: {}", decoded.header.scheme);
// Filter by component
let is_rs_tld = matches_component(&sxurl_hex, "tld", "rs")?;
assert!(is_rs_tld);
// Parse URL into components
let parts = split_url("https://api.github.com/repos?page=1#readme")?;
println!("Domain: {}, Subdomain: {:?}", parts.domain, parts.subdomain);
println!("Anchor: {:?}", parts.anchor);
// Work with query parameters
let params = parse_query("https://example.com?foo=bar&page=2")?;
println!("Page: {:?}", params.get("page"));
§SXURL Format
The 256-bit SXURL has this fixed layout:
Component | Hex Range | Bits | Description |
---|---|---|---|
header | [0..3) | 12 | Version, scheme, flags |
tld_hash | [3..7) | 16 | Top-level domain hash |
domain_hash | [7..22) | 60 | Domain name hash |
sub_hash | [22..30) | 32 | Subdomain hash |
port | [30..34) | 16 | Port number |
path_hash | [34..49) | 60 | Path hash |
params_hash | [49..58) | 36 | Query parameters hash |
frag_hash | [58..64) | 24 | Fragment hash |
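The layout above can be sketched as a lookup table that slices a 64-character hex string by component name. This is an illustrative, standard-library-only example and does not use the crate's own functions. `LAYOUT`, `slice_component`, `layout_is_contiguous`, and `sample_id` are hypothetical names, and the sample ID is a synthetic placeholder, not the encoding of a real URL.

```rust
/// (name, start, end, bits): half-open hex ranges copied from the layout table.
const LAYOUT: [(&str, usize, usize, u32); 8] = [
    ("header", 0, 3, 12),
    ("tld_hash", 3, 7, 16),
    ("domain_hash", 7, 22, 60),
    ("sub_hash", 22, 30, 32),
    ("port", 30, 34, 16),
    ("path_hash", 34, 49, 60),
    ("params_hash", 49, 58, 36),
    ("frag_hash", 58, 64, 24),
];

/// Returns the hex slice for `name`, or `None` if the name is unknown or the
/// input is not a 64-character ASCII hex string.
fn slice_component<'a>(sxurl_hex: &'a str, name: &str) -> Option<&'a str> {
    if sxurl_hex.len() != 64 || !sxurl_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    LAYOUT
        .iter()
        .find(|(n, _, _, _)| *n == name)
        .map(|&(_, start, end, _)| &sxurl_hex[start..end])
}

/// Checks that the ranges tile the string without gaps, that each field is
/// 4 bits per hex character, and that the total is 64 characters / 256 bits.
fn layout_is_contiguous() -> bool {
    let mut next = 0usize;
    let mut bits = 0u32;
    for &(_, start, end, width) in LAYOUT.iter() {
        if start != next || (end - start) as u32 * 4 != width {
            return false;
        }
        next = end;
        bits += width;
    }
    next == 64 && bits == 256
}

/// Synthetic placeholder ID with a distinct filler for each field.
fn sample_id() -> String {
    format!(
        "{}{}{}{}{}{}{}{}",
        "100",
        "aaaa",
        "b".repeat(15),
        "c".repeat(8),
        "01bb",
        "d".repeat(15),
        "e".repeat(9),
        "f".repeat(6)
    )
}

fn main() {
    assert!(layout_is_contiguous());
    let id = sample_id();
    for (name, _, _, _) in LAYOUT.iter() {
        println!("{:<12} {}", name, slice_component(&id, name).unwrap());
    }
}
```

Keeping the ranges in one table means the slicing code and any database substring filters can share a single definition of the layout.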
§Supported URL Schemes
- https (scheme code 0)
- http (scheme code 1)
- ftp (scheme code 2)
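As a rough illustration of the scheme-to-code mapping listed here, the sketch below models it with a small enum. The `Scheme` type and its methods are hypothetical and standalone. The crate's own header types may represent schemes differently, and the case-insensitive matching is an assumption of this sketch.

```rust
/// Hypothetical stand-in for the three supported schemes and their codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Scheme {
    Https = 0,
    Http = 1,
    Ftp = 2,
}

impl Scheme {
    /// Maps a scheme name to a variant; matching is case-insensitive here
    /// (an assumption of this sketch). Unsupported schemes yield `None`.
    fn from_name(name: &str) -> Option<Self> {
        match name.to_ascii_lowercase().as_str() {
            "https" => Some(Scheme::Https),
            "http" => Some(Scheme::Http),
            "ftp" => Some(Scheme::Ftp),
            _ => None,
        }
    }

    /// Maps a numeric scheme code back to a variant.
    fn from_code(code: u8) -> Option<Self> {
        match code {
            0 => Some(Scheme::Https),
            1 => Some(Scheme::Http),
            2 => Some(Scheme::Ftp),
            _ => None,
        }
    }

    /// Returns the numeric scheme code.
    fn code(self) -> u8 {
        self as u8
    }
}

fn main() {
    for name in ["https", "http", "ftp", "mailto"] {
        match Scheme::from_name(name) {
            Some(s) => println!("{name} -> code {}", s.code()),
            None => println!("{name} -> unsupported"),
        }
    }
}
```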
§Error Handling
All functions return Result<T, SxurlError>. Common error cases:
- Unsupported URL schemes (only https, http, ftp supported)
- Invalid hostnames or IP addresses
- Malformed URLs or SXURL hex strings
§Re-exports
pub use core::encode_url;
pub use core::encode_url_to_hex;
pub use core::encode_host_to_hex;
pub use core::extract_host_from_full_sxurl;
pub use core::extract_host_from_full_bytes;
pub use core::SxurlEncoder;
pub use core::decode_hex;
pub use core::decode_bytes;
pub use core::matches_component;
pub use core::DecodedSxurl;
pub use url::split_url;
pub use url::split_domain;
pub use url::get_path_segments;
pub use url::get_filename;
pub use url::parse_query;
pub use url::get_query_value;
pub use url::get_anchor;
pub use url::strip_anchor;
pub use url::join_url_path;
pub use url::is_https;
pub use url::has_query;
pub use url::has_anchor;
pub use url::UrlParts;
pub use url::get_url_component;
pub use url::strip_url_component;
pub use error::SxurlError;
pub use types::SxurlHeader;
pub use types::UrlComponents;
pub use types::UrlComponentType;
pub use core::hash_component;
pub use core::ComponentHasher;
pub use core::extract_lower_bits;
pub use url::normalize_url;
pub use url::normalize_host;
pub use url::validate_host;
pub use url::split_host_with_psl;
pub use url::extract_url_components;
pub use core::pack_sxurl;
pub use core::sxurl_to_hex;
pub use core::hex_to_sxurl;
pub use core::sxurl_contains;
pub use core::sxurl_has_domain;
pub use core::sxurl_has_subdomain;
pub use core::sxurl_has_tld;