This crate provides an implementation of Uniform Resource Identifiers (URIs, aka URLs) and Internationalized Resource Identifiers (IRIs) following RFC 3987 and RFC 3986, defined by the Internet Engineering Task Force (IETF) to uniquely identify objects across the web. IRIs are a superset of URIs that accepts international characters defined in the Unicode table.
URI/IRIs are defined as a sequence of characters with distinguishable components: a scheme, an authority, a path, a query and a fragment.
foo://example.com:8042/over/there?name=ferret#nose
\_/   \______________/\_________/ \_________/ \__/
 |           |            |            |        |
scheme   authority       path        query   fragment
This crate provides types to represent borrowed and owned URIs and IRIs
(Uri, Iri, UriBuf, IriBuf), borrowed and owned URI and IRI
references (UriRef, IriRef, UriRefBuf, IriRefBuf), and similar
types for every part of a URI/IRI. These allow easy access to and
manipulation of every component.
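To make the component layout concrete, here is a minimal, non-validating sketch of how a reference can be split into its five components using only the standard library. It follows the general-purpose splitting rule of RFC 3986 (Appendix B) and is not this crate's parser; the names `Parts` and `split_components` are hypothetical and exist only for illustration.

```rust
/// The five top-level components, borrowed from the input string.
/// (Hypothetical type, not part of this crate.)
#[derive(Debug, PartialEq)]
struct Parts<'a> {
    scheme: Option<&'a str>,
    authority: Option<&'a str>,
    path: &'a str,
    query: Option<&'a str>,
    fragment: Option<&'a str>,
}

/// Splits `input` into components without validating any character.
fn split_components(input: &str) -> Parts<'_> {
    // The fragment is everything after the first '#'.
    let (rest, fragment) = match input.split_once('#') {
        Some((r, f)) => (r, Some(f)),
        None => (input, None),
    };
    // The query is everything after the first '?' that precedes the fragment.
    let (rest, query) = match rest.split_once('?') {
        Some((r, q)) => (r, Some(q)),
        None => (rest, None),
    };
    // The scheme is a non-empty prefix ending at the first ':' and containing no '/'.
    let (scheme, rest) = match rest.find(':') {
        Some(i) if i > 0 && !rest[..i].contains('/') => (Some(&rest[..i]), &rest[i + 1..]),
        _ => (None, rest),
    };
    // The authority follows "//" and runs up to the next '/'.
    let (authority, path) = match rest.strip_prefix("//") {
        Some(after) => {
            let end = after.find('/').unwrap_or(after.len());
            (Some(&after[..end]), &after[end..])
        }
        None => (None, rest),
    };
    Parts { scheme, authority, path, query, fragment }
}

fn main() {
    let parts = split_components("foo://example.com:8042/over/there?name=ferret#nose");
    println!("{:?}", parts);
}
```

Because every field borrows from the input, this split allocates nothing, which mirrors the borrowed/owned distinction between types such as Iri and IriBuf.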
It features:
- borrowed and owned URI/IRIs and URI/IRI-reference;
- mutable URI/IRI buffers (in-place);
- path normalization;
- comparison modulo normalization;
- URI/IRI-reference resolution;
- static URI/IRI parsing using the uri!/iri! macros;
- serde support (by enabling the serde feature);
- no_std support (by disabling the default std feature).
§Basic usage
You can parse an IRI string slice by calling Iri::new.
This function allocates no memory: it only borrows the input
data and validates it. Access to each component is done in linear time.
use iref::Iri;
let iri = Iri::new("https://www.rust-lang.org/foo/bar?query#frag")?;
println!("scheme: {}", iri.scheme());
println!("authority: {}", iri.authority().unwrap());
println!("path: {}", iri.path());
println!("query: {}", iri.query().unwrap());
println!("fragment: {}", iri.fragment().unwrap());
IRIs can be created and modified using the IriBuf type.
With this type, the IRI is held in a single buffer, modified in-place to
reduce memory allocations and optimize memory accesses.
This also allows the conversion from IriBuf into Iri.
use iref::IriBuf;
let mut iri = IriBuf::new("https://www.rust-lang.org".to_string())?;
iri.authority_mut().unwrap().set_port(Some("40".try_into()?));
iri.set_path("/foo".try_into()?);
iri.path_mut().push("bar".try_into()?);
iri.set_query(Some("query".try_into()?));
iri.set_fragment(Some("fragment".try_into()?));
assert_eq!(iri, "https://www.rust-lang.org:40/foo/bar?query#fragment");
The try_into method is used to ensure that each string is syntactically
correct (for instance, it is not possible to replace "query" with
"query#" since # is not a valid query character).
§Detailed Usage
§Path manipulation
The IRI path is accessed through the path or path_mut methods.
It is possible to access the segments of a path using the iterator returned
by the segments method.
for segment in iri.path().segments() {
println!("{}", segment);
}
One can use the normalized_segments method to iterate over the normalized
version of the path where dot segments (. and ..) are removed.
In addition, it is possible to push or pop segments to a path using the
corresponding methods:
let mut iri = IriBuf::new("https://rust-lang.org/a/c".to_string())?;
let mut path = iri.path_mut();
path.pop();
path.push("b".try_into()?);
path.push("c".try_into()?);
path.push("".try_into()?); // the empty segment is valid.
assert_eq!(iri.path(), "/a/b/c/");
§IRI references
This crate provides the two types IriRef and IriRefBuf to represent
IRI references. An IRI reference is either an IRI or a relative IRI.
Unlike regular IRIs, IRI references may have no scheme.
let mut iri_ref = IriRefBuf::default(); // an IRI reference can be empty.
// An IRI reference with a scheme is a valid IRI.
iri_ref.set_scheme(Some("https".try_into()?));
let iri: &Iri = iri_ref.as_iri().unwrap();
// An IRI can be safely converted into an IRI reference.
let iri_ref: &IriRef = iri.into();
Given a base IRI, references can be resolved into a regular IRI using the Reference Resolution Algorithm defined in RFC 3986. This crate provides a strict implementation of this algorithm.
let base_iri = Iri::new("http://a/b/c/d;p?q")?;
let mut iri_ref = IriRefBuf::new("g;x=1/../y".to_string())?;
// non mutating resolution.
assert_eq!(iri_ref.resolved(base_iri), "http://a/b/c/y");
// in-place resolution.
iri_ref.resolve(base_iri);
assert_eq!(iri_ref, "http://a/b/c/y");
This crate implements Errata 4547 about the abnormal use of dot segments in relative paths.
This means that, for instance, the path a/b/../../../ is normalized into ../.
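The path handling behind resolution and normalization can be sketched with the standard library alone. The code below is an illustrative approximation, not this crate's implementation: `merge` follows the path-merging rule of RFC 3986 (section 5.2.3) for a non-empty base path, and `normalize_path` removes dot segments while keeping surplus `..` segments in relative paths, which is the behavior described above for Errata 4547. Both function names are hypothetical.

```rust
/// Merges a relative reference path with a base path: everything up to and
/// including the last '/' of the base is kept, then the reference is appended.
/// Assumes the base path is not empty.
fn merge(base_path: &str, reference_path: &str) -> String {
    match base_path.rfind('/') {
        Some(i) => format!("{}{}", &base_path[..=i], reference_path),
        None => reference_path.to_string(),
    }
}

/// Removes "." and ".." segments from `path`.
/// Surplus ".." segments are kept in relative paths and dropped in absolute ones.
fn normalize_path(path: &str) -> String {
    let absolute = path.starts_with('/');
    let body = if absolute { &path[1..] } else { path };
    let mut out: Vec<&str> = Vec::new();
    // Tracks whether the last segment seen was a dot segment, in which case
    // the normalized path must end with '/'.
    let mut ends_on_dot = false;
    for segment in body.split('/') {
        ends_on_dot = segment == "." || segment == "..";
        match segment {
            "." => {}
            ".." => {
                // Only a regular segment can be cancelled by "..".
                let can_pop = matches!(out.last(), Some(&s) if s != "..");
                if can_pop {
                    out.pop();
                } else if !absolute {
                    out.push("..");
                }
            }
            other => out.push(other),
        }
    }
    if ends_on_dot {
        out.push("");
    }
    let joined = out.join("/");
    if absolute { format!("/{}", joined) } else { joined }
}

fn main() {
    // Path part of resolving "g;x=1/../y" against "http://a/b/c/d;p?q".
    let merged = merge("/b/c/d;p", "g;x=1/../y");
    println!("{}", normalize_path(&merged));
    // Surplus ".." segments survive in a relative path.
    println!("{}", normalize_path("a/b/../../../"));
}
```

The sketch only covers the path component; a full resolution also has to decide which of the scheme, authority, query and fragment come from the base and which from the reference.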
§IRI comparison
Here are the features of the IRI comparison method implemented in this crate.
§Protocol agnostic
This implementation does not know anything about existing protocols.
For instance, even if the HTTP protocol defines 80 as the default port,
the two IRIs http://example.org and http://example.org:80 are not equivalent.
§Every / counts
The path /foo/bar is not equivalent to /foo/bar/.
§Path normalization
Paths are normalized during comparison by removing dot segments (. and ..).
This means for instance that the paths a/b/c and a/../a/./b/../b/c are
equivalent.
Note however that this crate implements
Errata 4547 about the
abnormal use of dot segments in relative paths.
This means that for instance, the IRI http:a/b/../../../ is equivalent to
http:../ and not http:.
§Percent-encoded characters
Thanks to the pct-str crate,
percent-encoded characters are correctly handled.
The two IRIs http://example.org and http://exa%6dple.org are
equivalent.
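As an illustration of why these two IRIs compare equal, the sketch below normalizes percent-encoding with the standard library only. It assumes the rule from RFC 3986 (section 6.2.2): escapes of unreserved characters are decoded, and any other escape is kept with upper-case hexadecimal digits. This is not how the pct-str crate is implemented and its handling of reserved characters may differ; `is_unreserved`, `normalize_pct` and `pct_equivalent` are hypothetical names.

```rust
/// Returns true for the RFC 3986 "unreserved" ASCII characters.
fn is_unreserved(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~')
}

/// Decodes escapes of unreserved characters and upper-cases all other escapes.
fn normalize_pct(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('%') {
        // Copy everything before the '%' unchanged (this keeps non-ASCII text intact).
        out.push_str(&rest[..pos]);
        // Try to read the two hexadecimal digits that follow the '%'.
        let byte = rest
            .get(pos + 1..pos + 3)
            .filter(|h| h.bytes().all(|b| b.is_ascii_hexdigit()))
            .and_then(|h| u8::from_str_radix(h, 16).ok());
        match byte {
            Some(b) if is_unreserved(b) => out.push(b as char),
            Some(b) => out.push_str(&format!("%{:02X}", b)),
            None => {
                // Not a well-formed escape: keep the '%' literally and move on.
                out.push('%');
                rest = &rest[pos + 1..];
                continue;
            }
        }
        rest = &rest[pos + 3..];
    }
    out.push_str(rest);
    out
}

/// Compares two strings modulo percent-encoding normalization.
fn pct_equivalent(a: &str, b: &str) -> bool {
    normalize_pct(a) == normalize_pct(b)
}

fn main() {
    println!("{}", normalize_pct("http://exa%6dple.org"));
    println!("{}", pct_equivalent("http://example.org", "http://exa%6dple.org"));
}
```

Under this assumption an escaped reserved character such as `%2F` stays distinct from a literal `/`, since decoding it could change how the IRI splits into components.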
Re-exports§
pub use iri::InvalidIri;
pub use iri::Iri;
pub use iri::IriError;
pub use iri::IriRef;
pub use uri::InvalidUri;
pub use uri::Uri;
pub use uri::UriError;
pub use uri::UriRef;
pub use iri::IriBuf;
pub use iri::IriRefBuf;
pub use uri::UriBuf;
pub use uri::UriRefBuf;
Modules§
Macros§
- fragment: Parses a URI Fragment at compile time.
- host: Parses a URI authority Host at compile time.
- ifragment: Parses an IRI Fragment at compile time.
- ihost: Parses an IRI authority Host at compile time.
- ipath: Parses an IRI Path at compile time.
- iquery: Parses an IRI Query at compile time.
- iri: Parses an Iri at compile time.
- iri_ref: Parses an IriRef at compile time.
- isegment: Parses an IRI path Segment at compile time.
- iuser_info: Parses an IRI authority UserInfo at compile time.
- path: Parses a URI Path at compile time.
- port: Parses a URI/IRI authority Port at compile time.
- query: Parses a URI Query at compile time.
- scheme: Parses a URI/IRI Scheme at compile time.
- segment: Parses a URI path Segment at compile time.
- uri: Parses a Uri at compile time.
- uri_ref: Parses a UriRef at compile time.
- user_info: Parses a URI authority UserInfo at compile time.
Structs§
- InvalidPort: Invalid port error.
- InvalidScheme: Invalid scheme error.
- PathContext
- Port: URI/IRI authority port.
- PortBuf: Owned port.
- Scheme: URI or IRI scheme.
- SchemeBuf: Owned scheme.