//! This crate provides an implementation of
//! [Uniform Resource Identifiers (URIs, aka URLs)][uri] and [Internationalized
//! Resource Identifiers (IRIs)][iri] following [RFC 3986][uri-rfc] and [RFC
//! 3987][iri-rfc] defined by the [Internet Engineering Task Force
//! (IETF)][ietf] to uniquely identify objects across the web. IRIs are a
//! superset of URIs accepting international characters defined in the
//! [Unicode][unicode] table.
//!
//! [uri]: <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier>
//! [uri-rfc]: <https://tools.ietf.org/html/rfc3986>
//! [iri]: <https://en.wikipedia.org/wiki/Internationalized_resource_identifier>
//! [iri-rfc]: <https://tools.ietf.org/html/rfc3987>
//! [ietf]: <https://www.ietf.org>
//! [unicode]: <https://en.wikipedia.org/wiki/Unicode>
//!
//! URI/IRIs are defined as a sequence of characters with distinguishable
//! components: a scheme, an authority, a path, a query and a fragment.
//!
//! ```text
//!     foo://example.com:8042/over/there?name=ferret#nose
//!     \_/   \______________/\_________/ \_________/ \__/
//!      |           |            |            |        |
//!   scheme     authority       path        query   fragment
//! ```
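//!
//! As an illustration of this decomposition, the following std-only sketch
//! splits a string at the same delimiters. The `Parts` type and the `split`
//! function are hypothetical names that are not part of this crate, and the
//! sketch performs no validation, unlike the parsers provided here.
//!
//! ```rust
//! /// The five components of a URI/IRI reference, borrowed from the input.
//! #[derive(Debug, PartialEq)]
//! struct Parts<'a> {
//!     scheme: Option<&'a str>,
//!     authority: Option<&'a str>,
//!     path: &'a str,
//!     query: Option<&'a str>,
//!     fragment: Option<&'a str>,
//! }
//!
//! /// Splits `s` into components without validating it (hypothetical helper).
//! fn split(s: &str) -> Parts<'_> {
//!     // The fragment starts after the first `#`.
//!     let (rest, fragment) = match s.split_once('#') {
//!         Some((r, f)) => (r, Some(f)),
//!         None => (s, None),
//!     };
//!     // The query starts after the first `?` that precedes the fragment.
//!     let (rest, query) = match rest.split_once('?') {
//!         Some((r, q)) => (r, Some(q)),
//!         None => (rest, None),
//!     };
//!     // The scheme is what precedes the first `:`, unless a `/` comes first.
//!     let (scheme, rest) = match rest.split_once(':') {
//!         Some((sch, r)) if !sch.is_empty() && !sch.contains('/') => (Some(sch), r),
//!         _ => (None, rest),
//!     };
//!     // The authority is introduced by `//` and ends at the next `/`.
//!     let (authority, path) = match rest.strip_prefix("//") {
//!         Some(r) => {
//!             let end = r.find('/').unwrap_or(r.len());
//!             (Some(&r[..end]), &r[end..])
//!         }
//!         None => (None, rest),
//!     };
//!     Parts { scheme, authority, path, query, fragment }
//! }
//! ```
//!
//! Applied to the string in the diagram, `split` yields the scheme `foo`, the
//! authority `example.com:8042`, the path `/over/there`, the query
//! `name=ferret` and the fragment `nose`.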
//!
//! This crate provides types to represent borrowed and owned URIs and IRIs
//! (`Uri`, `Iri`, `UriBuf`, `IriBuf`), borrowed and owned URI and IRI
//! references (`UriRef`, `IriRef`, `UriRefBuf`, `IriRefBuf`) and similar
//! types for every part of a URI/IRI. These allow easy access to and
//! manipulation of every component.
//!
//! It features:
//!   - borrowed and owned URI/IRIs and URI/IRI references;
//!   - mutable URI/IRI buffers (in-place);
//!   - path normalization;
//!   - comparison modulo normalization;
//!   - URI/IRI-reference resolution;
//!   - static URI/IRI parsing using the `uri`/`iri` macros (provided by
//!     enabling the `macros` feature);
//!   - `serde` support (by enabling the `serde` feature);
//!   - data URL support (by enabling the `data` feature).
//!
//! ## Basic usage
//!
//! You can parse IRI strings by wrapping an `Iri` instance around a `str` slice.
//! No memory allocation occurs when using `Iri`: it only borrows the input data.
//! Access to each component is done in constant time.
//!
//! ```rust
//! use iri_rs::Iri;
//!
//! let iri = Iri::parse("https://www.rust-lang.org/foo/bar?query#frag").unwrap();
//!
//! assert_eq!(iri.scheme(), "https");
//! assert_eq!(iri.authority(), Some("www.rust-lang.org"));
//! assert_eq!(iri.path(), "/foo/bar");
//! assert_eq!(iri.query(), Some("query"));
//! assert_eq!(iri.fragment(), Some("frag"));
//! ```
//!
//! IRIs can be created and modified using the `IriBuf` type.
//! With this type, the IRI is held in a single buffer,
//! modified in-place to reduce memory allocation and optimize memory accesses.
//! This also allows the conversion from `IriBuf` into `Iri`.
//!
//! ```rust
//! use iri_rs::IriBuf;
//!
//! let mut iri = IriBuf::new("https://www.rust-lang.org").unwrap();
//! iri.set_authority(Some("www.rust-lang.org:40")).unwrap();
//! iri.set_path("/foo/bar").unwrap();
//! iri.set_query(Some("query")).unwrap();
//! iri.set_fragment(Some("fragment")).unwrap();
//!
//! assert_eq!(iri, "https://www.rust-lang.org:40/foo/bar?query#fragment");
//! ```
//!
//! Each setter checks that the given string is syntactically correct with
//! regard to its corresponding component and returns an error otherwise.
//! For instance, it is not possible to replace `"query"` with `"query#"`
//! since `#` is not a valid query character.
//!
//! ## Detailed usage
//!
//! ### Path manipulation
//!
//! The IRI path is accessed through the `path` or `path_mut` methods.
//! The segments of a path can be accessed using the iterator returned by the
//! `path_segments` method.
//!
//! ```rust
//! use iri_rs::Iri;
//! let iri = Iri::parse("https://www.rust-lang.org/foo/bar?query#frag").unwrap();
//! for segment in iri.path_segments() {
//!     println!("{}", segment);
//! }
//! ```
//!
//! One can use the `normalized_segments` method to iterate over the normalized
//! version of the path where dot segments (`.` and `..`) are removed.
//! In addition, it is possible to push segments onto a path or pop them from
//! it using the corresponding methods, or to replace the whole path with
//! `set_path`:
//!
//! ```rust
//! use iri_rs::IriBuf;
//! let mut iri = IriBuf::new("https://rust-lang.org/a/c").unwrap();
//! iri.set_path("/a/b/c/").unwrap();
//! assert_eq!(iri.path(), "/a/b/c/");
//! ```
//!
//! ### IRI references
//!
//! This crate provides the two types `IriRef` and `IriRefBuf` to represent
//! IRI references. An IRI reference is either an IRI or a relative IRI.
//! Unlike regular IRIs, relative IRI references may have no scheme.
//!
//! ```rust
//! use iri_rs::{IriRef, IriRefBuf};
//! let mut iri_ref = IriRefBuf::default(); // empty reference.
//! iri_ref.set_scheme(Some("https")).unwrap();
//! iri_ref.set_authority(Some("example.com")).unwrap();
//! assert!(iri_ref.as_iri().is_some());
//! ```
//!
//! Given a base IRI, references can be resolved into a regular IRI using the
//! [Reference Resolution Algorithm](https://tools.ietf.org/html/rfc3986#section-5)
//! defined in [RFC 3986](https://tools.ietf.org/html/rfc3986).
//! This crate provides a *strict* implementation of this algorithm.
//!
//! ```rust
//! use iri_rs::{Iri, IriRefBuf};
//! let base_iri = Iri::parse("http://a/b/c/d;p?q").unwrap();
//! let mut iri_ref = IriRefBuf::new("g;x=1/../y").unwrap();
//!
//! let resolved = iri_ref.resolved(&base_iri).unwrap();
//! assert_eq!(resolved, "http://a/b/c/y");
//!
//! iri_ref.resolve(&base_iri).unwrap();
//! assert_eq!(iri_ref, "http://a/b/c/y");
//! ```
//!
//! This crate implements
//! [Errata 4547](https://www.rfc-editor.org/errata/eid4547) about the
//! abnormal use of dot segments in relative paths.
//! This means that for instance, the path `a/b/../../../` is normalized into
//! `../`.
//!
//! ### IRI comparison
//!
//! Here are the features of the IRI comparison method implemented in this crate.
//!
//! #### Protocol agnostic
//!
//! This implementation does not know anything about existing protocols.
//! For instance, even if the
//! [HTTP protocol](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol)
//! defines `80` as the default port,
//! the two IRIs `http://example.org` and `http://example.org:80` are **not** equivalent.
//!
//! #### Every `/` counts
//!
//! The path `/foo/bar` is **not** equivalent to `/foo/bar/`.
//!
//! #### Path normalization
//!
//! Paths are normalized during comparison by removing dot segments (`.` and `..`).
//! This means for instance that the paths `a/b/c` and `a/../a/./b/../b/c` **are**
//! equivalent.
//! Note however that this crate implements
//! [Errata 4547](https://www.rfc-editor.org/errata/eid4547) about the
//! abnormal use of dot segments in relative paths.
//! This means that for instance, the IRI `http:a/b/../../../` is equivalent to
//! `http:../` and **not** `http:`.
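//!
//! The following std-only sketch illustrates this behavior on relative paths.
//! It is not this crate's implementation: `normalize_relative` is a
//! hypothetical helper that assumes a relative path (no leading `/`) and
//! keeps the leading `..` segments that cannot be resolved.
//!
//! ```rust
//! /// Removes dot segments from a relative path, keeping unresolvable `..`.
//! fn normalize_relative(path: &str) -> String {
//!     let mut stack: Vec<&str> = Vec::new();
//!     // Set when the path ends with `/`, `.` or `..` (it denotes a directory).
//!     let mut is_dir = false;
//!     let mut segments = path.split('/').peekable();
//!     while let Some(segment) = segments.next() {
//!         let is_last = segments.peek().is_none();
//!         match segment {
//!             "." => is_dir = true,
//!             ".." => {
//!                 // Pop a regular segment if there is one, else keep the `..`.
//!                 match stack.last() {
//!                     Some(&last) if last != ".." => {
//!                         stack.pop();
//!                     }
//!                     _ => stack.push(".."),
//!                 }
//!                 is_dir = true;
//!             }
//!             // An empty final segment is the mark of a trailing `/`.
//!             "" if is_last => is_dir = true,
//!             other => {
//!                 stack.push(other);
//!                 is_dir = false;
//!             }
//!         }
//!     }
//!     let mut out = stack.join("/");
//!     if is_dir && !out.is_empty() {
//!         out.push('/');
//!     }
//!     out
//! }
//! ```
//!
//! With this sketch, `a/../a/./b/../b/c` becomes `a/b/c`, while
//! `a/b/../../../` becomes `../` since its last `..` has nothing left to
//! remove.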
//!
//! #### Percent-encoded characters
//!
//! Thanks to the [`pct-str` crate](https://crates.io/crates/pct-str),
//! percent-encoded characters are correctly handled.
//! The two IRIs `http://example.org` and `http://exa%6dple.org` **are** equivalent.
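//!
//! To illustrate why these two IRIs are equivalent, here is a std-only sketch
//! of percent-encoding normalization in the spirit of RFC 3986, section
//! 6.2.2. It does not use `pct-str` and is not this crate's implementation:
//! `normalize_pct` and `hex` are hypothetical helpers. Escapes of unreserved
//! ASCII characters are decoded, and all other escapes are kept with
//! uppercase hexadecimal digits.
//!
//! ```rust
//! /// Returns the value of an ASCII hexadecimal digit, if `b` is one.
//! fn hex(b: u8) -> Option<u8> {
//!     (b as char).to_digit(16).map(|d| d as u8)
//! }
//!
//! /// Normalizes the percent-encoded sequences found in `s`.
//! fn normalize_pct(s: &str) -> String {
//!     let bytes = s.as_bytes();
//!     let mut out = Vec::with_capacity(bytes.len());
//!     let mut i = 0;
//!     while i < bytes.len() {
//!         // Look for a `%` followed by two hexadecimal digits.
//!         let decoded = match bytes.get(i..i + 3) {
//!             Some(&[b'%', h, l]) => hex(h).zip(hex(l)).map(|(h, l)| h * 16 + l),
//!             _ => None,
//!         };
//!         match decoded {
//!             // An unreserved character is equivalent to its encoded form.
//!             Some(b) if b.is_ascii_alphanumeric() || b"-._~".contains(&b) => {
//!                 out.push(b);
//!                 i += 3;
//!             }
//!             // Any other escape is kept, with uppercase hexadecimal digits.
//!             Some(_) => {
//!                 out.extend(bytes[i..i + 3].iter().map(u8::to_ascii_uppercase));
//!                 i += 3;
//!             }
//!             // Not an escape: copy the byte unchanged.
//!             None => {
//!                 out.push(bytes[i]);
//!                 i += 1;
//!             }
//!         }
//!     }
//!     String::from_utf8(out).expect("only ASCII bytes were rewritten")
//! }
//! ```
//!
//! Under this sketch, `http://exa%6dple.org` normalizes to
//! `http://example.org`, whereas `a%2fb` normalizes to `a%2Fb` and stays
//! distinct from `a/b`, since `/` is a reserved character.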
pub use iri_rs_core::*;

#[doc(hidden)]
pub use iri_rs_core as __private;
179#[cfg(feature = "static")]
180pub use iri_rs_static::*;
181
182#[cfg(feature = "enum")]
183pub use iri_rs_enum::*;