//! String types for [RFC 3987 Internationalized Resource Identifiers (IRIs)][RFC 3987] and
//! [RFC 3986 Uniform Resource Identifiers (URIs)][RFC 3986].
//!
//! Note that this crate does not have any extra knowledge about protocols.
//! Comparison between IRI strings by `PartialEq` and `Eq` is implemented as [simple string
//! comparison](https://www.rfc-editor.org/rfc/rfc3986.html#section-6.2.1).
//! To compare IRIs / URIs using such extra knowledge, implement the comparison yourself or use
//! another crate.
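//!
//! As a rough illustration of what simple string comparison implies, the sketch below compares
//! plain `str` values rather than the IRI types of this crate. The helper `simple_eq` is
//! hypothetical and is not part of this crate.
//!
//! ```rust
//! /// Hypothetical helper: simple string comparison in the sense of RFC 3986 section 6.2.1.
//! fn simple_eq(a: &str, b: &str) -> bool {
//!     a == b
//! }
//!
//! fn main() {
//!     // Identical strings compare equal.
//!     assert!(simple_eq("http://example.com/~user", "http://example.com/~user"));
//!     // RFC 3986 treats the scheme and host as case-insensitive and `%7E` as equivalent
//!     // to `~`, but simple string comparison reports these pairs as different.
//!     assert!(!simple_eq("http://example.com/~user", "HTTP://EXAMPLE.COM/~user"));
//!     assert!(!simple_eq("http://example.com/~user", "http://example.com/%7Euser"));
//! }
//! ```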
//!
//! # Capability
//!
//! This crate provides many features for IRIs / URIs.
//!
//! ## String types
//!
//! The [`types` module][`types`] provides various string types for IRIs and URIs.
//! The borrowed string types are unsized slice types (such as `[u8]` and `str`)
//! and not sized structs, so they are highly interoperable with, for example,
//! `Cow` and `Rc`. Conversions between `&str` and borrowed IRI string types are easy.
//!
//! ## Resolvers
//!
//! The [`resolve` module][`resolve`] provides a resolver for IRI / URI references.
//! However, it is recommended to use methods of string types such as
//! [`RiReferenceStr::resolve_against()`] or [`RiRelativeStr::resolve_against()`]
//! if you don't intend to resolve multiple IRIs against the same base.
//!
//! ## Validators
//!
//! Validator functions are provided by the [`validate` module][`validate`].
//!
//! ## Percent encoding
//!
//! The [`percent_encode` module][`percent_encode`] provides a converter that encodes
//! a user-provided string into a percent-encoded one (if the syntax requires it).
//!
//! Functions in this module are intended for manual manipulation of URI components.
//! If you need to convert a Unicode IRI into an ASCII-only URI, see the
//! `encode_to_uri` methods of IRI string types (such as
//! [`IriStr::encode_to_uri`][`types::IriStr::encode_to_uri`]).
//!
//! ## IRI builder
//!
//! The [`build` module][`build`] provides an IRI builder.
//!
//! ## URI template (RFC 6570)
//!
//! The [`template` module][`template`] provides an RFC 6570 URI Template processor.
//!
//! # Feature flags
//!
//! ## `std` and `alloc` support
//!
//! This crate supports `no_std` usage.
//!
//! * `alloc` feature:
//!     + Std library or `alloc` crate is required.
//!     + This feature enables types and functions which require memory allocation,
//!       e.g. `types::IriString` and `types::IriRelativeStr::resolve_against()`.
//! * `std` feature (**enabled by default**):
//!     + Std library is required.
//!     + This automatically enables the `alloc` feature.
//!     + This feature lets the crate use std-specific items, such as the `std::error::Error` trait.
//! * With neither of them:
//!     + The crate can be used in a `no_std` environment.
//!
//! ## Other features
//!
//! * `serde`
//!     + Enables serde support.
//!     + Implements `Serialize` and `Deserialize` traits for IRI / URI types.
//! * `memchr`
//!     + Enables faster internal character search.
//!
//! # Rationale
//!
//! ## `foo:`, `foo:/`, `foo://`, `foo:///`, `foo:////`, ... are valid IRIs
//!
//! All of these are valid IRIs.
//! (On the other hand, all of them are invalid as relative IRI references, because they don't
//! match the `relative-part` rule, especially `path-noscheme`, as the first path component of the
//! relative path contains a colon.)
//!
//! * `foo:`
//!     + Decomposed to `<scheme="foo">:<path-empty="">`.
//! * `foo:/`
//!     + Decomposed to `<scheme="foo">:<path-absolute="/">`.
//! * `foo://`
//!     + Decomposed to `<scheme="foo">://<authority=""><path-abempty="">`.
//! * `foo:///`
//!     + Decomposed to `<scheme="foo">://<authority=""><path-abempty="/">`.
//! * `foo:////`
//!     + Decomposed to `<scheme="foo">://<authority=""><path-abempty="//">`.
//! * `foo://///`
//!     + Decomposed to `<scheme="foo">://<authority=""><path-abempty="///">`.
//!
//! RFC 3986 says that "if authority is absent, path cannot start with `//`".
//!
//! > When authority is present, the path must either be empty or begin with a slash ("/")
//! > character. When authority is not present, the path cannot begin with two slash characters
//! > ("//").
//! >
//! > --- [RFC 3986, section 3. Syntax Components](https://www.rfc-editor.org/rfc/rfc3986.html#section-3).
//!
//! > If a URI contains an authority component, then the path component must either be empty or
//! > begin with a slash ("/") character. If a URI does not contain an authority component, then the
//! > path cannot begin with two slash characters ("//").
//! >
//! > --- [RFC 3986, section 3.3. Path](https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3)
//!
//! We should interpret them as "if the `authority` rule is completely unused (i.e. does not match
//! any string, **including the empty string**), the path cannot start with `//`".
//! In other words, we should consider this as **explaining the ABNF of the `hier-part` rule**
//! (especially why it does not use the `path` rule), but **not adding an extra restriction to the
//! rule written in ABNF**.
//!
//! This restriction is necessary to remove ambiguity in the decomposition of some strings.
//! For example, `foo://` could naturally be decomposed to either `<scheme="foo">:<path="//">` or
//! `<scheme="foo">://<authority=""><path="">`.
//! The restriction, **which is already encoded in the ABNF rule**, tells us to always decompose to
//! the latter form, rather than the former one.
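//!
//! The decomposition rule above can be sketched in a few lines. This is only an illustration on
//! plain `str` values, not the parser used by this crate: the helper `decompose` is hypothetical,
//! does not validate its input, and ignores query and fragment components.
//!
//! ```rust
//! /// Hypothetical helper: splits `scheme ":" hier-part` into (scheme, authority, path).
//! /// `None` for the authority means the `authority` rule is unused.
//! fn decompose(s: &str) -> Option<(&str, Option<&str>, &str)> {
//!     // The scheme ends at the first colon.
//!     let (scheme, rest) = s.split_once(':')?;
//!     match rest.strip_prefix("//") {
//!         // `"//" authority path-abempty`: the authority (possibly empty) runs up to
//!         // the next slash, and everything after it is the path.
//!         Some(after) => {
//!             let end = after.find('/').unwrap_or(after.len());
//!             Some((scheme, Some(&after[..end]), &after[end..]))
//!         }
//!         // No leading `//`: the whole remainder is the path.
//!         None => Some((scheme, None, rest)),
//!     }
//! }
//!
//! fn main() {
//!     assert_eq!(decompose("foo:"), Some(("foo", None, "")));
//!     assert_eq!(decompose("foo:/"), Some(("foo", None, "/")));
//!     // A leading `//` is always taken as the start of a (possibly empty) authority.
//!     assert_eq!(decompose("foo://"), Some(("foo", Some(""), "")));
//!     assert_eq!(decompose("foo:///"), Some(("foo", Some(""), "/")));
//!     assert_eq!(decompose("foo:////"), Some(("foo", Some(""), "//")));
//! }
//! ```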
//!
//! Readers of the spec might be confused by the phrases "when authority is **present**" and
//! "if a URI **contains** an authority component", which are unclear.
//! However, based on the interpretation above, we should consider an authority part that is the
//! empty string as satisfying the condition "authority is **present**".
//!
//! ## IRI resolution can fail
//!
//! For some inputs, the string resulting from IRI normalization and resolution can be
//! syntactically correct but semantically wrong. In such cases, the normalizer and resolver
//! provided by this crate do not silently "fix" the IRI by non-standard processing, but just
//! fail by returning `Err(_)`.
//!
//! For details, see the documentation of the [`normalize`] module.
//!
//! [RFC 3986]: https://www.rfc-editor.org/rfc/rfc3986.html
//! [RFC 3987]: https://www.rfc-editor.org/rfc/rfc3987.html
//! [`RiReferenceStr::resolve_against()`]: `types::RiReferenceStr::resolve_against`
//! [`RiRelativeStr::resolve_against()`]: `types::RiRelativeStr::resolve_against`
#![warn(missing_docs)]
#![warn(unsafe_op_in_unsafe_fn)]
#![warn(clippy::missing_docs_in_private_items)]
#![warn(clippy::undocumented_unsafe_blocks)]
#![cfg_attr(not(feature = "std"), no_std)]
#![cfg_attr(docsrs, feature(doc_cfg))]

#[cfg(feature = "alloc")]
extern crate alloc;

pub mod build;
pub mod components;
pub mod convert;
pub mod format;
pub mod mask_password;
pub mod normalize;
pub(crate) mod parser;
pub mod percent_encode;
pub(crate) mod raw;
pub mod resolve;
pub mod spec;
pub mod template;
pub mod types;
pub mod validate;