//! A library to parse and write Sphinx inventory files
//! for referencing other documentation pages that use Sphinx.
//!
//! In contrast to Sphinx itself, this library parses the data using a
//! combinator parser instead of a regex, which gives better performance
//! and better error reporting.
//!
//! This library was originally made for use in [`snakedown`](https://crates.io/crates/snakedown),
//! but an effort has been made to make it more generally useful.
//!
//! ### Disclaimer
//! The Sphinx inventory format doesn't have a formal specification.
//! What follows are just the rules that we (and others) have
//! inferred from files we've seen in the wild. We try to be as correct as possible,
//! but we can't guarantee correctness. If you find any errors, or have a valid file we
//! can't parse, please open an issue!
//!
//! Currently only v2 is supported.
//!
//! ## Usage
//!
//! The main entry points of this crate are the [`InventoryHeader`] and [`SphinxReference`] data
//! structs, and the [`SphinxInventoryReader`] and [`SphinxInventoryWriter`]
//! structs that work with them.
//!
//! The [`SphinxInventoryReader`] and [`SphinxInventoryWriter`] can work with any type that
//! implements [`std::io::Read`] and [`std::io::Write`] respectively. These are internally buffered,
//! so you do not have to wrap them yourself.
//!
//! When interacting with real `objects.inv` files in the wild you will most likely use the base
//! reader and writer structs, but both also have a `PlainText` variant. The only difference is that
//! the plain text versions don't compress/decompress the body with zlib as real files do. This is
//! mostly useful for debugging/testing. The following example uses the plain text versions and
//! [`std::io::Cursor`] to make it easier to display the results, but the code should work
//! basically unchanged after switching to a [`std::fs::File`] and the base reader and writer.
//!
//! ## Examples
//!
//! ```
//! # use sphinx_inv::*;
//! # use std::fs::File;
//! # use std::io::{Read, Write, Cursor};
//! # use pretty_assertions::assert_eq;
//! #
//! let header = InventoryHeader::new("Sphinx Inv", "0.2.0");
//! let join_reference = SphinxReference::new(
//!     "str.join".to_string(),
//!     SphinxType::Python(PyRole::Method),
//!     None,
//!     "library/stdtypes.html#$".to_string(),
//!     None);
//! let lower_reference = SphinxReference::new(
//!     "str.lower".to_string(),
//!     SphinxType::Python(PyRole::Method),
//!     None,
//!     "library/stdtypes.html#$".to_string(),
//!     None);
//!
//! let buffer = Vec::new();
//!
//! let mut cursor = Cursor::new(buffer);
//! // The capacity only preallocates the internal buffer; it can be any value.
//! let mut writer = PlainTextSphinxInventoryWriter::from_header(&header, 2);
//!
//! // Add the references to the writer.
//! writer.add_reference(&join_reference);
//! writer.add_reference(&lower_reference);
//!
//! // `add_reference` on its own only adds the reference to the internal buffer;
//! // nothing is written until you call `SphinxInventoryWriter::finalize`.
//! writer.finalize(&mut cursor).unwrap();
//!
//! let written = String::from_utf8(cursor.into_inner()).unwrap();
//!
//! assert_eq!(&written, "# Sphinx inventory version 2
//! ## Project: Sphinx Inv
//! ## Version: 0.2.0
//! ## The remainder of this file is compressed using zlib.
//! str.join py:method 1 library/stdtypes.html#$ -
//! str.lower py:method 1 library/stdtypes.html#$ -
//! ");
//!
//! let cursor = Cursor::new(written);
//!
//! let mut reader = PlainTextSphinxInventoryReader::from_reader(cursor).unwrap();
//!
//! assert_eq!(&header, reader.header());
//!
//! assert_eq!(reader.next().unwrap().unwrap(), join_reference);
//! assert_eq!(reader.next().unwrap().unwrap(), lower_reference);
//! ```
//!
//! ## Format Description
//!
//! As noted by Skinn et al., an inventory file (in the v2 format) currently has two parts:
//! the header and the body.
//!
//! ### Header description
//!
//! The header needs to be of the following format:
//! ```txt
//! # Sphinx inventory version 2
//! # Project: <project name>
//! # Version: <full version number>
//! # The remainder of this file is compressed using zlib.
//! ```
//!
//! #### Caveats:
//! 1. The first line has to match exactly.
//! 2. The version number should not contain a leading `v`.
//! 3. Currently zlib is the only compression method that Sphinx supports.
//! 4. Though it is not specified, the text mentioned above is expected to be ASCII.
//!    The project name can contain Unicode, but the text in the example must match exactly[^*].
//! 5. While Sphinx itself allows for user-definable domains and roles, this is not possible for this
//!    library because it is compiled. However, we have made an attempt to include as many of the
//!    domains and roles we found in the wild as possible. If you are missing any, please submit a
//!    feature request or pull request to add it!
//!
//! For a more in-depth explanation of the format, please see
//! [sphobjinv](https://sphobjinv.readthedocs.io/en/stable/syntax.html).
//!
//! [^*]: Technically it doesn't, as long as the byte offsets are the same, since the Sphinx
//! implementation just skips a known number of bytes. This is an implementation detail, though,
//! so we recommend that the format is still followed.
//!
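//! As a rough illustration of the header layout described above, the following standalone sketch
//! checks the four header lines using only the standard library. `parse_header` is a
//! hypothetical helper written for this example and is not part of this crate's API. It also
//! assumes exactly one space after each colon, which is stricter than real files may require.
//!
//! ```rust
//! /// Hypothetical helper (not part of this crate): returns `(project, version)`
//! /// if `text` starts with a well-formed v2 header.
//! fn parse_header(text: &str) -> Option<(&str, &str)> {
//!     let mut lines = text.lines();
//!     // The first line has to match exactly.
//!     if lines.next()? != "# Sphinx inventory version 2" {
//!         return None;
//!     }
//!     let project = lines.next()?.strip_prefix("# Project: ")?;
//!     let version = lines.next()?.strip_prefix("# Version: ")?;
//!     // The last header line announces the zlib-compressed body.
//!     if lines.next()? != "# The remainder of this file is compressed using zlib." {
//!         return None;
//!     }
//!     Some((project, version))
//! }
//!
//! fn main() {
//!     let header = concat!(
//!         "# Sphinx inventory version 2\n",
//!         "# Project: Sphinx Inv\n",
//!         "# Version: 0.2.0\n",
//!         "# The remainder of this file is compressed using zlib.\n",
//!     );
//!     assert_eq!(parse_header(header), Some(("Sphinx Inv", "0.2.0")));
//!     // A header whose first line is missing or altered is rejected.
//!     assert_eq!(parse_header("# Project: Sphinx Inv\n"), None);
//! }
//! ```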
//!
//! ### Body format
//!
//! The remaining body of the file after the header must be compressed with zlib.
//! In the decompressed data each line should have the following format:
//!
//! ```txt
//! {name} {domain}:{role} {priority} {uri} {dispname}
//! ```
//!
//! Specifically, it must match this regex:
//! `(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)`
//!
//! For example:
//! ```txt
//! str.join py:method 1 library/stdtypes.html#$ -
//! ```
//!
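//! To make the field layout concrete, here is a simplified, standalone sketch that splits one
//! decompressed body line into its five fields using only the standard library. It is not how
//! this crate parses records (the crate uses a combinator parser), and `split_record` is a
//! hypothetical helper written for this example. It only approximates the regex above: it
//! collapses runs of whitespace and does not handle an empty `uri`.
//!
//! ```rust
//! /// Hypothetical helper (not part of this crate): splits a body line into
//! /// `(name, domain:role, priority, uri, dispname)`.
//! fn split_record(line: &str) -> Option<(String, String, i32, String, String)> {
//!     let tokens: Vec<&str> = line.split_whitespace().collect();
//!     // The name may itself contain spaces, so scan for the first token that is
//!     // directly followed by an integer priority; everything before it is the name.
//!     for i in 1..tokens.len().saturating_sub(2) {
//!         let candidate = tokens[i + 1];
//!         let digits = candidate.strip_prefix('-').unwrap_or(candidate);
//!         if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
//!             let priority = candidate.parse::<i32>().ok()?;
//!             return Some((
//!                 tokens[..i].join(" "),     // name
//!                 tokens[i].to_string(),     // domain:role
//!                 priority,
//!                 tokens[i + 2].to_string(), // uri
//!                 tokens[i + 3..].join(" "), // dispname (may contain spaces)
//!             ));
//!         }
//!     }
//!     None
//! }
//!
//! fn main() {
//!     let record = split_record("str.join py:method 1 library/stdtypes.html#$ -").unwrap();
//!     assert_eq!(record.0, "str.join");
//!     assert_eq!(record.1, "py:method");
//!     assert_eq!(record.2, 1);
//!     assert_eq!(record.3, "library/stdtypes.html#$");
//!     assert_eq!(record.4, "-");
//!
//!     // A made-up record whose name and display name both contain a space.
//!     let label = split_record("my label std:label -1 index.html#my-label My Label").unwrap();
//!     assert_eq!(label.0, "my label");
//!     assert_eq!(label.2, -1);
//!     assert_eq!(label.4, "My Label");
//! }
//! ```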

mod error;
mod header;
mod priority;
mod readers;
mod reference;
mod roles;
mod writers;

/// The main error type returned by this crate.
pub use error::SphinxInvError;

/// Error type returned when parsing either the header or a record fails.
pub use error::SphinxParseError;

/// Error type returned when there is not enough input from the underlying reader
/// to properly parse the header.
pub use error::MissingHeaderComponent;

/// Struct for handling the metadata of an inventory, such as project name and version.
pub use header::InventoryHeader;

/// The main entry point for reading, used to read and parse Sphinx reference data.
pub use readers::SphinxInventoryReader;

/// Plain text version of [`SphinxInventoryReader`], mainly used for testing and demos.
pub use readers::PlainTextSphinxInventoryReader;

/// The main data struct of this crate, with the necessary information to link to external documentation.
pub use reference::SphinxReference;

/// The main entry point for writing, used to format and write Sphinx reference data.
pub use writers::SphinxInventoryWriter;

/// Plain text version of [`SphinxInventoryWriter`], mainly used for testing and demos.
pub use writers::PlainTextSphinxInventoryWriter;

/// Type used to parse the `{domain}:{role}` information provided by Sphinx, which is used to
/// disambiguate between object types and between names in different languages.
pub use roles::SphinxType;

// Role enums for the individual domains, e.g. `PyRole` for `SphinxType::Python`.
pub use roles::{CRole, CmakeRole, CppRole, JsRole, MathRole, PyRole, RstRole, SipRole, StdRole};