tor_netdoc/parse.rs
//! Parsing support for the network document meta-format
//!
//! The meta-format used by Tor network documents evolved over time
//! from a legacy line-oriented format. It's described more fully
//! in Tor's
//! [dir-spec.txt](https://spec.torproject.org/dir-spec).
//!
//! In brief, a network document is a sequence of [tokenize::Item]s.
//! Each Item starts with a [keyword::Keyword], takes a number of
//! _arguments_ on the same line, and is optionally followed by a
//! PEM-like base64-encoded _object_.
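//!
//! A minimal sketch of that shape, using only the standard library. It is
//! not this crate's tokenizer: `SketchItem` and `parse_item` are
//! hypothetical names, and the object body is kept as undecoded lines.
//!
//! ```rust
//! use std::iter::Peekable;
//! use std::str::Lines;
//!
//! /// One item: a keyword, its same-line arguments, and an optional object.
//! #[derive(Debug, PartialEq)]
//! struct SketchItem<'a> {
//!     keyword: &'a str,
//!     args: Vec<&'a str>,
//!     /// Tag and undecoded base64 lines of a PEM-like object, if present.
//!     object: Option<(&'a str, Vec<&'a str>)>,
//! }
//!
//! /// Read one item from the front of `lines`; `None` on end of input,
//! /// a blank line, or an unterminated object.
//! fn parse_item<'a>(lines: &mut Peekable<Lines<'a>>) -> Option<SketchItem<'a>> {
//!     let mut words = lines.next()?.split_whitespace();
//!     let keyword = words.next()?;
//!     let args: Vec<&str> = words.collect();
//!     // An object, if any, starts on the next line with "-----BEGIN <tag>-----".
//!     let tag = lines
//!         .peek()
//!         .copied()
//!         .and_then(|l| l.strip_prefix("-----BEGIN "))
//!         .and_then(|l| l.strip_suffix("-----"));
//!     let mut object = None;
//!     if let Some(tag) = tag {
//!         lines.next(); // consume the BEGIN line
//!         let end = format!("-----END {}-----", tag);
//!         let mut body = Vec::new();
//!         loop {
//!             let line = lines.next()?;
//!             if line == end {
//!                 break;
//!             }
//!             body.push(line);
//!         }
//!         object = Some((tag, body));
//!     }
//!     Some(SketchItem { keyword, args, object })
//! }
//! ```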
//!
//! Individual document types define further restrictions on the
//! Items. They may require Items with a particular keyword to have a
//! certain number of arguments, to have (or not have) a particular
//! kind of object, to appear a certain number of times, and so on.
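//!
//! The kinds of restriction listed above could be modelled roughly as
//! follows. This is only a sketch under assumed names (`ObjectRule`,
//! `SketchRule`, `check_item`, `check_count`); the crate's own rule
//! types live in the `rules` module and may differ.
//!
//! ```rust
//! use std::ops::RangeInclusive;
//!
//! /// Whether an item must, may, or must not carry an object.
//! #[derive(Clone, Copy)]
//! enum ObjectRule {
//!     Required,
//!     Optional,
//!     Forbidden,
//! }
//!
//! /// Restrictions on items that share one particular keyword.
//! struct SketchRule {
//!     args: RangeInclusive<usize>,
//!     object: ObjectRule,
//!     occurrences: RangeInclusive<usize>,
//! }
//!
//! impl SketchRule {
//!     /// Check a single item's argument count and object presence.
//!     fn check_item(&self, n_args: usize, has_object: bool) -> Result<(), &'static str> {
//!         if !self.args.contains(&n_args) {
//!             return Err("wrong number of arguments");
//!         }
//!         match (self.object, has_object) {
//!             (ObjectRule::Required, false) => Err("missing object"),
//!             (ObjectRule::Forbidden, true) => Err("unexpected object"),
//!             _ => Ok(()),
//!         }
//!     }
//!
//!     /// Check how many times the keyword appeared in a section.
//!     fn check_count(&self, seen: usize) -> Result<(), &'static str> {
//!         if self.occurrences.contains(&seen) {
//!             Ok(())
//!         } else {
//!             Err("wrong number of occurrences")
//!         }
//!     }
//! }
//! ```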
//!
//! More complex documents can be divided into [parser::Section]s. A
//! Section might correspond to the header or footer of a longer
//! document, or to a single stanza in a longer document.
//!
//! To parse a document into a Section, the programmer defines a type
//! of keyword that the document will use, using the
//! `decl_keyword!` macro. The programmer then defines a
//! [parser::SectionRules] object, containing one [rules::TokenFmt]
//! for each allowed keyword in the section, describing the rules for
//! that keyword. Finally, the programmer uses a [tokenize::NetDocReader]
//! to tokenize the document, passing the stream of tokens to the
//! SectionRules object to validate and parse it into a Section.
//!
//! For multiple-section documents, this crate uses
//! [`Itertools::peeking_take_while`](itertools::Itertools::peeking_take_while)
//! (via a [`.pause_at`](NetDocReader::pause_at) convenience method)
//! and a [batching_split_before](crate::util::batching_split_before)
//! module, which can split a document item iterator into sections.
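//!
//! The idea of stopping at a section boundary without consuming it can be
//! sketched with the standard library's `Peekable::next_if`. This is an
//! analogy only, not the implementation of `pause_at` or of itertools;
//! `take_section` is a hypothetical helper.
//!
//! ```rust
//! use std::iter::Peekable;
//!
//! /// Collect items up to, but not including, the first one for which
//! /// `is_boundary` returns true. That item is left in `iter`.
//! fn take_section<I, F>(iter: &mut Peekable<I>, mut is_boundary: F) -> Vec<I::Item>
//! where
//!     I: Iterator,
//!     F: FnMut(&I::Item) -> bool,
//! {
//!     let mut section = Vec::new();
//!     while let Some(item) = iter.next_if(|item| !is_boundary(item)) {
//!         section.push(item);
//!     }
//!     section
//! }
//! ```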
pub(crate) mod keyword;
pub(crate) mod parser;
pub(crate) mod rules;
pub(crate) mod tokenize;
#[macro_use]
pub(crate) mod macros;