/*
Copyright ⓒ 2016 Daniel Keep.

Licensed under the MIT license (see LICENSE or <http://opensource.org
/licenses/MIT>) or the Apache License, Version 2.0 (see LICENSE of
<http://www.apache.org/licenses/LICENSE-2.0>), at your option. All
files in the project carrying such notice may not be copied, modified,
or distributed except according to those terms.
*/
/*!
This module defines various scanners that can be used to extract values from input text.

## Kinds of Scanner

Scanners can be classified as "static self scanners", "static abstract scanners", and "runtime abstract scanners".

* "Static self scanners" are types which implement the `ScanFromStr` trait and output an instance of themselves.  For example, if you scan using the `i32` type, you get an `i32` result.  These are implemented for types which have an obvious "default" scanning syntax.

  As a consequence of outputting an instance of themselves, they *also* automatically implement the `ScanSelfFromStr` trait.

* "Static abstract scanners" are types which implement the `ScanFromStr` trait and output an instance of *some other* type.  For example, if you scan using the `Word` type, you get a `&str` or `String` result.  These are implemented for cases where different rules are desirable, such as scanning particular *subsets* of a type (see `Word`, `Number`, `NonSpace`), or non-default encodings (see `Binary`, `Octal`, `Hex`).

* "Runtime abstract scanners" implement the `ScanStr` trait and serve the same overall function as static abstract scanners, except that the scanner *itself* must be constructed.  In other words, static scanners are types, whereas runtime scanners are *values*.  This makes them a little less straightforward to use, but they are *significantly* more flexible.  They can be parameterised at runtime to perform arbitrary manipulations of both the text input and the scanned values (see `max_width`, `re_str`).
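
As a rough illustration of the three kinds, the following self-contained sketch uses simplified stand-in traits.  `SimpleScanFrom`, `SimpleScan`, `WordLike` and `UpTo` are hypothetical names invented for this example; the real traits are generic over `ScanInput` and report failures with `ScanError` rather than `Option`.

```rust
// Hypothetical, simplified stand-in for `ScanFromStr` (static scanners).
trait SimpleScanFrom {
    type Output;
    // Returns the scanned value and the number of bytes consumed.
    fn scan_from(s: &str) -> Option<(Self::Output, usize)>;
}

// Hypothetical, simplified stand-in for `ScanStr` (runtime scanners).
trait SimpleScan {
    type Output;
    fn scan(&mut self, s: &str) -> Option<(Self::Output, usize)>;
}

// Static self scanner: scanning with `u32` yields a `u32`.
impl SimpleScanFrom for u32 {
    type Output = u32;
    fn scan_from(s: &str) -> Option<(u32, usize)> {
        let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
        s[..end].parse::<u32>().ok().map(|v| (v, end))
    }
}

// Static abstract scanner: scanning with `WordLike` yields a `String`.
struct WordLike;

impl SimpleScanFrom for WordLike {
    type Output = String;
    fn scan_from(s: &str) -> Option<(String, usize)> {
        let end = s.find(char::is_whitespace).unwrap_or(s.len());
        if end == 0 { None } else { Some((s[..end].to_string(), end)) }
    }
}

// Runtime abstract scanner: a *value* whose limit is chosen at runtime.
struct UpTo {
    max_bytes: usize,
}

impl SimpleScan for UpTo {
    type Output = String;
    fn scan(&mut self, s: &str) -> Option<(String, usize)> {
        // Take at most `max_bytes` bytes without splitting a character.
        let mut end = s.len().min(self.max_bytes);
        while !s.is_char_boundary(end) {
            end -= 1;
        }
        Some((s[..end].to_string(), end))
    }
}
```

The static scanners are invoked through their type (`WordLike::scan_from(..)`), whereas `UpTo` has to be built first (`UpTo { max_bytes: 3 }`) and is then invoked through the value.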

## Bridging Between Static and Runtime Scanners

A scanner of interest is `ScanA<Type>`.  This is a runtime scanner which takes a *static* scanner as a type parameter.  This allows you to use a static scanner in a context where a runtime scanner is needed.

For example, these two bindings are equivalent in terms of behaviour:

```ignore
    // Scan a u32.
    let _: u32
    let _ <| scan_a::<u32>()
```
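
The bridging idea can be sketched in isolation as a zero-sized wrapper that forwards to the static scanner named by its type parameter.  This is a minimal sketch under simplifying assumptions, not the actual `ScanA` implementation: `SimpleScanFrom`, `SimpleScan`, `Bridge` and `bridge` are hypothetical names, and failures are modelled with `Option`.

```rust
use std::marker::PhantomData;

// Hypothetical, simplified stand-ins for the static and runtime traits.
trait SimpleScanFrom {
    type Output;
    fn scan_from(s: &str) -> Option<(Self::Output, usize)>;
}

trait SimpleScan {
    type Output;
    fn scan(&mut self, s: &str) -> Option<(Self::Output, usize)>;
}

impl SimpleScanFrom for u32 {
    type Output = u32;
    fn scan_from(s: &str) -> Option<(u32, usize)> {
        let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
        s[..end].parse::<u32>().ok().map(|v| (v, end))
    }
}

// A runtime scanner that carries a static scanner purely as a type.
struct Bridge<S>(PhantomData<S>);

// Constructor function, so callers never name `Bridge` directly.
fn bridge<S: SimpleScanFrom>() -> Bridge<S> {
    Bridge(PhantomData)
}

impl<S: SimpleScanFrom> SimpleScan for Bridge<S> {
    type Output = S::Output;
    fn scan(&mut self, s: &str) -> Option<(S::Output, usize)> {
        // Forward straight to the static scanner.
        S::scan_from(s)
    }
}
```

In this model, `bridge::<u32>().scan(text)` and `u32::scan_from(text)` produce the same result for any `text`.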

## Creating Runtime Scanners

Runtime scanners are typically constructed using functions, rather than dealing with the implementing type itself.  For example, to get an instance of the `ExactWidth` runtime scanner, you would call either the `exact_width` or the `exact_width_a` function.

The reason for two functions is that most runtime scanners accept a *second* runtime scanner for the purposes of chaining.  This allows several transformations to be applied outside-in.  For example, you can combine runtime scanners together like so:

```ignore
    // Scan a word of between 2 and 5 bytes.
    let _ <| min_width(2, max_width(5, scan_a::<Word>()))
```
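
To make the outside-in order concrete, here is a small self-contained model of chaining.  It assumes that a maximum-width wrapper truncates the input before handing it on, and that a minimum-width wrapper rejects results that consumed too few bytes; the real scanners may differ in detail.  `SimpleScan`, `Word`, `MaxWidth` and `MinWidth` below are hypothetical stand-ins defined only for this sketch.

```rust
// Hypothetical, simplified stand-in for the runtime scanner trait.
trait SimpleScan {
    type Output;
    fn scan(&mut self, s: &str) -> Option<(Self::Output, usize)>;
}

// Innermost scanner: a run of non-whitespace characters.
struct Word;

impl SimpleScan for Word {
    type Output = String;
    fn scan(&mut self, s: &str) -> Option<(String, usize)> {
        let end = s.find(char::is_whitespace).unwrap_or(s.len());
        if end == 0 { None } else { Some((s[..end].to_string(), end)) }
    }
}

// Shows the next scanner at most `self.0` bytes of input.
struct MaxWidth<T>(usize, T);

impl<T: SimpleScan> SimpleScan for MaxWidth<T> {
    type Output = T::Output;
    fn scan(&mut self, s: &str) -> Option<(T::Output, usize)> {
        let mut end = s.len().min(self.0);
        while !s.is_char_boundary(end) {
            end -= 1;
        }
        self.1.scan(&s[..end])
    }
}

// Rejects the next scanner's result if it consumed fewer than `self.0` bytes.
struct MinWidth<T>(usize, T);

impl<T: SimpleScan> SimpleScan for MinWidth<T> {
    type Output = T::Output;
    fn scan(&mut self, s: &str) -> Option<(T::Output, usize)> {
        let min = self.0;
        self.1.scan(s).filter(|result| result.1 >= min)
    }
}
```

With `MinWidth(2, MaxWidth(5, Word))`, the input first passes through the minimum-width check's wrapper, is then truncated to five bytes, and only then reaches `Word`; the result travels back out in the opposite order.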

Functions ending in `_a` are a shorthand for the common case of wrapping a runtime scanner around a static scanner.  For example, the following two patterns are equivalent:

```ignore
    // Scan a u32 that has, at most, four digits.
    let _ <| max_width(4, scan_a::<u32>())
    let _ <| max_width_a::<u32>(4)
```
*/
/*
It is also where implementations for existing standard and external types are kept, though these do not appear in the documentation.
*/
pub use self::misc::{
    Everything, HorSpace, Newline, NonSpace, Space,
    Ident, Line, Number, Word, Wordish,
    Inferred, KeyValuePair, QuotedString,
    Binary, Octal, Hex,
};

#[doc(inline)] pub use self::runtime::{
    exact_width, exact_width_a,
    max_width, max_width_a,
    min_width, min_width_a,
    scan_a,
};

#[cfg(feature="regex")]
#[doc(inline)]
pub use self::runtime::{re, re_a, re_str};

#[cfg(feature="nightly-pattern")]
#[doc(inline)]
pub use self::runtime::{until_pat, until_pat_a, until_pat_str};

#[macro_use] mod macros;

pub mod runtime;
pub mod std;

mod lang;
mod misc;

use ::ScanError;
use ::input::ScanInput;
/**
This trait defines the interface to a type which can be scanned.

The exact syntax scanned is entirely arbitrary, though there are some rules of thumb that implementations should *generally* stick to:

* Do not ignore leading whitespace.
* Do not eagerly consume trailing whitespace, unless it is legitimately part of the scanned syntax.

In addition, if you are implementing scanning directly for the result type (*i.e.* `Output = Self`), prefer parsing *only* the result of the type's `Debug` implementation.  This ensures that there is a degree of round-tripping between `format!` and `scan!`.

If a type has multiple legitimate parsing forms, consider defining those alternate forms on abstract scanner types (*i.e.* `Output != Self`) instead.
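
The rules of thumb above can be illustrated with a self-contained sketch.  `SimpleScanFrom` is a hypothetical, simplified stand-in for this trait (plain `&str` input, `Option` instead of `ScanError`), and `Celsius` is an invented example type.

```rust
// Hypothetical, simplified stand-in for `ScanFromStr`.
trait SimpleScanFrom: Sized {
    type Output;
    fn scan_from(s: &str) -> Option<(Self::Output, usize)>;
}

#[derive(Debug, PartialEq)]
struct Celsius(i32);

impl SimpleScanFrom for Celsius {
    type Output = Self;

    fn scan_from(s: &str) -> Option<(Self, usize)> {
        // Accept only the derived `Debug` form, e.g. `Celsius(21)`.
        // Leading whitespace is *not* skipped here.
        let prefix = "Celsius(";
        let body = s.strip_prefix(prefix)?;
        let close = body.find(')')?;
        let value = body[..close].parse::<i32>().ok()?;
        // Consume up to and including the `)`, leaving any trailing
        // whitespace untouched.
        Some((Celsius(value), prefix.len() + close + 1))
    }
}
```

Because the accepted syntax matches the `Debug` output, `format!("{:?}", Celsius(-4))` can be fed straight back into `Celsius::scan_from`, while input with leading whitespace is rejected by the scanner itself.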

See: [`ScanSelfFromStr`](trait.ScanSelfFromStr.html).
*/
pub trait ScanFromStr<'a>: Sized {
    /**
    The type that the implementation scans into.  This *does not* have to be the same as the implementing type, although it typically *will* be.

    See: [`ScanSelfFromStr::scan_self_from`](trait.ScanSelfFromStr.html#method.scan_self_from).
    */
    type Output;

    /**
    Perform a scan on the given input.

    Implementations must return *either* the scanned value, and the number of bytes consumed from the input, *or* a reason why scanning failed.
    */
    fn scan_from<I: ScanInput<'a>>(s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    The default implementation returns `true`, which is almost *always* the correct answer.  You should only implement this explicitly (and return `false`) if you are implementing a scanner for which leading whitespace is important.
    */
    fn wants_leading_junk_stripped() -> bool { true }
}

/**
This is a convenience trait automatically implemented for all scanners which result in themselves (*i.e.* `ScanFromStr::Output = Self`).

This exists to aid type inference.
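
One way to read the inference benefit is that several scanners may share an output type, but at most one of them *is* that type.  The sketch below models this with hypothetical stand-ins (`SimpleScanFrom`, `SimpleScanSelf`, `YesNo`, `scan_value`); it is an interpretation under simplified assumptions, not a description of how the macros use this trait.

```rust
// Hypothetical, simplified stand-in for `ScanFromStr`.
trait SimpleScanFrom: Sized {
    type Output;
    fn scan_from(s: &str) -> Option<(Self::Output, usize)>;
}

// Hypothetical stand-in for `ScanSelfFromStr`, with the same blanket impl shape.
trait SimpleScanSelf: SimpleScanFrom<Output = Self> {
    fn scan_self_from(s: &str) -> Option<(Self, usize)> {
        Self::scan_from(s)
    }
}

impl<T> SimpleScanSelf for T where T: SimpleScanFrom<Output = T> {}

// Self scanner: `bool` scans into `bool`.
impl SimpleScanFrom for bool {
    type Output = bool;
    fn scan_from(s: &str) -> Option<(bool, usize)> {
        if s.starts_with("true") {
            Some((true, 4))
        } else if s.starts_with("false") {
            Some((false, 5))
        } else {
            None
        }
    }
}

// Abstract scanner that *also* outputs `bool`, but is not itself a `bool`.
struct YesNo;

impl SimpleScanFrom for YesNo {
    type Output = bool;
    fn scan_from(s: &str) -> Option<(bool, usize)> {
        if s.starts_with("yes") {
            Some((true, 3))
        } else if s.starts_with("no") {
            Some((false, 2))
        } else {
            None
        }
    }
}

// The scanner is chosen from the requested result type alone.
fn scan_value<T: SimpleScanSelf>(s: &str) -> Option<T> {
    T::scan_self_from(s).map(|(value, _)| value)
}
```

Here `let flag: Option<bool> = scan_value("true!");` needs no turbofish: the annotation fixes `T = bool`, and `YesNo` is never a candidate because it does not output itself.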

See: [`ScanFromStr`](trait.ScanFromStr.html).
*/
pub trait ScanSelfFromStr<'a>: ScanFromStr<'a, Output=Self> {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_self_from<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError> {
        Self::scan_from(s)
    }
}

impl<'a, T> ScanSelfFromStr<'a> for T where T: ScanFromStr<'a, Output=T> {}

/**
This trait defines scanning a type from a binary representation.

This should be implemented to match implementations of `std::fmt::Binary`.
*/
pub trait ScanFromBinary<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_binary<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from an octal representation.

This should be implemented to match implementations of `std::fmt::Octal`.
*/
pub trait ScanFromOctal<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_octal<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from a hexadecimal representation.

This should be implemented to match implementations of `std::fmt::LowerHex` and `std::fmt::UpperHex`.
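
As a sketch of what "matching" the formatting traits can mean, the following simplified model accepts the digits that `{:x}` and `{:X}` produce for a `u32`.  `SimpleScanFromHex` is a hypothetical stand-in that takes a plain `&str` and returns `Option`; it is not this crate's implementation.

```rust
// Hypothetical, simplified stand-in for `ScanFromHex`.
trait SimpleScanFromHex: Sized {
    fn scan_from_hex(s: &str) -> Option<(Self, usize)>;
}

impl SimpleScanFromHex for u32 {
    fn scan_from_hex(s: &str) -> Option<(u32, usize)> {
        // Take the leading run of hex digits, in either case, with no prefix.
        let end = s.find(|c: char| !c.is_ascii_hexdigit()).unwrap_or(s.len());
        // An empty run or an out-of-range value fails to parse and yields `None`.
        u32::from_str_radix(&s[..end], 16).ok().map(|v| (v, end))
    }
}
```

Under this model, the output of both `format!("{:x}", n)` and `format!("{:X}", n)` scans back to `n`.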
*/
pub trait ScanFromHex<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_hex<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines the interface for runtime scanners.

Runtime scanners must be created before they can be used, but this allows their behaviour to be modified at runtime.
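
The sketch below models a runtime scanner that wraps another one, keeps state across scans, and defers the leading-junk decision to the scanner it wraps.  It uses a hypothetical, simplified trait (`SimpleScan`) and invented scanners (`RestOfLine`, `Counted`), so treat it as an illustration of the shape rather than of this crate's types.

```rust
// Hypothetical, simplified stand-in for `ScanStr`.
trait SimpleScan {
    type Output;
    fn scan(&mut self, s: &str) -> Option<(Self::Output, usize)>;
    fn wants_leading_junk_stripped(&self) -> bool;
}

// Scans everything up to (not including) the next newline.
struct RestOfLine;

impl SimpleScan for RestOfLine {
    type Output = String;
    fn scan(&mut self, s: &str) -> Option<(String, usize)> {
        let end = s.find('\n').unwrap_or(s.len());
        Some((s[..end].to_string(), end))
    }
    // Leading whitespace is part of the line, so ask for it to be kept.
    fn wants_leading_junk_stripped(&self) -> bool {
        false
    }
}

// Wrapper that counts successful scans of the scanner it wraps.
struct Counted<T> {
    inner: T,
    hits: usize,
}

impl<T: SimpleScan> SimpleScan for Counted<T> {
    type Output = T::Output;
    fn scan(&mut self, s: &str) -> Option<(T::Output, usize)> {
        let result = self.inner.scan(s);
        if result.is_some() {
            self.hits += 1;
        }
        result
    }
    // The wrapper has no opinion of its own; defer to the next scanner.
    fn wants_leading_junk_stripped(&self) -> bool {
        self.inner.wants_leading_junk_stripped()
    }
}
```

Taking `&mut self` in `scan` is what lets a wrapper such as `Counted` update its own state while scanning.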
*/
pub trait ScanStr<'a>: Sized {
    /**
    The type that the implementation scans into.
    */
    type Output;

    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan<I: ScanInput<'a>>(&mut self, s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    There is no default implementation of this for runtime scanners, because almost all runtime scanners forward on to some *other* scanner, and it is *that* scanner that should typically decide what to do.

    Thus, in most cases, your implementation of this method should simply defer to the *next* scanner.

    See: [`ScanFromStr::wants_leading_junk_stripped`](trait.ScanFromStr.html#method.wants_leading_junk_stripped).
    */
    fn wants_leading_junk_stripped(&self) -> bool;
}
217    */
218    fn wants_leading_junk_stripped(&self) -> bool;
219}