/*
Copyright ⓒ 2016 Daniel Keep.

Licensed under the MIT license (see LICENSE or <http://opensource.org
/licenses/MIT>) or the Apache License, Version 2.0 (see LICENSE of
<http://www.apache.org/licenses/LICENSE-2.0>), at your option. All
files in the project carrying such notice may not be copied, modified,
or distributed except according to those terms.
*/
/*!
This module defines various scanners that can be used to extract values from input text.

## Kinds of Scanner

Scanners can be classified as "static self scanners", "static abstract scanners", and "runtime abstract scanners".

* "Static self scanners" are types which implement the `ScanFromStr` trait and output an instance of themselves. For example, if you scan using the `i32` type, you get an `i32` result. These are implemented for types which have an obvious "default" scanning syntax.

    As a consequence of outputting an instance of themselves, they *also* automatically implement the `ScanSelfFromStr` trait.

* "Static abstract scanners" are types which implement the `ScanFromStr` trait and output an instance of *some other* type. For example, if you scan using the `Word` type, you get a `&str` or `String` result. These are implemented for cases where different rules are desirable, such as scanning particular *subsets* of a type (see `Word`, `Number`, `NonSpace`), or non-default encodings (see `Binary`, `Octal`, `Hex`).

* "Runtime abstract scanners" implement the `ScanStr` trait and serve the same overall function as static abstract scanners, except that the scanner *itself* must be constructed. In other words, static scanners are types, whereas runtime scanners are *values*. This makes them a little less straightforward to use, but they are *significantly* more flexible. They can be parameterised at runtime, to perform arbitrary manipulations of both the text input and scanned values (see `max_width`, `re_str`).

## Bridging Between Static and Runtime Scanners

A scanner of interest is `ScanA<Type>`. This is a runtime scanner which takes a *static* scanner as a type parameter. This allows you to use a static scanner in a context where a runtime scanner is needed.

For example, these two bindings are equivalent in terms of behaviour:

```ignore
    // Scan a u32.
    let _: u32
    let _ <| scan_a::<u32>()
```
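
How such a bridge can work is sketched below using simplified stand-ins. `StaticScan`, `RuntimeScan`, `Bridge`, and `bridge` are hypothetical names invented for this illustration, and they take a plain `&str` and return `String` errors rather than using `ScanInput` and `ScanError`. This is not the actual implementation of `ScanA`, only a minimal model of the idea that a zero-sized value can forward to a type-level scanner.

```rust
use std::marker::PhantomData;

// Simplified stand-in for a static scanner: the *type* does the scanning.
trait StaticScan: Sized {
    type Output;
    // Returns the scanned value and the number of bytes consumed.
    fn scan_from(s: &str) -> Result<(Self::Output, usize), String>;
}

// Simplified stand-in for a runtime scanner: a *value* does the scanning.
trait RuntimeScan {
    type Output;
    fn scan(&mut self, s: &str) -> Result<(Self::Output, usize), String>;
}

impl StaticScan for u32 {
    type Output = u32;
    fn scan_from(s: &str) -> Result<(u32, usize), String> {
        // Consume the leading run of ASCII digits.
        let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
        if end == 0 {
            return Err("expected digits".to_string());
        }
        s[..end].parse::<u32>().map(|v| (v, end)).map_err(|e| e.to_string())
    }
}

// Hypothetical bridge: a zero-sized runtime scanner that forwards to a static one.
struct Bridge<S>(PhantomData<S>);

fn bridge<S: StaticScan>() -> Bridge<S> {
    Bridge(PhantomData)
}

impl<S: StaticScan> RuntimeScan for Bridge<S> {
    type Output = S::Output;
    fn scan(&mut self, s: &str) -> Result<(S::Output, usize), String> {
        S::scan_from(s)
    }
}

fn main() {
    // Scanning through the type and through the bridging value gives the same result.
    let direct = <u32 as StaticScan>::scan_from("42 apples");
    let bridged = bridge::<u32>().scan("42 apples");
    assert_eq!(direct, bridged);
}
```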

## Creating Runtime Scanners

Runtime scanners are typically constructed using functions, rather than dealing with the implementing type itself. For example, to get an instance of the `ExactWidth` runtime scanner, you would call either the `exact_width` or `exact_width_a` functions.

The reason for two functions is that most runtime scanners accept a *second* runtime scanner for the purposes of chaining. This allows several transformations to be applied outside-in. For example, you can combine runtime scanners together like so:

```ignore
    // Scan a word of between 2 and 5 bytes.
    let _ <| min_width(2, max_width(5, scan_a::<Word>()))
```

Functions ending in `_a` are a shorthand for the common case of wrapping a runtime scanner around a static scanner. For example, the following two patterns are equivalent:

```ignore
    // Scan a u32 that has, at most, four digits.
    let _ <| max_width(4, scan_a::<u32>())
    let _ <| max_width_a::<u32>(4)
```
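
The outside-in chaining described in this section can be modelled with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `RuntimeScan`, `Digits`, `MaxWidth`, and `MinWidth` are invented for illustration, work on `&str` with `String` errors, and only approximate what the real `max_width` and `min_width` scanners do (the exact truncation and error rules here are assumptions).

```rust
use std::cmp;

// Simplified stand-in for a runtime scanner (hypothetical; takes `&str`, not `ScanInput`).
trait RuntimeScan {
    type Output;
    fn scan(&mut self, s: &str) -> Result<(Self::Output, usize), String>;
}

// Hypothetical leaf scanner: reads a run of ASCII digits as a `u32`.
struct Digits;

impl RuntimeScan for Digits {
    type Output = u32;
    fn scan(&mut self, s: &str) -> Result<(u32, usize), String> {
        let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
        if end == 0 {
            return Err("expected digits".to_string());
        }
        s[..end].parse::<u32>().map(|v| (v, end)).map_err(|e| e.to_string())
    }
}

// Wrapper: shows the inner scanner at most `self.0` bytes of input.
struct MaxWidth<Then>(usize, Then);

impl<Then: RuntimeScan> RuntimeScan for MaxWidth<Then> {
    type Output = Then::Output;
    fn scan(&mut self, s: &str) -> Result<(Then::Output, usize), String> {
        let mut end = cmp::min(self.0, s.len());
        // Back off to a character boundary so the slice is valid UTF-8.
        while !s.is_char_boundary(end) {
            end -= 1;
        }
        self.1.scan(&s[..end])
    }
}

// Wrapper: rejects a scan that consumed fewer than `self.0` bytes.
struct MinWidth<Then>(usize, Then);

impl<Then: RuntimeScan> RuntimeScan for MinWidth<Then> {
    type Output = Then::Output;
    fn scan(&mut self, s: &str) -> Result<(Then::Output, usize), String> {
        let (value, consumed) = self.1.scan(s)?;
        if consumed < self.0 {
            return Err(format!("expected at least {} bytes", self.0));
        }
        Ok((value, consumed))
    }
}

fn main() {
    // Mirrors the shape of `min_width(2, max_width(5, ...))`.
    let mut scanner = MinWidth(2, MaxWidth(5, Digits));
    assert_eq!(scanner.scan("1234567"), Ok((12345, 5)));
    assert!(scanner.scan("1x").is_err());
}
```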
*/
/*
It is also where implementations for existing standard and external types are kept, though these do not appear in the documentation.
*/
pub use self::misc::{
    Everything, HorSpace, Newline, NonSpace, Space,
    Ident, Line, Number, Word, Wordish,
    Inferred, KeyValuePair, QuotedString,
    Binary, Octal, Hex,
};

#[doc(inline)] pub use self::runtime::{
    exact_width, exact_width_a,
    max_width, max_width_a,
    min_width, min_width_a,
    scan_a,
};

#[cfg(feature="regex")]
#[doc(inline)]
pub use self::runtime::{re, re_a, re_str};

#[cfg(feature="nightly-pattern")]
#[doc(inline)]
pub use self::runtime::{until_pat, until_pat_a, until_pat_str};

#[macro_use] mod macros;

pub mod runtime;
pub mod std;

mod lang;
mod misc;

use ::ScanError;
use ::input::ScanInput;

/**
This trait defines the interface to a type which can be scanned.

The exact syntax scanned is entirely arbitrary, though there are some rules of thumb that implementations should *generally* stick to:

* Do not ignore leading whitespace.
* Do not eagerly consume trailing whitespace, unless it is legitimately part of the scanned syntax.

In addition, if you are implementing scanning directly for the result type (*i.e.* `Output = Self`), prefer parsing *only* the result of the type's `Debug` implementation. This ensures that there is a degree of round-tripping between `format!` and `scan!`.

If a type has multiple legitimate parsing forms, consider defining those alternate forms on abstract scanner types (*i.e.* `Output != Self`) instead.
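
As a rough illustration of these guidelines, the sketch below scans a hypothetical `Meters` type from its `Debug` form. `SimpleScanFromStr` is a simplified stand-in invented for this example (it takes `&str` and returns `String` errors), not the trait defined here, so treat it as a model of the rules of thumb rather than a template to copy verbatim.

```rust
// Simplified stand-in for `ScanFromStr` (hypothetical: `&str` input, `String` errors).
trait SimpleScanFromStr: Sized {
    type Output;
    fn scan_from(s: &str) -> Result<(Self::Output, usize), String>;
}

#[derive(Debug, PartialEq)]
struct Meters(u32);

impl SimpleScanFromStr for Meters {
    type Output = Self;
    fn scan_from(s: &str) -> Result<(Self, usize), String> {
        // Accept exactly the `Debug` form, e.g. `Meters(12)`.
        // Leading whitespace is not skipped, so it is reported as an error.
        let prefix = "Meters(";
        if !s.starts_with(prefix) {
            return Err("expected `Meters(`".to_string());
        }
        let rest = &s[prefix.len()..];
        let digits = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
        if digits == 0 || !rest[digits..].starts_with(')') {
            return Err("expected digits followed by `)`".to_string());
        }
        let value = rest[..digits].parse::<u32>().map_err(|e| e.to_string())?;
        // Report the bytes consumed; anything after `)` is left for the caller.
        Ok((Meters(value), prefix.len() + digits + 1))
    }
}

fn main() {
    // Round trip: format with `Debug`, then scan the same text back.
    let text = format!("{:?} tall", Meters(12));
    assert_eq!(Meters::scan_from(&text), Ok((Meters(12), 10)));
    assert!(Meters::scan_from(" Meters(12)").is_err());
}
```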

See: [`ScanSelfFromStr`](trait.ScanSelfFromStr.html).
*/
pub trait ScanFromStr<'a>: Sized {
    /**
    The type that the implementation scans into. This *does not* have to be the same as the implementing type, although it typically *will* be.

    See: [`ScanSelfFromStr::scan_self_from`](trait.ScanSelfFromStr.html#method.scan_self_from).
    */
    type Output;

    /**
    Perform a scan on the given input.

    Implementations must return *either* the scanned value, and the number of bytes consumed from the input, *or* a reason why scanning failed.
    */
    fn scan_from<I: ScanInput<'a>>(s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    The default implementation returns `true`, which is almost *always* the correct answer. You should only implement this explicitly (and return `false`) if you are implementing a scanner for which leading whitespace is important.
    */
    fn wants_leading_junk_stripped() -> bool { true }
}

/**
This is a convenience trait automatically implemented for all scanners which result in themselves (*i.e.* `ScanFromStr::Output = Self`).

This exists to aid type inference.

See: [`ScanFromStr`](trait.ScanFromStr.html).
*/
pub trait ScanSelfFromStr<'a>: ScanFromStr<'a, Output=Self> {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_self_from<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError> {
        Self::scan_from(s)
    }
}

impl<'a, T> ScanSelfFromStr<'a> for T where T: ScanFromStr<'a, Output=T> {}

/**
This trait defines scanning a type from a binary representation.

This should be implemented to match implementations of `std::fmt::Binary`.
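
To show what matching `std::fmt::Binary` can mean in practice, here is a minimal round-trip sketch. `SimpleScanFromBinary` is a hypothetical stand-in that takes `&str` and returns `String` errors, and the choice to accept only bare binary digits (no `0b` prefix, as produced by a plain `{:b}`) is an assumption of this example.

```rust
// Simplified stand-in for `ScanFromBinary` (hypothetical: `&str` input, `String` errors).
trait SimpleScanFromBinary: Sized {
    fn scan_from_binary(s: &str) -> Result<(Self, usize), String>;
}

impl SimpleScanFromBinary for u8 {
    fn scan_from_binary(s: &str) -> Result<(u8, usize), String> {
        // Consume the leading run of binary digits, with no `0b` prefix.
        let end = s.find(|c: char| c != '0' && c != '1').unwrap_or(s.len());
        if end == 0 {
            return Err("expected binary digits".to_string());
        }
        u8::from_str_radix(&s[..end], 2)
            .map(|v| (v, end))
            .map_err(|e| e.to_string())
    }
}

fn main() {
    // Format with `{:b}`, then scan the same text back.
    let text = format!("{:b}", 5u8);
    assert_eq!(text, "101");
    assert_eq!(u8::scan_from_binary(&text), Ok((5, 3)));
}
```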
*/
pub trait ScanFromBinary<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_binary<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from an octal representation.

This should be implemented to match implementations of `std::fmt::Octal`.
*/
pub trait ScanFromOctal<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_octal<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from a hexadecimal representation.

This should be implemented to match implementations of `std::fmt::LowerHex` and `std::fmt::UpperHex`.
*/
pub trait ScanFromHex<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_hex<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines the interface for runtime scanners.

Runtime scanners must be created before they can be used, but this allows their behaviour to be modified at runtime.
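
The general shape of a runtime scanner that wraps another one can be sketched as follows. `SimpleScanStr`, `Raw`, and `Counted` are hypothetical names invented for this illustration, using `&str` input and `String` errors in place of `ScanInput` and `ScanError`. The wrapper keeps mutable state, which is one thing a `&mut self` receiver permits, and it defers the leading-junk decision to the scanner it wraps.

```rust
// Simplified stand-in for `ScanStr` (hypothetical: `&str` input, `String` errors).
trait SimpleScanStr {
    type Output;
    fn scan(&mut self, s: &str) -> Result<(Self::Output, usize), String>;
    fn wants_leading_junk_stripped(&self) -> bool;
}

// Hypothetical leaf scanner: takes the whole input, leading whitespace included.
struct Raw;

impl SimpleScanStr for Raw {
    type Output = String;
    fn scan(&mut self, s: &str) -> Result<(String, usize), String> {
        Ok((s.to_string(), s.len()))
    }
    // Whitespace is significant here, so ask for it to be left alone.
    fn wants_leading_junk_stripped(&self) -> bool { false }
}

// Hypothetical wrapper built at runtime: counts how many scans it has forwarded.
struct Counted<Then> {
    uses: usize,
    then: Then,
}

impl<Then: SimpleScanStr> SimpleScanStr for Counted<Then> {
    type Output = Then::Output;
    fn scan(&mut self, s: &str) -> Result<(Then::Output, usize), String> {
        self.uses += 1;
        self.then.scan(s)
    }
    // Defer to the next scanner, which knows whether whitespace matters.
    fn wants_leading_junk_stripped(&self) -> bool {
        self.then.wants_leading_junk_stripped()
    }
}

fn main() {
    let mut scanner = Counted { uses: 0, then: Raw };
    assert!(!scanner.wants_leading_junk_stripped());
    assert_eq!(scanner.scan("  hi"), Ok(("  hi".to_string(), 4)));
    assert_eq!(scanner.uses, 1);
}
```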
*/
pub trait ScanStr<'a>: Sized {
    /**
    The type that the implementation scans into.
    */
    type Output;

    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan<I: ScanInput<'a>>(&mut self, s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    There is no default implementation of this for runtime scanners, because almost all runtime scanners forward on to some *other* scanner, and it is *that* scanner that should typically decide what to do.

    Thus, in most cases, your implementation of this method should simply defer to the *next* scanner.

    See: [`ScanFromStr::wants_leading_junk_stripped`](trait.ScanFromStr.html#method.wants_leading_junk_stripped).
    */
    fn wants_leading_junk_stripped(&self) -> bool;
}