/*
Copyright ⓒ 2016 Daniel Keep.

Licensed under the MIT license (see LICENSE or <http://opensource.org/licenses/MIT>) or the Apache License, Version 2.0 (see LICENSE or <http://www.apache.org/licenses/LICENSE-2.0>), at your option. All files in the project carrying such notice may not be copied, modified, or distributed except according to those terms.
*/
/*!
This module defines various scanners that can be used to extract values from input text.

## Kinds of Scanner

Scanners can be classified as "static self scanners", "static abstract scanners", and "runtime abstract scanners".

* "Static self scanners" are types which implement the `ScanFromStr` trait and output an instance of themselves. For example, if you scan using the `i32` type, you get an `i32` result. These are implemented for types which have an obvious "default" scanning syntax. As a consequence of outputting an instance of themselves, they *also* automatically implement the `ScanSelfFromStr` trait.

* "Static abstract scanners" are types which implement the `ScanFromStr` trait and output an instance of *some other* type. For example, if you scan using the `Word` type, you get a `&str` or `String` result. These are implemented for cases where different rules are desirable, such as scanning particular *subsets* of a type (see `Word`, `Number`, `NonSpace`), or non-default encodings (see `Binary`, `Octal`, `Hex`).

* "Runtime abstract scanners" implement the `ScanStr` trait and serve the same overall function as static abstract scanners, except that the scanner *itself* must be constructed. In other words, static scanners are types, runtime scanners are *values*. This makes them a little less straightforward to use, but they are *significantly* more flexible. They can be parameterised at runtime, to perform arbitrary manipulations of both the text input and scanned values (see `max_width`, `re_str`).

## Bridging Between Static and Runtime Scanners

A scanner of interest is `ScanA<Type>`.
This is a runtime scanner which takes a *static* scanner as a type parameter. This allows you to use a static scanner in a context where a runtime scanner is needed.

For example, these two bindings are equivalent in terms of behaviour:

```ignore
// Scan a u32.
let _: u32
let _ <| scan_a::<u32>()
```

## Creating Runtime Scanners

Runtime scanners are typically constructed using functions, rather than dealing with the implementing type itself. For example, to get an instance of the `ExactWidth` runtime scanner, you would call either the `exact_width` or `exact_width_a` functions.

The reason for two functions is that most runtime scanners accept a *second* runtime scanner for the purposes of chaining. This allows several transformations to be applied outside-in. For example, you can combine runtime scanners together like so:

```ignore
// Scan a word of between 2 and 5 bytes.
let _ <| min_width(2, max_width(5, scan_a::<Word>()))
```

Functions ending in `_a` are a shorthand for the common case of wrapping a runtime scanner around a static scanner. For example, the following two patterns are equivalent:

```ignore
// Scan a u32 that has, at most, four digits.
let _ <| max_width(4, scan_a::<u32>())
let _ <| max_width_a::<u32>(4)
```
*/
/*
It is also where implementations for existing standard and external types are kept, though these do not appear in the documentation.
*/
pub use self::misc::{
    Everything, HorSpace, Newline, NonSpace, Space,
    Ident, Line, Number, Word, Wordish,
    Inferred, KeyValuePair, QuotedString,
    Binary, Octal, Hex,
};

#[doc(inline)]
pub use self::runtime::{
    exact_width, exact_width_a,
    max_width, max_width_a,
    min_width, min_width_a,
    scan_a,
};

#[cfg(feature="regex")]
#[doc(inline)]
pub use self::runtime::{re, re_a, re_str};

#[cfg(feature="nightly-pattern")]
#[doc(inline)]
pub use self::runtime::{until_pat, until_pat_a, until_pat_str};

#[macro_use] mod macros;

pub mod runtime;
pub mod std;

mod lang;
mod misc;

use ::ScanError;
use ::input::ScanInput;

/**
This trait defines the interface to a type which can be scanned.

The exact syntax scanned is entirely arbitrary, though there are some rules of thumb that implementations should *generally* stick to:

* Do not ignore leading whitespace.
* Do not eagerly consume trailing whitespace, unless it is legitimately part of the scanned syntax.

In addition, if you are implementing scanning directly for the result type (*i.e.* `Output = Self`), prefer parsing *only* the result of the type's `Debug` implementation. This ensures that there is a degree of round-tripping between `format!` and `scan!`.

If a type has multiple legitimate parsing forms, consider defining those alternate forms on abstract scanner types (*i.e.* `Output != Self`) instead.

See: [`ScanSelfFromStr`](trait.ScanSelfFromStr.html).
*/
pub trait ScanFromStr<'a>: Sized {
    /**
    The type that the implementation scans into. This *does not* have to be the same as the implementing type, although it typically *will* be.

    See: [`ScanSelfFromStr::scan_self_from`](trait.ScanSelfFromStr.html#method.scan_self_from).
    */
    type Output;

    /**
    Perform a scan on the given input.

    Implementations must return *either* the scanned value, and the number of bytes consumed from the input, *or* a reason why scanning failed.
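
    As a rough illustration of this contract only, the sketch below scans a run of leading ASCII digits. It assumes a plain `&str` in place of `ScanInput` and a hypothetical `SketchError` in place of `ScanError`; neither `scan_digits` nor `SketchError` is part of this crate.

    ```rust
    // Hypothetical stand-in for `ScanError`; the real type carries more detail.
    #[derive(Debug, PartialEq)]
    struct SketchError(&'static str);

    // Scans leading ASCII digits as a `u32`, returning the value together
    // with the number of bytes consumed, or a reason why scanning failed.
    fn scan_digits(s: &str) -> Result<(u32, usize), SketchError> {
        // ASCII digits are one byte each, so this count is also a byte offset.
        let end = s.bytes().take_while(|b| b.is_ascii_digit()).count();
        if end == 0 {
            return Err(SketchError("expected at least one digit"));
        }
        // Overflow is a scan failure too, so it is reported rather than hidden.
        let value = s[..end]
            .parse::<u32>()
            .map_err(|_| SketchError("number out of range for u32"))?;
        Ok((value, end))
    }
    ```

    Under these assumptions, `scan_digits("42 apples")` yields the value `42` with `2` bytes consumed, while `scan_digits(" 42")` fails, because the sketch follows the rule of thumb above and does not skip leading whitespace itself.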
    */
    fn scan_from<I: ScanInput<'a>>(s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    The default implementation returns `true`, which is almost *always* the correct answer. You should only implement this explicitly (and return `false`) if you are implementing a scanner for which leading whitespace is important.
    */
    fn wants_leading_junk_stripped() -> bool { true }
}

/**
This is a convenience trait automatically implemented for all scanners which result in themselves (*i.e.* `ScanFromStr::Output = Self`).

This exists to aid type inference.

See: [`ScanFromStr`](trait.ScanFromStr.html).
*/
pub trait ScanSelfFromStr<'a>: ScanFromStr<'a, Output=Self> {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_self_from<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError> {
        Self::scan_from(s)
    }
}

impl<'a, T> ScanSelfFromStr<'a> for T where T: ScanFromStr<'a, Output=T> {}

/**
This trait defines scanning a type from a binary representation.
This should be implemented to match implementations of `std::fmt::Binary`.
*/
pub trait ScanFromBinary<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_binary<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from an octal representation.
This should be implemented to match implementations of `std::fmt::Octal`.
*/
pub trait ScanFromOctal<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_octal<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines scanning a type from a hexadecimal representation.
This should be implemented to match implementations of `std::fmt::LowerHex` and `std::fmt::UpperHex`.
*/
pub trait ScanFromHex<'a>: Sized {
    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan_from_hex<I: ScanInput<'a>>(s: I) -> Result<(Self, usize), ScanError>;
}

/**
This trait defines the interface for runtime scanners.

Runtime scanners must be created before they can be used, but this allows their behaviour to be modified at runtime.
*/
pub trait ScanStr<'a>: Sized {
    /**
    The type that the implementation scans into.
    */
    type Output;

    /**
    Perform a scan on the given input.

    See: [`ScanFromStr::scan_from`](trait.ScanFromStr.html#tymethod.scan_from).
    */
    fn scan<I: ScanInput<'a>>(&mut self, s: I) -> Result<(Self::Output, usize), ScanError>;

    /**
    Indicates whether or not the scanner wants its input to have leading "junk", such as whitespace, stripped.

    There is no default implementation of this for runtime scanners, because almost all runtime scanners forward on to some *other* scanner, and it is *that* scanner that should typically decide what to do.

    Thus, in most cases, your implementation of this method should simply defer to the *next* scanner.

    See: [`ScanFromStr::wants_leading_junk_stripped`](trait.ScanFromStr.html#method.wants_leading_junk_stripped).
    */
    fn wants_leading_junk_stripped(&self) -> bool;
}