pub struct ParserCore { /* private fields */ }
The parser core.
This struct provides all the basic parsing primitives used elsewhere. To use it, make
a decoder and pass it, along with a name for the source, to ParserCore::new().
In general you will probably prefer to use Parser instead, which will provide all
the functionality of the core, plus additional helper methods.
This struct exists to break a dependency cycle in the architecture.
Implementations
impl ParserCore
pub fn new<D: Decoder + 'static>(name: &str, decoder: D) -> Self
Create a new parser using the given decoder as the source of characters. A name is given
that will be used when creating Loc instances.
pub fn loc(&self) -> Loc
Get the current location in the parse. This will return either a console location (if the name is the empty string) or a file location (if the name is not the empty string).
pub fn get_column_number(&self) -> usize
Get the current one-based column number. This may be useful when parsing languages
in which indentation is significant, but otherwise you will probably prefer to use
Self::loc().
pub fn get_line_number(&self) -> usize
Get the current one-based line number. For most uses you will probably find
Self::loc() to be more useful.
pub fn replace_whitespace_test( &mut self, test: Box<dyn Fn(char) -> bool>, ) -> Box<dyn Fn(char) -> bool>
Define whitespace. This takes a closure that returns true for whitespace and false
otherwise. The prior whitespace test is returned.
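The swap-and-return behavior described here can be sketched without the library. The following is a minimal stand-in, not trivet's implementation: `WhitespaceHolder`, its `is_whitespace` helper, and the Unicode default are all assumptions made for illustration. Only the signature of `replace_whitespace_test` is taken from the text above.

```rust
use std::mem;

/// Hypothetical stand-in for the part of a parser that stores the whitespace test.
pub struct WhitespaceHolder {
    test: Box<dyn Fn(char) -> bool>,
}

impl WhitespaceHolder {
    /// Start with Unicode whitespace as the default (an assumption for this sketch).
    pub fn new() -> Self {
        WhitespaceHolder {
            test: Box::new(|ch: char| ch.is_whitespace()),
        }
    }

    /// Install a new test and hand back the one that was in effect before.
    pub fn replace_whitespace_test(
        &mut self,
        test: Box<dyn Fn(char) -> bool>,
    ) -> Box<dyn Fn(char) -> bool> {
        mem::replace(&mut self.test, test)
    }

    /// Apply whichever test is currently installed (hypothetical helper).
    pub fn is_whitespace(&self, ch: char) -> bool {
        (self.test)(ch)
    }
}
```

Because the prior test is returned, a caller can narrow the definition of whitespace for one region of input and then restore the earlier definition by passing the returned box back in.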
Examples found in repository:
pub fn main() {
    let mut parser = trivet::parse_from_stdin();
    parser.parse_comments = false;
    let numpar = parser.borrow_number_parser();
    numpar.settings.permit_binary = false;
    numpar.settings.permit_hexadecimal = false;
    numpar.settings.permit_octal = false;
    numpar.settings.permit_underscores = false;
    numpar.settings.decimal_only_floats = true;
    numpar.settings.permit_plus = false;
    numpar.settings.permit_leading_zero = false;
    numpar.settings.permit_empty_whole = false;
    numpar.settings.permit_empty_fraction = false;
    let strpar = parser.borrow_string_parser();
    strpar.set(trivet::strings::StringStandard::JSON);
    let _ = parser
        .borrow_core()
        .replace_whitespace_test(Box::new(|ch| [' ', '\n', '\r', '\t'].contains(&ch)));
    parser.consume_ws();
    let result = parse_value_ws(&mut parser);
    match result {
        Err(error) => {
            println!("ERROR: {}", error);
            std::process::exit(1);
        }
        Ok(json) => {
            // If there is any trailing stuff that is not whitespace, then this is not a valid
            // JSON file.
            if parser.is_at_eof() {
                // Print the JSON value.
                println!("{:?}", json);
                std::process::exit(0)
            } else {
                println!("Found unexpected trailing characters after JSON value.");
                std::process::exit(1);
            }
        }
    }
}
pub fn is_at_eof(&self) -> bool
Determine if the parser has reached the end of the stream. If this is true, then no further characters are available from this parser.
pub fn peek(&mut self) -> char
Peek at the next character in the stream. In order to be as fast as is reasonable,
no stream checking is done. If the stream is at the end, then you should get null
characters, but you should not rely on that, since the null is also a valid character
in a file. Instead, be sure to check Self::is_at_eof.
If this method is invoked too many times without any characters being consumed, then it
will panic to indicate that parsing has stalled. See PEEK_LIMIT.
pub fn consume(&mut self)
Consume the next character from the stream, if there is one. If not, then do nothing.
If this method is invoked too many times after reaching the end of file, then it will panic
to indicate that parsing has stalled. See EOF_LIMIT.
pub fn peek_offset(&mut self, n: usize) -> char
Peek at an offset in the stream. That is, peek at a character at a given position.
The position index is zero-based, with the next character to read (the result of
a simple Self::peek) being at index zero.
If there are not enough characters in the stream, then null (\0) is returned.
The distance is limited by the maximum lookahead; attempts to look past it will
also return a null.
Note the distinction between this method and Self::peek_n: the argument to peek_n is a
count, not an index. peek_n(1) returns a string holding the character at position zero,
so it corresponds to peek() and to peek_offset(0).
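To make the indexing concrete, here is a small self-contained model of the three lookahead calls. It is a sketch under stated assumptions rather than the library's code: `Lookahead` is a hypothetical type, the methods take `&self` for brevity, and the maximum-lookahead limit and stall detection described above are left out.

```rust
/// Hypothetical stand-in for a character stream; not trivet's implementation.
pub struct Lookahead {
    chars: Vec<char>,
    pos: usize,
}

impl Lookahead {
    pub fn new(text: &str) -> Self {
        Lookahead {
            chars: text.chars().collect(),
            pos: 0,
        }
    }

    /// Next character, or '\0' when nothing is left.
    pub fn peek(&self) -> char {
        self.peek_offset(0)
    }

    /// Character at zero-based offset `n` from the current position, or '\0'
    /// when the stream does not reach that far.
    pub fn peek_offset(&self, n: usize) -> char {
        self.pos
            .checked_add(n)
            .and_then(|index| self.chars.get(index))
            .copied()
            .unwrap_or('\0')
    }

    /// Up to `n` characters starting at the current position, as a string.
    pub fn peek_n(&self, n: usize) -> String {
        self.chars.iter().skip(self.pos).take(n).collect()
    }

    /// Advance by one character if any remain.
    pub fn consume(&mut self) {
        if self.pos < self.chars.len() {
            self.pos += 1;
        }
    }
}
```

In this model, `peek_offset(0)` and `peek()` return the same character, and `peek_n(1)` returns a one-character string holding it. Since a returned `'\0'` cannot be told apart from a real null in the input, an end-of-stream check is still needed.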
pub fn peek_n_vec(&mut self, n: usize) -> Vec<char>
Peek at characters in the stream. If there are fewer than n characters in the
stream, then fewer are returned. If the stream is exhausted, an empty vector is
returned.
If this method is invoked too many times without any characters being consumed, then it
will panic to indicate that parsing has stalled. See PEEK_LIMIT.
This method is similar to Self::peek_n, but does not construct a string for the
result, which can be better in some cases.
pub fn peek_n(&mut self, n: usize) -> String
Peek at characters in the stream. If there are fewer than n characters in the
stream, then fewer are returned. If the stream is exhausted, an empty string is
returned.
If this method is invoked too many times without any characters being consumed, then it
will panic to indicate that parsing has stalled. See PEEK_LIMIT.
pub fn consume_n(&mut self, n: usize)
Consume a given number of characters from the stream. The end of file is not checked during this. If there are no characters to consume, nothing is done.
If this method is invoked too many times after reaching the end of file, it will panic to
indicate that parsing has stalled. See EOF_LIMIT.
pub fn peek_chars(&mut self, chars: &[char]) -> bool
Check the next characters in the stream. If the next characters exactly match those given in the slice, in order, then true is returned. Otherwise false is returned. Nothing is consumed.
pub fn peek_and_consume(&mut self, ch: char) -> bool
Peek at the next character in the stream. If it is the given character, consume it and return true. Otherwise return false.
pub fn peek_and_consume_chars(&mut self, chars: &[char]) -> bool
Check the next characters in the stream and, if they match in order, consume them and return true. Otherwise return false.
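The all-or-nothing behavior of the match can be modeled in a few lines. This is an illustrative stand-in only: `Cursor` and `remaining` are hypothetical names, and the methods mirror the documented signatures loosely (for example, `peek_chars` takes `&self` here).

```rust
/// Hypothetical stand-in for a character stream; not trivet's implementation.
pub struct Cursor {
    chars: Vec<char>,
    pos: usize,
}

impl Cursor {
    pub fn new(text: &str) -> Self {
        Cursor {
            chars: text.chars().collect(),
            pos: 0,
        }
    }

    /// True if the upcoming characters match `chars` exactly, in order.
    /// Nothing is consumed.
    pub fn peek_chars(&self, chars: &[char]) -> bool {
        self.chars[self.pos..].starts_with(chars)
    }

    /// Consume `chars` only if all of them match; otherwise leave the
    /// position untouched.
    pub fn peek_and_consume_chars(&mut self, chars: &[char]) -> bool {
        if self.peek_chars(chars) {
            self.pos += chars.len();
            true
        } else {
            false
        }
    }

    /// Everything not yet consumed (hypothetical helper for inspection).
    pub fn remaining(&self) -> String {
        self.chars[self.pos..].iter().collect()
    }
}
```

A failed match leaves the position unchanged, which is what allows several alternatives to be tried in sequence, longest first.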
Examples found in repository:
fn main() {
    // Text to parse.  Note that the comment ends on line 5 at column 12, with
    // the first non-comment position at column 13.
    let mut parser = trivet::parse_from_string(
        r#"
        --[[
        I am a long form
        Lua comment.
        --]]"#,
    );
    parser.borrow_comment_parser().enable_c = false;
    parser.borrow_comment_parser().enable_cpp = false;
    parser.borrow_comment_parser().custom = Box::new(|parser: &mut trivet::ParserCore| -> bool {
        if parser.peek_and_consume_chars(&['-', '-', '[', '[']) {
            parser.take_until("--]]");
            true
        } else if parser.peek_and_consume_chars(&['-', '-']) {
            parser.take_while(|ch| ch != '\n');
            true
        } else {
            false
        }
    });
    parser.borrow_comment_parser().enable_custom = true;
    parser.consume_ws();
    assert_eq!(parser.loc().to_string(), "<string>:5:13");
    assert!(parser.is_at_eof());
}
pub fn consume_ws_only(&mut self) -> bool
Consume all whitespace starting at the current position. The definition of whitespace used here is the same as the Unicode standard.
At the time of writing, the following is the definition of whitespace used.
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
0085          ; White_Space # Cc       <control-0085>
00A0          ; White_Space # Zs       NO-BREAK SPACE
1680          ; White_Space # Zs       OGHAM SPACE MARK
2000..200A    ; White_Space # Zs  [11] EN QUAD..HAIR SPACE
2028          ; White_Space # Zl       LINE SEPARATOR
2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
202F          ; White_Space # Zs       NARROW NO-BREAK SPACE
205F          ; White_Space # Zs       MEDIUM MATHEMATICAL SPACE
3000          ; White_Space # Zs       IDEOGRAPHIC SPACE
pub fn take_until(&mut self, token: &str) -> String
Consume characters until an end token is found. The characters consumed are returned without the end token, though the end token is also consumed.
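The split between what is returned and what is consumed can be shown with a free function over a string slice. This is a sketch, not the library's method: the function below returns the unconsumed remainder as a second value so the effect is visible, it stops at the first occurrence of the token (the behavior documented for consume_until), and its handling of a missing token (take everything) is purely an assumption, since the text above does not say what happens in that case.

```rust
/// Illustrative model: returns (text before the token, text left after the token).
pub fn take_until<'a>(input: &'a str, token: &str) -> (&'a str, &'a str) {
    match input.find(token) {
        // The token itself is consumed but is not part of the returned text.
        Some(index) => (&input[..index], &input[index + token.len()..]),
        // Assumption for this sketch: with no token present, everything is taken.
        None => (input, ""),
    }
}
```

For example, applying this to `" long comment --]] x --]] y"` with the token `"--]]"` yields `" long comment "` as the taken text and `" x --]] y"` as the remainder.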
Examples found in repository: see the Lua comment example under peek_and_consume_chars above.
pub fn take_while<T: Fn(char) -> bool>(&mut self, include: T) -> String
Consume characters so long as the test is true. Return the characters consumed, if any.
Examples found in repository: see the Lua comment example under peek_and_consume_chars above.
pub fn take_while_unless<T: Fn(char) -> bool, U: Fn(char) -> bool>( &mut self, include: T, exclude: U, ) -> String
Consume characters so long as either test is true. Return only those characters that satisfy the first test. The exclude predicate is checked first.
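One reading of this rule is sketched below as a free function over a string slice. It is not the library's code: the remainder is returned as a second value only to show where consumption stops, and the handling of a character that satisfies both predicates (dropped, because exclude is checked first) is an interpretation of the description above.

```rust
/// Illustrative model: returns (characters kept, text left unconsumed).
pub fn take_while_unless<T, U>(input: &str, include: T, exclude: U) -> (String, &str)
where
    T: Fn(char) -> bool,
    U: Fn(char) -> bool,
{
    let mut kept = String::new();
    for (index, ch) in input.char_indices() {
        if exclude(ch) {
            // Checked first: consumed, but not part of the result.
            continue;
        }
        if include(ch) {
            // Consumed and kept.
            kept.push(ch);
            continue;
        }
        // Neither test holds, so stop here without consuming this character.
        return (kept, &input[index..]);
    }
    (kept, "")
}
```

A typical use is reading digits while discarding separators: with `include` matching ASCII digits and `exclude` matching `'_'`, the input `"1_000_000 units"` gives `"1000000"` and leaves `" units"`.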
pub fn take<S, K>(&mut self, skip: S, stop: K) -> (Vec<char>, Option<char>)
Consume and return characters. This works as follows.
If the current character satisfies skip, then the character is skipped.
If the current character satisfies stop, then the parse is stopped and the result is returned,
regardless of whether any other predicates match.
Other characters (those that do not match skip or stop) are collected and returned.
Note that skip is checked first, then stop. This means the following code works as expected.
use trivet::parse_from_string;
let mut parser = parse_from_string("12_232.14");
assert_eq!(parser.take(
    |ch| ch == '_',
    |ch| ch != '.' && !ch.is_alphanumeric()
), ("12232.14".chars().collect(), None));

Also note that the following code will ignore the stop setting since it is never reached during checking.

use trivet::parse_from_string;
let mut parser = parse_from_string("12_232.14");
assert_eq!(parser.take(
    |ch| ch == '_',
    |ch| ch == '_'
), ("12232.14".chars().collect(), None));

The returned pair contains all matched characters and the character that caused the stop, or
None if parsing stopped because the end of stream was reached. Note that the character that
caused the stop is not consumed.
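Both examples above end at the end of the stream. The case where stop actually fires can be illustrated with a standalone model of the same rules. This is a sketch rather than trivet's implementation: it works on a string slice and returns the unconsumed remainder as a third value, which the real method does not do, so that the "stop character is not consumed" rule is visible.

```rust
/// Illustrative model: returns (collected characters, stop character if any,
/// text left unconsumed).
pub fn take<S, K>(input: &str, skip: S, stop: K) -> (Vec<char>, Option<char>, &str)
where
    S: Fn(char) -> bool,
    K: Fn(char) -> bool,
{
    let mut collected = Vec::new();
    for (index, ch) in input.char_indices() {
        if skip(ch) {
            // Checked first: consumed and dropped.
            continue;
        }
        if stop(ch) {
            // Reported to the caller, but left in the input.
            return (collected, Some(ch), &input[index..]);
        }
        collected.push(ch);
    }
    // End of input reached without meeting a stop character.
    (collected, None, "")
}
```

With `skip` matching `'_'` and `stop` matching `'.'`, the input `"12_232.14"` yields the characters of `"12232"`, the stop character `Some('.')`, and the remainder `".14"`, which still begins with the stop character.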
pub fn consume_while<T: Fn(char) -> bool>(&mut self, include: T) -> bool
Consume characters so long as the test is true. Returns true if any characters are consumed.
pub fn consume_until(&mut self, token: &str) -> bool
Consume characters until the given end token is found. Returns true if any characters are consumed. The end token is also consumed. This stops at the first occurrence of the end token; that is, it is not greedy.