This crate provides some macros for quickly parsing values out of text. Roughly speaking, it does the inverse of the `print!`/`format!` macros; or, in other words, a similar job to `scanf` from C.
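To make the "inverse of `format!`" idea concrete, here is a minimal sketch that uses only the standard library and does by hand what a scanning macro automates. `parse_greeting` is a hypothetical helper invented for this illustration; it is not part of this crate.

```rust
// Standard-library-only sketch: a hand-written "inverse of format!".
// `parse_greeting` is a hypothetical helper, not part of scan-rules.
fn parse_greeting(text: &str) -> Option<(String, u32)> {
    // Expect input shaped like the output of format!("{} is {}", name, age).
    let mut parts = text.split_whitespace();
    let name = parts.next()?;
    if parts.next()? != "is" {
        return None;
    }
    let age: u32 = parts.next()?.parse().ok()?;
    // Reject trailing input so the whole text has to be consumed.
    if parts.next().is_some() {
        return None;
    }
    Some((name.to_string(), age))
}

fn main() {
    // Format a value, then recover the original pieces from the text.
    let text = format!("{} is {}", "Ada", 36);
    println!("{:?}", parse_greeting(&text));
    // Input that does not fit the expected shape is rejected.
    println!("{:?}", parse_greeting("Ada is old"));
}
```

Every extra field or alternative syntax adds more of this bookkeeping, which is the part the macros in this crate are meant to take over.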
The macros of interest are:
- `readln!` - reads and scans a line from standard input.
- `try_readln!` - like `readln!`, except it returns a `Result` instead of panicking.
- `scan!` - scans the provided string.
Plus two convenience macros:
- `let_scan!` - scans a string and binds captured values directly to local variables. Only supports one pattern and panics if it doesn’t match.
- `let_readln!` - reads and scans a line from standard input, binding captured values directly to local variables. Only supports one pattern and panics if it doesn’t match.
If you are interested in implementing support for your own types, see the `ScanFromStr` and `ScanStr` traits.
The provided scanners can be found in the `scanner` module.
§Compatibility
`scan-rules` is compatible with `rustc` version 1.6.0 and higher.
- Due to a breaking change, `scan-rules` is not compatible with `regex` version 0.1.66 or higher.
- `rustc` < 1.10 will not have the `let_readln!` macro.
- `rustc` < 1.7 will have only concrete implementations of `ScanFromStr` for the `Everything`, `Ident`, `Line`, `NonSpace`, `Number`, `Word`, and `Wordish` scanners for `&str` and `String` output types. 1.7 and higher will have generic implementations for all output types such that `&str: Into<Output>`.
- `rustc` < 1.6 is explicitly not supported, due to breaking changes in Rust itself.
§Features
The following optional features are available:
- `arrays-32`: implement scanning for arrays of up to 32 elements. The default is up to 8 elements.
- `duration-iso8601-dates`: support scanning ISO 8601 durations with date components.
- `regex`: include support for the `re`, `re_a`, and `re_str` regular expression-based runtime scanners. Adds a dependency on the `regex` crate.
- `tuples-16`: implement scanning for tuples of up to 16 elements. The default is up to 4 elements.
- `unicode-normalization`: include support for `Normalized` and `IgnoreCaseNormalized` cursor types. Adds a dependency on the `unicode-normalization` crate.
The following are only supported on nightly compilers, and may disappear/change at any time:
- `nightly-pattern`: adds the `until_pat`, `until_pat_a`, and `until_pat_str` runtime scanners using `Pattern`s.
§Important Notes
- There are no default scanners for `&str` or `String`; if you want a string, you should pick an appropriate abstract scanner from the `scanner` module.
- The macros in this crate are extremely complex. Moderately complex usage can exhaust the standard macro recursion limit. If this happens, you can raise the limit (from its default of 64) by adding the following attribute to your crate’s root module:
  #![recursion_limit="128"]
§Quick Examples
Here is a simple CLI program that asks the user their name and age. You can run this using `cargo run --example ask_age`.
#[macro_use] extern crate scan_rules;
use scan_rules::scanner::Word;
fn main() {
    print!("What's your name? ");
    let name: String = readln! { (let name: Word<String>) => name };
    //                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ rule
    //                                                       ^~~^ body
    //                           ^~~~~~~~~~~~~~~~~~~~~~^ pattern
    //                            ^~~~~~~~~~~~~~~~~~~~^ variable binding
    print!("Hi, {}. How old are you? ", name);
    readln! {
        (let age) => {
    //   ^~~~~^ implicitly typed variable binding
            let age: i32 = age;
            println!("{} years old, huh? Neat.", age);
        },
        (..other) => println!("`{}` doesn't *look* like a number...", other),
    //   ^~~~~^ bind to any input "left over"
    }
    print!("Ok. What... is your favourite colour? (R, G, B): ");
    let_readln!(let r: f32, ",", let g: f32, ",", let b: f32);
    //          ^~~~^            ^~~~^            ^~~~^
    // Scans and binds three variables without nesting scope.
    // Panics if *anything* goes wrong.
    if !(g < r && g < b && b >= r * 0.25 && b <= r * 0.75) {
        println!("Purple's better.");
    } else {
        println!("Good choice!");
    }
}
This example shows how to parse one of several different syntaxes. You can run this using `cargo run --example scan_data`.
#[macro_use] extern crate scan_rules;
use std::collections::BTreeSet;
// `Word` is an "abstract" scanner; rather than scanning itself, it scans some
// *other* type using custom rules. In this case, it scans a word into a
// string slice. You can use `Word<String>` to get an owned string.
use scan_rules::scanner::Word;
#[derive(Debug)]
enum Data {
    Vector(i32, i32, i32),
    Truthy(bool),
    Words(Vec<String>),
    Lucky(BTreeSet<i32>),
    Other(String),
}
fn main() {
    print!("Enter some data: ");
    let data = readln! {
        ("<", let x, ",", let y, ",", let z, ">") => Data::Vector(x, y, z),
    //      ^ pattern terms are comma-separated
    //   ^~^ literal text match
        // Rules are tried top-to-bottom, stopping as soon as one matches.
        (let b) => Data::Truthy(b),
        ("yes") => Data::Truthy(true),
        ("no") => Data::Truthy(false),
        ("words:", [ let words: Word<String> ],+) => Data::Words(words),
    //             ^~~~~~~~~~~~~~~~~~~~~~~~~~~^ repetition pattern
    //                                         ^ one or more matches
    //                                        ^ matches must be comma-separated
        ("lucky numbers:", [ let ns: i32 ]*: BTreeSet<_>) => Data::Lucky(ns),
    //          collect into specific type ^~~~~~~~~~~^
    //                                    ^ zero or more (you might be unlucky!)
    //                                      (no separator this time)
        // Rather than scanning a sequence of values and collecting them into
        // a `BTreeSet`, we can instead scan the `BTreeSet` *directly*. This
        // scans the syntax `BTreeSet` uses when printed using `{:?}`:
        // `{1, 5, 13, ...}`.
        ("lucky numbers:", let ns) => Data::Lucky(ns),
        (..other) => Data::Other(String::from(other))
    };
    println!("data: {:?}", data);
}
This example demonstrates using runtime scanners and the `let_scan!` convenience macro. You can run this using `cargo run --example runtime_scanners`.
//! **NOTE**: requires the `regex` feature.
#[macro_use] extern crate scan_rules;
fn main() {
    use scan_rules::scanner::{
        NonSpace, Number, Word,             // static scanners
        max_width_a, exact_width_a, re_str, // runtime scanners
    };
    // Adapted example from <http://en.cppreference.com/w/cpp/io/c/fscanf>.
    let inp = "25 54.32E-1 Thompson 56789 0123 56ß水";
    // `let_scan!` avoids the need for indentation and braces, but only supports
    // a single pattern, and panics if anything goes wrong.
    let_scan!(inp; (
        let i: i32, let x: f32, let str1 <| max_width_a::<NonSpace>(9),
    //               use runtime scanner ^~~~~~~~~~~~~~~~~~~~~~~~~~~^
    //          limit maximum width of a... ^~~~~~~~~^
    //                      ...static NonSpace scanner... ^~~~~~^
    //                                                      9 bytes ^
        let j <| exact_width_a::<i32>(2), let y: f32, let _: Number,
    //        ^~~~~~~~~~~~~~~~~~~~~~~~^ scan an i32 with exactly 2 digits
        let str2 <| re_str(r"^[0-9]{1,3}"), let warr: Word
    //           ^~~~~~~~~~~~~~~~~~~~~~~^ scan using a regular expression
    ));
    println!(
        "Converted fields:\n\
        i = {i:?}\n\
        x = {x:?}\n\
        str1 = {str1:?}\n\
        j = {j:?}\n\
        y = {y:?}\n\
        str2 = {str2:?}\n\
        warr = {warr:?}",
        i=i, j=j, x=x, y=y,
        str1=str1, str2=str2, warr=warr);
}
§Rule Syntax
Scanning rules are written as one or more arms like so:
scan! { input_expression;
    ( pattern ) => body,
    ( pattern ) => body,
    ...
    ( pattern ) => body,
}
Note that the trailing comma on the last rule is optional.
Rules are checked top-to-bottom, stopping at the first that matches.
Patterns (explained under “Pattern Syntax”) must be enclosed in parentheses. If a pattern matches the provided input, the corresponding body is evaluated.
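The "first matching rule wins" behaviour can be pictured without the macros at all. The following is a minimal sketch that uses only the standard library; `Data`, `classify`, and the `rule_*` functions are hypothetical names chosen for this illustration, and the code is not what the macros actually expand to.

```rust
// Standard-library-only sketch of "first matching rule wins".
// `Data`, `classify` and the `rule_*` helpers are hypothetical names.
#[derive(Debug, PartialEq)]
enum Data {
    Truthy(bool),
    Number(i32),
    Other(String),
}

// Each "rule" returns Some(..) if it matches and None otherwise.
fn rule_bool(s: &str) -> Option<Data> {
    s.parse::<bool>().ok().map(Data::Truthy)
}

fn rule_yes_no(s: &str) -> Option<Data> {
    match s {
        "yes" => Some(Data::Truthy(true)),
        "no" => Some(Data::Truthy(false)),
        _ => None,
    }
}

fn rule_number(s: &str) -> Option<Data> {
    s.parse::<i32>().ok().map(Data::Number)
}

// Catch-all, playing the role of a `(..other)` rule; it must come last.
fn rule_other(s: &str) -> Option<Data> {
    Some(Data::Other(s.to_string()))
}

fn classify(input: &str) -> Data {
    let input = input.trim();
    let rules: [fn(&str) -> Option<Data>; 4] =
        [rule_bool, rule_yes_no, rule_number, rule_other];
    // Try the rules top-to-bottom and stop at the first match.
    rules
        .iter()
        .find_map(|rule| rule(input))
        .expect("the catch-all rule always matches")
}

fn main() {
    for line in ["true", "no", "42", "maybe"].iter() {
        println!("{:?} -> {:?}", line, classify(line));
    }
}
```

Because the catch-all accepts any input, moving it to the front of the array would shadow every other rule. The same reasoning is why a `(..other)` rule normally goes last.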
§Pattern Syntax
A scanning pattern is made up of one or more pattern terms, separated by commas. The following terms are supported:
- strings - any expression that evaluates to a string will be used as a literal match on the input. Exactly how this match is done depends on the kind of input, but the default is a case-sensitive match of whole words and individual non-letter characters, ignoring all whitespace.
  E.g. `"Two words"`, `"..."` (counts as three “words”), `&format!("{} {}", "Two", "words")`.
- `let` name [ `:` type ] - scans a value out of the input text, and binds it to name. If type is omitted, it will be inferred.
  E.g. `let x`, `let n: i32`, `let words: Vec<_>`, `let _: &str` (scans and discards a value).
- `let` name `<|` expression - scans a value out of the input text and binds it to name, using the value of expression to perform the scan. The expression must evaluate to something that implements the `ScanStr` trait.
  E.g. `let n <| scan_a::<i32>()` (same as the `let n: i32` example above), `let three_digits <| max_width_a::<u32>(3)` (scan a `u32` of at most three digits).
- `..` name - binds the remaining, unscanned input as a string to name. This can only appear as the final term in a top-level pattern.
- `[` pattern `]` [ (nothing) | `,` | `(` separator pattern `)` ] ( `?` | `*` | `+` | `{` range `}` ) [ `:` collection type ] - scans pattern repeatedly.
  The first (mandatory) part of the term specifies the pattern that should be repeatedly scanned.
  The second (optional) part of the term controls whether repeats are separated and, if so, by what. `,` is provided as a short-cut for an obvious common case; it is equivalent to writing `(",")`. Otherwise, you may write any arbitrary pattern as the separator, including variable bindings and more repetitions.
  The third (mandatory) part of the term specifies how many times pattern should be scanned. The available options are:
  - `?` - match zero or one times.
  - `*` - match zero or more times.
  - `+` - match one or more times.
  - `{n}` - match exactly n times.
  - `{a,}` - match at least a times.
  - `{,b}` - match at most b times.
  - `{a, b}` - match at least a times, and at most b times.
  The fourth (optional) part of the term specifies what type of collection scanned values should be added to. Note that the type specified here applies to all values captured by this repetition. As such, you typically want to use a partially inferred type such as `BTreeSet<_>`. If omitted, it defaults to `Vec<_>`.
  E.g. `[ let nums: i32 ],+`, `[ "pretty" ]*, "please"`.
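The default literal matching described for string terms (whole words, individual non-letter characters, whitespace ignored) can be approximated with a small tokenizer. This is a sketch that follows only the prose description above, not the crate's actual implementation. `wordish_tokens` and `literal_matches` are hypothetical names, and treating "letter" as `char::is_alphabetic` is an assumption.

```rust
// Split text into "words": runs of letters, or single non-letter characters.
// Whitespace is skipped. This approximates the prose description only.
fn wordish_tokens(text: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut chars = text.char_indices().peekable();
    while let Some((start, ch)) = chars.next() {
        if ch.is_whitespace() {
            continue;
        }
        let mut end = start + ch.len_utf8();
        if ch.is_alphabetic() {
            // Extend the token over the rest of the word.
            while let Some(&(idx, next)) = chars.peek() {
                if !next.is_alphabetic() {
                    break;
                }
                end = idx + next.len_utf8();
                chars.next();
            }
        }
        tokens.push(&text[start..end]);
    }
    tokens
}

// A literal matches if its tokens equal the leading tokens of the input.
// The comparison is case-sensitive and ignores the amount of whitespace.
fn literal_matches(literal: &str, input: &str) -> bool {
    let lit = wordish_tokens(literal);
    let inp = wordish_tokens(input);
    inp.starts_with(&lit)
}

fn main() {
    // "..." is split into three separate one-character tokens.
    println!("{:?}", wordish_tokens("..."));
    // Extra whitespace in the input does not affect the match.
    println!("{}", literal_matches("Two words", "Two \t words and more"));
    // A partial word does not match: "Twowords" is a single token.
    println!("{}", literal_matches("Two words", "Twowords"));
}
```

Comparing token by token rather than byte by byte is what makes `"Two words"` insensitive to spacing while still refusing to match inside a longer word.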
Modules§
- `input` - This module contains items related to input handling.
- `scanner` - This module defines various scanners that can be used to extract values from input text.
Macros§
- `let_readln` - Reads a line of text from standard input, then scans it using the specified pattern. All values are bound directly to local variables.
- `let_scan` - Scans the provided input, using the specified pattern. All values are bound directly to local variables.
- `readln` - Reads a line of text from standard input, then scans it using the provided rules. The result of the `readln!` invocation is the type of the rule bodies; just as with `match`, all bodies must agree on their result type.
- `scan` - Scans the provided input, using the specified rules. The result is a `Result<T, ScanError>` where `T` is the type of the rule bodies; just as with `match`, all bodies must agree on their result type.
- `try_readln` - Reads a line of text from standard input, then scans it using the provided rules. The result of the `try_readln!` invocation is a `Result<T, ScanError>`, where `T` is the type of the rule bodies; just as with `match`, all bodies must agree on their result type.
Structs§
- `ScanError` - Represents an error that occurred during scanning.
- `ScanErrorAt` - Represents the position at which an error occurred.
Enums§
- `ScanErrorKind` - Indicates the kind of error that occurred during scanning.