# A Rust crate providing an `sscanf`-like macro (inverse of `format!()`), with near unlimited parsing capabilities
`sscanf` is originally a C function that takes a string, a format string with placeholders, and
several variables. It parses the input and writes matched values into those variables. In Rust,
this crate returns a tuple instead. You can think of it as reversing a call to `format!()`:
```rust
// format: takes format string and values, returns String
let msg = format!("Hello {}{}!", "World", 5);
assert_eq!(msg, "Hello World5!");
// sscanf: takes string, format string and types, returns tuple
let parsed = sscanf::sscanf!(msg, "Hello {}{}!", &str, usize);
// parsed is Option<(&str, usize)>
assert_eq!(parsed.unwrap(), ("World", 5));
// alternative syntax:
let parsed2 = sscanf::sscanf!(msg, "Hello {&str}{usize}!");
assert_eq!(parsed2.unwrap(), ("World", 5));
```
`sscanf!()` takes a format string like `format!()`, but instead of writing values into `{}` placeholders,
it extracts the values at those positions into the returned tuple.
If matching the format string fails, `None` is returned:
```rust
let msg = "Text that doesn't match the format string";
let parsed = sscanf::sscanf!(msg, "Hello {&str}{usize}!");
assert!(parsed.is_none());
```
**Types in Placeholders:**
The types can either be given as a separate parameter after the format string, or directly
inside the `{}` placeholder.
The first allows for autocomplete while typing, syntax highlighting, and better compiler errors
from sscanf if the wrong types are given.
The second mirrors the [captured identifiers in format strings](https://blog.rust-lang.org/2022/01/13/Rust-1.58.0.html#captured-identifiers-in-format-strings).
This option has less helpful compiler errors on stable Rust, but is otherwise identical to the first.
More examples of the capabilities of `sscanf`:
```rust
use sscanf::sscanf;
use std::num::NonZeroUsize;
let input = "<x=3, y=-6, z=6>";
let parsed = sscanf!(input, "<x={i32}, y={i32}, z={i32}>");
assert_eq!(parsed.unwrap(), (3, -6, 6));
let input = "Move to N36E21";
let parsed = sscanf!(input, "Move to {char}{usize}{char}{usize}");
assert_eq!(parsed.unwrap(), ('N', 36, 'E', 21));
let input = "Escape literal { } as {{ and }}";
let parsed = sscanf!(input, "Escape literal {{ }} as {{{{ and }}}}");
assert_eq!(parsed.unwrap(), ());
let input = "Indexing types: N36E21";
let parsed = sscanf!(input, "Indexing types: {1}{0}{1}{0}", NonZeroUsize, char);
// output is in the order of the placeholders
assert_eq!(parsed.unwrap(), ('N', NonZeroUsize::new(36).unwrap(),
                             'E', NonZeroUsize::new(21).unwrap()));
let input = "A Sentence with Spaces. Another Sentence.";
// &str and String do the same, but String clones from the input string
// to take ownership instead of borrowing.
let (a, b) = sscanf!(input, "{String}. {&str}.").unwrap();
assert_eq!(a, "A Sentence with Spaces");
assert_eq!(b, "Another Sentence");
// Number format options
let input = "ab01 127 101010 1Z";
let parsed = sscanf!(input, "{usize:x} {i32:o} {u8:b} {u32:r36}");
let (a, b, c, d) = parsed.unwrap();
assert_eq!(a, 0xab01); // Hexadecimal
assert_eq!(b, 0o127); // Octal
assert_eq!(c, 0b101010); // Binary
assert_eq!(d, 71); // any radix (r36 = Radix 36)
assert_eq!(d, u32::from_str_radix("1Z", 36).unwrap());
let input = "color: #D4AF37";
// Number types take their size into account, and hexadecimal u8 can
// have at most 2 digits => only possible match is 2 digits each.
let (r, g, b) = sscanf!(input, "color: #{u8:x}{u8:x}{u8:x}").unwrap();
assert_eq!((r, g, b), (0xD4, 0xAF, 0x37));
```
The input in these examples is a `&'static str`, but it can also be a `String`, `&str`, `&String`,
or anything else that auto-derefs to `str` without taking ownership. See [input examples]
for possible inputs.
The parsing part of this macro has very few limitations, since it replaces the `{}` with a
[Regular Expression][regex] that corresponds to that type.
For example:
- `char` is just one character (regex `"."`)
- `str` is any sequence of characters (regex `".+?"`)
- Numbers are any sequence of digits (regex `"[-+]?\d+"`)
And so on. The actual implementation for numbers tries to take the size of the type into
account and some other details, but that is the gist of the parsing.
This means that any sequence of replacements is possible as long as the regex finds a
combination that works. In the `char, usize, char, usize` example above it manages to assign
the `N` and `E` to the `char`s because they cannot be matched by the `usize`s.
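To make the idea concrete, here is a hand-written, `std`-only sketch of what the format string `"Move to {char}{usize}{char}{usize}"` asks for. It is not how `sscanf` is implemented: `split_number` and `parse_move` are hypothetical helpers for illustration, and unlike a regex they match greedily without backtracking, so inputs that need a different split between placeholders would fail here.

```rust
/// Splits a leading run of ASCII digits off `s` (roughly the regex `\d+`).
/// Hypothetical helper, not part of sscanf.
fn split_number(s: &str) -> Option<(usize, &str)> {
    // Byte index of the first non-digit, or the whole string if all digits.
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    if end == 0 {
        return None; // a number needs at least one digit
    }
    let value = s[..end].parse().ok()?;
    Some((value, &s[end..]))
}

/// Hand-written counterpart of "Move to {char}{usize}{char}{usize}".
/// Hypothetical helper, not part of sscanf.
fn parse_move(input: &str) -> Option<(char, usize, char, usize)> {
    let rest = input.strip_prefix("Move to ")?;
    let mut chars = rest.chars();
    let c1 = chars.next()?; // `{char}`: exactly one character
    let (n1, rest) = split_number(chars.as_str())?; // `{usize}`: digits
    let mut chars = rest.chars();
    let c2 = chars.next()?;
    let (n2, rest) = split_number(chars.as_str())?;
    // The whole input has to be consumed for the match to count.
    rest.is_empty().then_some((c1, n1, c2, n2))
}
```

Calling `parse_move("Move to N36E21")` yields `Some(('N', 36, 'E', 21))`, the same tuple as the `sscanf!` example above, while a non-matching input yields `None`.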
[input examples]: https://docs.rs/sscanf/latest/sscanf/macro.sscanf.html#examples
## Format Options
All options are inside `'{'` `'}'` and after a `:`, so either as `{<type>:<option>}` or
as `{:<option>}`. Note: The type might still have a path that contains `::`. Any double
colons are ignored and only single colons are used to separate the options.
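The colon rule can be illustrated with a small standalone sketch. `split_placeholder` is a hypothetical helper written for this explanation, not the crate's actual parser; it takes the text between `{` and `}` and splits it at the first single colon, skipping any `::` that belongs to a type path.

```rust
/// Splits the inside of a placeholder into (type, option).
/// Hypothetical helper illustrating the rule above, not sscanf's parser.
fn split_placeholder(inner: &str) -> (&str, Option<&str>) {
    let bytes = inner.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b':' {
            if bytes.get(i + 1) == Some(&b':') {
                i += 2; // `::` is part of a type path: skip both colons
                continue;
            }
            // First single colon: everything after it is the option.
            return (&inner[..i], Some(&inner[i + 1..]));
        }
        i += 1;
    }
    (inner, None) // no single colon, so no option was given
}
```

Under this sketch, `std::num::NonZeroUsize:r36` splits into the type `std::num::NonZeroUsize` and the option `r36`, and `:x` splits into an empty type and the option `x`.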
**Custom Regex:**
- `{:/.../}`: Match according to the [regex] between the `/` `/`
For example:
```rust
let input = "random Text";
let parsed = sscanf::sscanf!(input, "{&str:/[^m]+/}{&str}");
// regex [^m]+ matches anything that isn't an 'm'
// => stops at the 'm' in 'random'
assert_eq!(parsed.unwrap(), ("rando", "m Text"));
```
The regex uses the [same escaping logic as JavaScript's `/.../` syntax](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#escaping),
meaning that the normal regex escaping with `\d` for digits etc. is in effect, with the addition
that any `/` needs to be escaped as `\/` since it is used to end the regex.
**NOTE:** You should use raw strings for a format string containing a regex, since otherwise you
need to escape any `\` as `\\`:
```rust
use sscanf::sscanf;
let input = "1234";
let parsed = sscanf!(input, r"{u8:/\d{2}/}{u8}"); // regex \d{2} matches 2 digits
let _ = sscanf!(input, "{u8:/\\d{2}/}{u8}"); // the same with a non-raw string
assert_eq!(parsed.unwrap(), (12, 34));
```
See [`trait AcceptsRegexOverride`][regex trait] for types supporting this option and instructions for adding support to
custom types.
**Radix Options:**
These options generally only work on primitive integer types (`u8`, ..., `u128`, `i8`, ..., `i128`, `usize`, `isize`).
- `x`: hexadecimal Number (Digits 0-9 and a-f or A-F), optional prefix `0x` or `0X`
- `o`: octal Number (Digits 0-7), optional prefix `0o` or `0O`
- `b`: binary Number (Digits 0-1), optional prefix `0b` or `0B`
- `r2` - `r36`: any radix Number (Digits 0-9 and a-z or A-Z for higher radices)
Adding a `#` in front of `x`, `o` or `b` makes the prefix (0x, 0o, 0b) mandatory.
A note on prefixes: `r2`, `r8` and `r16` match the same numbers as `b`, `o` and `x` respectively,
but without a prefix. Thus:
- `{:x}` _may_ have a prefix, matching numbers like `0xab` or `ab`
- `{:r16}` has no prefix and would only match `ab`
- `{:#x}` _must_ have a prefix, matching only `0xab`
- `{:#r16}` gives a compile error
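The three prefix behaviours can be mimicked with plain `std` code. The sketch below is only an illustration of the rules listed above for hexadecimal numbers: the `Prefix` enum and `parse_hex` are hypothetical names, signs are not handled, and the crate itself does this through generated regexes rather than code like this.

```rust
/// How a `0x`/`0X` prefix is treated. Hypothetical type for this sketch.
#[derive(Clone, Copy)]
enum Prefix {
    Optional,  // like `{:x}`
    Forbidden, // like `{:r16}`
    Required,  // like `{:#x}`
}

/// Parses an unsigned hexadecimal number under the given prefix rule.
/// Hypothetical helper, not part of sscanf.
fn parse_hex(input: &str, prefix: Prefix) -> Option<u32> {
    let stripped = input
        .strip_prefix("0x")
        .or_else(|| input.strip_prefix("0X"));
    let digits = match (prefix, stripped) {
        // Prefix found and allowed: parse what follows it.
        (Prefix::Optional | Prefix::Required, Some(rest)) => rest,
        // No prefix found and none required: parse the input as-is.
        (Prefix::Optional | Prefix::Forbidden, None) => input,
        // Prefix present but forbidden, or missing but required.
        (Prefix::Forbidden, Some(_)) | (Prefix::Required, None) => return None,
    };
    // Reject signs and other non-digit characters before converting.
    if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    u32::from_str_radix(digits, 16).ok()
}
```

With this helper, `parse_hex("0xab", Prefix::Optional)` and `parse_hex("ab", Prefix::Optional)` both succeed, `parse_hex("0xab", Prefix::Forbidden)` fails, and `parse_hex("ab", Prefix::Required)` fails.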
## Custom Types
`sscanf` works with most primitive types from `std` as well as `String` by default. The
full list can be seen here: [Implementations of `FromScanf`](https://docs.rs/sscanf/latest/sscanf/trait.FromScanf.html#foreign-impls).
To add more types there are three options:
- Derive [`FromScanf`][derive] for your type (simple, readable, foolproof (mostly))
- Manually implement [`FromScanfSimple`][simple] (more flexible, more code)
- Manually implement [`FromScanf`][trait] for your type (flexible, but requires more code)
The simplest option is to use `derive`:
```rust
#[derive(sscanf::FromScanf)] // The derive macro
#[derive(Debug, PartialEq)] // additional traits for assert_eq below. Not required for sscanf
#[sscanf(format = "{numerator}/{denominator}")] // Format string for the type, using the field names.
struct Fraction {
    numerator: isize,
    denominator: usize,
}
let parsed = sscanf::sscanf!("-10/3", "{Fraction}").unwrap();
assert_eq!(parsed, Fraction { numerator: -10, denominator: 3 });
```
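For comparison, the following `std`-only sketch shows roughly the amount of code the derive saves for this type. It is a hand-written `FromStr` implementation for a stand-alone copy of `Fraction`, with `()` chosen as a placeholder error type; it is an assumption-laden illustration and does not reflect the code that the derive macro actually generates.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Fraction {
    numerator: isize,
    denominator: usize,
}

impl FromStr for Fraction {
    type Err = (); // placeholder error type for this sketch

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Counterpart of the format "{numerator}/{denominator}":
        // split at the first '/' and parse both halves.
        let (num, den) = s.split_once('/').ok_or(())?;
        Ok(Fraction {
            numerator: num.parse().map_err(|_| ())?,
            denominator: den.parse().map_err(|_| ())?,
        })
    }
}
```

Here `"-10/3".parse::<Fraction>()` gives the same `Fraction { numerator: -10, denominator: 3 }` as the derived version, but the type cannot be used inside a larger format string the way `{Fraction}` can.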
The derive also works for enums:
```rust
#[derive(sscanf::FromScanf)]
enum HasChanged {
    #[sscanf(format = "received {added} additions and {deleted} deletions")]
    Yes {
        added: usize,
        deleted: usize,
    },
    #[sscanf("has not changed")] // the `format =` part can be omitted
    No
}
let input = "Your file has not changed since your last visit!";
let parsed = sscanf::sscanf!(input, "Your file {HasChanged} since your last visit!").unwrap();
assert!(matches!(parsed, HasChanged::No));
let input = "Your file received 325 additions and 15 deletions since your last visit!";
let parsed = sscanf::sscanf!(input, "Your file {HasChanged} since your last visit!").unwrap();
assert!(matches!(parsed, HasChanged::Yes { added: 325, deleted: 15 }));
```
More details can be found in the documentation of the [trait `FromScanf`][trait] and the [derive `FromScanf`][derive].
## Changelog
See [Changelog.md](https://github.com/mich101mich/sscanf/blob/master/Changelog.md)
## License
Licensed under either of [Apache License, Version 2.0](LICENSE-APACHE) or
[MIT license](LICENSE-MIT) at your option.
[trait]: https://docs.rs/sscanf/latest/sscanf/trait.FromScanf.html
[derive]: https://docs.rs/sscanf/latest/sscanf/derive.FromScanf.html
[simple]: https://docs.rs/sscanf/latest/sscanf/trait.FromScanfSimple.html
[regex trait]: https://docs.rs/sscanf/latest/sscanf/advanced/trait.AcceptsRegexOverride.html
[regex]: https://docs.rs/regex