A tiny library to efficiently search strings for sets of ASCII characters or byte slices for sets of bytes.
§Examples
§Searching for a set of ASCII characters
#[macro_use]
extern crate jetscii;
fn main() {
    let part_number = "86-J52:rev1";
    let first = ascii_chars!('-', ':').find(part_number);
    assert_eq!(first, Some(2));
}
§Searching for a set of bytes
#[macro_use]
extern crate jetscii;
fn main() {
    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    let first = bytes!(0x01, 0x10).find(&raw_data);
    assert_eq!(first, Some(1));
}
§Searching for a substring
use jetscii::Substring;
let colors = "red, blue, green";
let first = Substring::new(", ").find(colors);
assert_eq!(first, Some(3));
§Searching for a subslice
use jetscii::ByteSubstring;
let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
let first = ByteSubstring::new(&[0x10, 0xFF]).find(&raw_data);
assert_eq!(first, Some(2));
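For comparison, the same first-match index can be computed with the standard library alone. The sketch below is not how jetscii searches: find_subslice is a hypothetical helper name, and the naive window scan is only meant to pin down what the result in the example above means.

```rust
/// Hypothetical helper (not part of jetscii): returns the index of the first
/// occurrence of `needle` in `haystack`, using a naive window scan.
fn find_subslice(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // `windows` panics on a size of 0, so handle the empty needle separately.
    if needle.is_empty() {
        return Some(0);
    }
    haystack.windows(needle.len()).position(|window| window == needle)
}

fn main() {
    let raw_data: [u8; 5] = [0x00, 0x01, 0x10, 0xFF, 0x42];
    // Same inputs as the ByteSubstring example: the subslice starts at index 2.
    assert_eq!(find_subslice(&raw_data, &[0x10, 0xFF]), Some(2));
    // The bytes must appear in order and adjacent to count as a match.
    assert_eq!(find_subslice(&raw_data, &[0xFF, 0x10]), None);
}
```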
§Using the pattern API
If this crate is compiled with the unstable pattern feature flag, AsciiChars will implement the Pattern trait, allowing it to be used with many traditional methods.
#[macro_use]
extern crate jetscii;
fn main() {
    let part_number = "86-J52:rev1";
    let parts: Vec<_> = part_number.split(ascii_chars!('-', ':')).collect();
    assert_eq!(&parts, &["86", "J52", "rev1"]);
}
use jetscii::Substring;
let colors = "red, blue, green";
let colors: Vec<_> = colors.split(Substring::new(", ")).collect();
assert_eq!(&colors, &["red", "blue", "green"]);
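Without that feature flag, the standard library's built-in patterns produce the same splits, though without jetscii's speedups. A minimal sketch using only std; the helper names split_part_number and split_colors are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: splits on any of '-' or ':' using a std char-slice pattern.
fn split_part_number(s: &str) -> Vec<&str> {
    // A slice of chars is a pattern that matches any one of its members.
    s.split(&['-', ':'][..]).collect()
}

/// Hypothetical helper: splits on the two-character separator ", ".
fn split_colors(s: &str) -> Vec<&str> {
    // A &str pattern matches the whole substring.
    s.split(", ").collect()
}

fn main() {
    assert_eq!(split_part_number("86-J52:rev1"), ["86", "J52", "rev1"]);
    assert_eq!(split_colors("red, blue, green"), ["red", "blue", "green"]);
}
```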
§What’s so special about this library?
We use a particular set of SSE 4.2 instructions (PCMPESTRI and PCMPESTRM) to gain great speedups. This method stays fast even when searching for a byte in a set of up to 16 choices. When the PCMPxSTRx instructions are not available, we fall back to reasonably fast but universally-supported methods.
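The dispatch described above can be sketched with std::arch. This is an illustration under stated assumptions, not jetscii's actual implementation: the function names find_any, find_any_sse42, and find_any_scalar are hypothetical, only PCMPESTRI is used (through the _mm_cmpestri intrinsic), and each 16-byte chunk is copied into a buffer for simplicity, which a tuned implementation would avoid.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{
    __m128i, _mm_cmpestri, _mm_loadu_si128, _SIDD_CMP_EQUAL_ANY, _SIDD_UBYTE_OPS,
};

/// Portable fallback (hypothetical): test every haystack byte against the set.
fn find_any_scalar(haystack: &[u8], set: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| set.contains(b))
}

/// PCMPESTRI path (hypothetical): one instruction compares up to 16 haystack
/// bytes against a set of up to 16 bytes. Caller must ensure `set.len() <= 16`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn find_any_sse42(haystack: &[u8], set: &[u8]) -> Option<usize> {
    // Unsigned bytes, "equal any" aggregation, least-significant match index.
    const MODE: i32 = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY;

    // Copy the set into a 16-byte buffer so it can be loaded as one vector.
    let mut set_buf = [0u8; 16];
    set_buf[..set.len()].copy_from_slice(set);
    let set_vec = _mm_loadu_si128(set_buf.as_ptr() as *const __m128i);

    for (i, chunk) in haystack.chunks(16).enumerate() {
        // Pad the (possibly short) chunk; the explicit lengths passed below
        // keep the padding bytes from ever being treated as data.
        let mut buf = [0u8; 16];
        buf[..chunk.len()].copy_from_slice(chunk);
        let hay_vec = _mm_loadu_si128(buf.as_ptr() as *const __m128i);

        // Returns the index of the first matching byte, or 16 if none matched.
        let idx = _mm_cmpestri::<MODE>(set_vec, set.len() as i32, hay_vec, chunk.len() as i32);
        if idx < 16 {
            return Some(i * 16 + idx as usize);
        }
    }
    None
}

/// Picks the SSE 4.2 path when the CPU supports it, otherwise the fallback.
fn find_any(haystack: &[u8], set: &[u8]) -> Option<usize> {
    #[cfg(target_arch = "x86_64")]
    {
        if set.len() <= 16 && is_x86_feature_detected!("sse4.2") {
            // SAFETY: SSE 4.2 was detected at runtime and the set fits in 16 bytes.
            return unsafe { find_any_sse42(haystack, set) };
        }
    }
    find_any_scalar(haystack, set)
}

fn main() {
    assert_eq!(find_any(b"86-J52:rev1", b"-:"), Some(2));
    assert_eq!(find_any(b"no match here", b"<>&"), None);
}
```

The limit of 16 in the sketch follows from the 128-bit register width: one vector holds exactly 16 bytes.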
§Benchmarks
These numbers come from running on my personal laptop; always benchmark with data and machines similar to your own.
§Single character
Searching a 5 MiB string of “a”s with a single space at the end for a space:
Method | Speed |
---|---|
ascii_chars!(' ').find(s) | 11504 MB/s |
s.as_bytes().iter().position(|&c| c == b' ') | 2377 MB/s |
s.find(" ") | 2149 MB/s |
s.find(&[' '][..]) | 1151 MB/s |
s.find(' ') | 14600 MB/s |
s.find(|c| c == ' ') | 1080 MB/s |
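To get comparable numbers on your own machine, a rough timing loop using only std might look like the sketch below. It is not the harness that produced the table: make_haystack and throughput_mb_s are hypothetical helpers, 1 MB is assumed to mean 10^6 bytes, and a simple loop like this is noisier than a dedicated benchmarking tool.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical helper: `len - 1` copies of 'a' followed by a single space.
fn make_haystack(len: usize) -> String {
    let mut s = "a".repeat(len - 1);
    s.push(' ');
    s
}

/// Hypothetical helper: runs `search` over `s` `iters` times and returns the
/// last result together with the measured throughput in MB/s (10^6 bytes/s).
fn throughput_mb_s<F>(s: &str, iters: u32, search: F) -> (Option<usize>, f64)
where
    F: Fn(&str) -> Option<usize>,
{
    let start = Instant::now();
    let mut last = None;
    for _ in 0..iters {
        // black_box discourages the optimizer from hoisting or removing the search.
        last = black_box(search(black_box(s)));
    }
    let secs = start.elapsed().as_secs_f64();
    let bytes = s.len() as f64 * f64::from(iters);
    (last, bytes / secs / 1_000_000.0)
}

fn main() {
    let s = make_haystack(5 * 1024 * 1024);
    let (pos, speed) = throughput_mb_s(&s, 20, |h| h.find(' '));
    println!("s.find(' ') -> {:?} at {:.0} MB/s", pos, speed);
    let (pos, speed) = throughput_mb_s(&s, 20, |h| {
        h.as_bytes().iter().position(|&c| c == b' ')
    });
    println!("position(..) -> {:?} at {:.0} MB/s", pos, speed);
}
```

Build with optimizations (for example in release mode) before comparing any figures; debug builds will distort the ranking.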
§Set of 3 characters
Searching a 5 MiB string of “a”s with a single ampersand at the end for <, >, and &:
Method | Speed |
---|---|
ascii_chars!(/* … */).find(s) | 11513 MB/s |
s.as_bytes().iter().position(|&c| /* … */) | 1644 MB/s |
s.find(&[/* … */][..]) | 1079 MB/s |
s.find(|c| /* … */) | 1084 MB/s |
§Set of 5 characters
Searching a 5 MiB string of “a”s with a single ampersand at the end for <, >, &, ', and ":
Method | Speed |
---|---|
ascii_chars!(/* … */).find(s) | 11504 MB/s |
s.as_bytes().iter().position(|&c| /* … */) | 812 MB/s |
s.find(&[/* … */][..]) | 538 MB/s |
s.find(|c| /* … */) | 1082 MB/s |
§Substring
Searching a 5 MiB string of “a”s with the string “xyzzy” at the end for “xyzzy”:
Method | Speed |
---|---|
Substring::new("xyzzy").find(s) | 11475 MB/s |
s.find("xyzzy") | 5391 MB/s |
Macros§
- ascii_chars - A convenience constructor for an AsciiChars that automatically implements a fallback. Provide 1 to 16 characters.
- bytes - A convenience constructor for a Bytes that automatically implements a fallback. Provide 1 to 16 bytes.
Structs§
- AsciiChars - Searches a string for a set of ASCII characters. Up to 16 characters may be used.
- ByteSubstring - Searches a slice for the first occurrence of the subslice.
- Bytes - Searches a slice for a set of bytes. Up to 16 bytes may be used.
- Substring - Searches a string for the first occurrence of the substring.
Type Aliases§
- AsciiCharsConst - A convenience type that can be used in a constant or static.
- ByteSubstringConst - A convenience type that can be used in a constant or static.
- BytesConst - A convenience type that can be used in a constant or static.
- SubstringConst - A convenience type that can be used in a constant or static.