
A tiny library to efficiently search strings for sets of ASCII characters, or byte slices for sets of bytes.

Examples

Searching for a set of ASCII characters

#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let first = ascii_chars!('-', ':').find(part_number);
    assert_eq!(first, Some(2));
}

Searching for a set of bytes

#[macro_use]
extern crate jetscii;

fn main() {
    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    let first = bytes!(0x01, 0x10).find(&raw_data);
    assert_eq!(first, Some(1));
}

Searching for a substring

use jetscii::Substring;

let colors = "red, blue, green";
let first = Substring::new(", ").find(colors);
assert_eq!(first, Some(3));

Searching for a subslice

use jetscii::ByteSubstring;

let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
let first = ByteSubstring::new(&[0x10, 0xFF]).find(&raw_data);
assert_eq!(first, Some(2));

Using the pattern API

If this crate is compiled with the unstable pattern feature flag, AsciiChars will implement the Pattern trait, allowing it to be passed to standard string methods that accept a pattern, such as split.

#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let parts: Vec<_> = part_number.split(ascii_chars!('-', ':')).collect();
    assert_eq!(&parts, &["86", "J52", "rev1"]);
}

use jetscii::Substring;

let colors = "red, blue, green";
let colors: Vec<_> = colors.split(Substring::new(", ")).collect();
assert_eq!(&colors, &["red", "blue", "green"]);
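For comparison, when the unstable pattern feature is not enabled, the same splits can be expressed with the standard library's own patterns. The following is a minimal sketch using only std; the function names split_on_any and split_on_str are hypothetical helpers for illustration and are not part of jetscii.

```rust
/// Splits `s` on any of the given delimiter characters.
/// A `&[char]` slice is a pattern that std accepts on stable Rust.
fn split_on_any<'a>(s: &'a str, delimiters: &[char]) -> Vec<&'a str> {
    s.split(delimiters).collect()
}

/// Splits `s` on every occurrence of the substring `separator`.
fn split_on_str<'a>(s: &'a str, separator: &str) -> Vec<&'a str> {
    s.split(separator).collect()
}
```

These produce the same pieces as the two jetscii examples above, but without the SSE 4.2 acceleration.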

What’s so special about this library?

We use a particular set of SSE 4.2 instructions (PCMPESTRI and PCMPESTRM) to gain great speedups. This method stays fast even when searching for a byte in a set of up to 16 choices.

When the PCMPxSTRx instructions are not available, we fall back to reasonably fast but universally-supported methods.
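To make the idea of a fallback concrete, here is a rough sketch of the two ingredients involved: checking at runtime whether SSE 4.2 is present, and a portable scalar search over a byte set. This is an illustration under assumptions, not jetscii's actual implementation; the names find_any_scalar, sse42_available, and chosen_strategy are hypothetical.

```rust
/// Portable scalar search: index of the first byte of `haystack`
/// that appears anywhere in `needles`.
fn find_any_scalar(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// Runtime check for SSE 4.2 on x86 and x86_64 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn sse42_available() -> bool {
    is_x86_feature_detected!("sse4.2")
}

/// On every other architecture the PCMPxSTRx instructions do not exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn sse42_available() -> bool {
    false
}

/// Reports which path a dispatcher could take on this machine.
/// (Assumption: a real dispatcher would call a SIMD routine in the first case.)
fn chosen_strategy() -> &'static str {
    if sse42_available() {
        "sse4.2"
    } else {
        "fallback"
    }
}
```

The scalar path returns the same byte offsets as the accelerated one would, so callers need not care which was chosen.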

Benchmarks

These numbers come from running on my personal laptop; always benchmark with data and machines similar to your own.
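A rough way to take a comparable measurement on your own machine, using only the standard library, is sketched here. The helpers make_haystack and measure_mb_per_s are hypothetical names for this illustration; the haystack shape (a long run of one filler character with the needle at the very end) mirrors the benchmarks in this section. A single timed run is noisy, so treat the result as an estimate and build in release mode.

```rust
use std::time::Instant;

/// Builds a string of `total_len` bytes: 'a' filler followed by `tail`.
/// If `tail` is longer than `total_len`, the result is just `tail`.
fn make_haystack(total_len: usize, tail: &str) -> String {
    let filler_len = total_len.saturating_sub(tail.len());
    let mut s = "a".repeat(filler_len);
    s.push_str(tail);
    s
}

/// Runs `search` once over `haystack` and returns its result together
/// with the observed throughput in MB/s (1 MB = 1,000,000 bytes).
fn measure_mb_per_s<F>(haystack: &str, search: F) -> (Option<usize>, f64)
where
    F: Fn(&str) -> Option<usize>,
{
    let start = Instant::now();
    let found = search(haystack);
    let seconds = start.elapsed().as_secs_f64();
    let megabytes = haystack.len() as f64 / 1_000_000.0;
    (found, megabytes / seconds)
}
```

For example, measure_mb_per_s(&make_haystack(5 * 1024 * 1024, " "), |h| h.find(' ')) times the standard library's single-character search; any of the searches shown on this page can be passed as the closure instead.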

Single character

Searching a 5 MiB string of 'a' characters, ending in a single space, for a space:

Method                                        Speed
ascii_chars!(' ').find(s)                     11504 MB/s
s.as_bytes().iter().position(|&c| c == b' ')  2377 MB/s
s.find(" ")                                   2149 MB/s
s.find(&[' '][..])                            1151 MB/s
s.find(' ')                                   14600 MB/s
s.find(|c| c == ' ')                          1080 MB/s

Set of 3 characters

Searching a 5 MiB string of 'a' characters, ending in a single ampersand, for <, >, and &:

Method                                      Speed
ascii_chars!(/* … */).find(s)               11513 MB/s
s.as_bytes().iter().position(|&c| /* … */)  1644 MB/s
s.find(&[/* … */][..])                      1079 MB/s
s.find(|c| /* … */)                         1084 MB/s

Set of 5 characters

Searching a 5 MiB string of 'a' characters, ending in a single ampersand, for <, >, &, ', and ":

Method                                      Speed
ascii_chars!(/* … */).find(s)               11504 MB/s
s.as_bytes().iter().position(|&c| /* … */)  812 MB/s
s.find(&[/* … */][..])                      538 MB/s
s.find(|c| /* … */)                         1082 MB/s

Substring

Searching a 5 MiB string of 'a' characters, ending in the string "xyzzy", for "xyzzy":

Method                           Speed
Substring::new("xyzzy").find(s)  11475 MB/s
s.find("xyzzy")                  5391 MB/s

Macros

ascii_chars!: A convenience constructor for an AsciiChars that automatically implements a fallback. Provide 1 to 16 characters.

bytes!: A convenience constructor for a Bytes that automatically implements a fallback. Provide 1 to 16 bytes.

Structs

AsciiChars: Searches a string for a set of ASCII characters. Up to 16 characters may be used.

ByteSubstring: Searches a slice for the first occurrence of the subslice.

Bytes: Searches a slice for a set of bytes. Up to 16 bytes may be used.

Substring: Searches a string for the first occurrence of the substring.

Type Definitions

AsciiCharsConst: A convenience type that can be used in a constant or static.

ByteSubstringConst: A convenience type that can be used in a constant or static.

BytesConst: A convenience type that can be used in a constant or static.

SubstringConst: A convenience type that can be used in a constant or static.