Crate jetscii

A tiny library to efficiently search strings for sets of ASCII characters or byte slices for sets of bytes.

Examples

Searching for a set of ASCII characters

#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let first = ascii_chars!('-', ':').find(part_number);
    assert_eq!(first, Some(2));
}

Searching for a set of bytes

#[macro_use]
extern crate jetscii;

fn main() {
    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    let first = bytes!(0x01, 0x10).find(&raw_data);
    assert_eq!(first, Some(1));
}

Searching for a substring

use jetscii::Substring;

let colors = "red, blue, green";
let first = Substring::new(", ").find(colors);
assert_eq!(first, Some(3));

Searching for a subslice

use jetscii::ByteSubstring;

let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
let first = ByteSubstring::new(&[0x10, 0xFF]).find(&raw_data);
assert_eq!(first, Some(2));

Using the pattern API

If this crate is compiled with the unstable pattern feature flag, AsciiChars and Substring implement the Pattern trait, so they can be passed directly to standard string methods such as split.

#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let parts: Vec<_> = part_number.split(ascii_chars!('-', ':')).collect();
    assert_eq!(&parts, &["86", "J52", "rev1"]);
}

use jetscii::Substring;

let colors = "red, blue, green";
let colors: Vec<_> = colors.split(Substring::new(", ")).collect();
assert_eq!(&colors, &["red", "blue", "green"]);
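For comparison, the first split above can also be written on stable Rust without the pattern feature, because the standard library accepts a slice of chars as a pattern. The following is a minimal sketch using only std; the function name split_part_number is hypothetical and not part of jetscii.

```rust
/// Splits a part number on '-' and ':' using std's built-in
/// char-slice pattern instead of a jetscii searcher.
/// (Hypothetical helper for illustration; not part of jetscii.)
fn split_part_number(part_number: &str) -> Vec<&str> {
    part_number.split(&['-', ':'][..]).collect()
}
```

Calling split_part_number("86-J52:rev1") yields the same three parts as the jetscii version. The benchmarks below suggest why one might still prefer jetscii for large inputs.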

What's so special about this library?

We use a particular set of x86-64 SSE 4.2 instructions (PCMPESTRI and PCMPESTRM) to gain great speedups. This method stays fast even when searching for a byte in a set of up to 16 choices.

When the PCMPxSTRx instructions are not available, we fall back to reasonably fast but universally-supported methods.
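The general shape of "use PCMPESTRI when available, otherwise fall back" can be sketched with the standard library alone. This is an illustration of the technique, not jetscii's actual implementation: the function names find_any, find_any_sse42, and find_any_fallback are hypothetical, and the sketch copies every 16-byte chunk into a temporary buffer for simplicity, which a tuned implementation would avoid.

```rust
/// Portable fallback: index of the first haystack byte that is in `needles`.
fn find_any_fallback(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// SSE 4.2 path: one PCMPESTRI instruction checks up to 16 haystack bytes
/// against up to 16 needle bytes.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn find_any_sse42(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    use std::arch::x86_64::{__m128i, _mm_cmpestri, _mm_loadu_si128, _SIDD_CMP_EQUAL_ANY};

    unsafe {
        // Pad the needle set to a full 16-byte register. The explicit length
        // passed to PCMPESTRI marks how many of those bytes are meaningful.
        let mut set_buf = [0u8; 16];
        set_buf[..needles.len()].copy_from_slice(needles);
        let set = _mm_loadu_si128(set_buf.as_ptr() as *const __m128i);

        for (i, chunk) in haystack.chunks(16).enumerate() {
            // Copy the chunk so a short final chunk can still be loaded safely.
            let mut buf = [0u8; 16];
            buf[..chunk.len()].copy_from_slice(chunk);
            let block = _mm_loadu_si128(buf.as_ptr() as *const __m128i);

            // "Equal any" mode: returns the lowest index in `block` whose byte
            // equals any valid byte of `set`, or 16 when there is no match.
            let idx = _mm_cmpestri::<{ _SIDD_CMP_EQUAL_ANY }>(
                set,
                needles.len() as i32,
                block,
                chunk.len() as i32,
            ) as usize;
            if idx < chunk.len() {
                return Some(i * 16 + idx);
            }
        }
        None
    }
}

/// Picks the SSE 4.2 path at runtime when the CPU supports it.
pub fn find_any(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    assert!(needles.len() <= 16, "at most 16 needle bytes are supported");
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            // Safe to call: the required CPU feature was just detected.
            return unsafe { find_any_sse42(haystack, needles) };
        }
    }
    find_any_fallback(haystack, needles)
}
```

Both paths return the same answer for the same input; only the speed differs. The 16-byte needle limit in the sketch comes from the width of an SSE register, which matches the limit of 16 choices mentioned above.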

Benchmarks

These numbers come from running on my personal laptop; always benchmark with data and machines similar to your own.

Single character

Searching for a space in a 5 MiB string of 'a' characters with a single space at the end:

Method                                          Speed
ascii_chars!(' ').find(s)                       11504 MB/s
s.as_bytes().iter().position(|&c| c == b' ')    2377 MB/s
s.find(" ")                                     2149 MB/s
s.find(&[' '][..])                              1151 MB/s
s.find(' ')                                     14600 MB/s
s.find(|c| c == ' ')                            1080 MB/s
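To check numbers like these on your own machine, the single-character setup can be approximated with a small std-only timing harness. This is a rough sketch, not the benchmark code that produced the table: the helper names make_haystack, mb_per_s, and measure are hypothetical, and the sketch assumes 1 MB means 1,000,000 bytes, which the source does not specify.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Builds a haystack of `len` bytes: all 'a' except one trailing space.
fn make_haystack(len: usize) -> String {
    assert!(len > 0, "haystack must hold at least the trailing space");
    let mut s = "a".repeat(len - 1);
    s.push(' ');
    s
}

/// Converts a byte count and elapsed time to MB/s.
/// Assumption: 1 MB = 1,000,000 bytes.
fn mb_per_s(bytes: usize, elapsed: Duration) -> f64 {
    bytes as f64 / 1_000_000.0 / elapsed.as_secs_f64()
}

/// Runs `search` over `haystack` `iters` times and returns the last
/// result together with the average throughput in MB/s.
fn measure<F>(haystack: &str, iters: u32, mut search: F) -> (Option<usize>, f64)
where
    F: FnMut(&str) -> Option<usize>,
{
    let start = Instant::now();
    let mut last = None;
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the search.
        last = black_box(search(black_box(haystack)));
    }
    let elapsed = start.elapsed();
    (last, mb_per_s(haystack.len() * iters as usize, elapsed))
}
```

For example, measure(&make_haystack(5 * 1024 * 1024), 100, |s| s.find(' ')) times the s.find(' ') row. A dedicated benchmarking harness gives more stable results than a hand-rolled loop, so treat the output as a rough indication only.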

Set of 3 characters

Searching for <, >, and & in a 5 MiB string of 'a' characters with a single ampersand at the end:

Method                                          Speed
ascii_chars!(/* ... */).find(s)                 11513 MB/s
s.as_bytes().iter().position(|&c| /* ... */)    1644 MB/s
s.find(&[/* ... */][..])                        1079 MB/s
s.find(|c| /* ... */)                           1084 MB/s

Set of 5 characters

Searching for <, >, &, ', and " in a 5 MiB string of 'a' characters with a single ampersand at the end:

Method                                          Speed
ascii_chars!(/* ... */).find(s)                 11504 MB/s
s.as_bytes().iter().position(|&c| /* ... */)    812 MB/s
s.find(&[/* ... */][..])                        538 MB/s
s.find(|c| /* ... */)                           1082 MB/s

Substring

Searching for "xyzzy" in a 5 MiB string of 'a' characters with the string "xyzzy" at the end:

Method                                          Speed
Substring::new("xyzzy").find(s)                 11475 MB/s
s.find("xyzzy")                                 5391 MB/s

Macros

ascii_chars

A convenience constructor for an AsciiChars that automatically implements a fallback. Provide 1 to 16 characters.

bytes

A convenience constructor for a Bytes that automatically implements a fallback. Provide 1 to 16 bytes.

Structs

AsciiChars

Searches a string for a set of ASCII characters. Up to 16 characters may be used.

ByteSubstring

Searches a slice for the first occurrence of the subslice.

Bytes

Searches a slice for a set of bytes. Up to 16 bytes may be used.

Substring

Searches a string for the first occurrence of the substring.

Type Definitions

AsciiCharsConst

A convenience type that can be used in a constant or static.

ByteSubstringConst

A convenience type that can be used in a constant or static.

BytesConst

A convenience type that can be used in a constant or static.

SubstringConst

A convenience type that can be used in a constant or static.