Macro safe_arch::string_search_for_mask

macro_rules! string_search_for_mask {
    ([$needle:expr, $needle_len:expr], [$haystack:expr, $haystack_len:expr], $char_type:tt, $search_op:tt, $mask_style:tt) => { ... };
    ($needle:expr, $haystack:expr, $char_type:tt, $search_op:tt, $mask_style:tt) => { ... };
    (@_char_type u8) => { ... };
    (@_char_type u16) => { ... };
    (@_char_type i8) => { ... };
    (@_char_type i16) => { ... };
    (@_char_type $unknown:tt) => { ... };
    (@_search_op EqAny) => { ... };
    (@_search_op CmpRanges) => { ... };
    (@_search_op CmpEqEach) => { ... };
    (@_search_op CmpEqOrdered) => { ... };
    (@_search_op $unknown:tt) => { ... };
    (@_mask_style BitMask) => { ... };
    (@_mask_style UnitMask) => { ... };
    (@_mask_style $unknown:tt) => { ... };
    (@_raw_explicit_len $needle:expr, $needle_len:expr, $haystack:expr, $haystack_len:expr, $imm:expr) => { ... };
    (@_raw_implicit_len $needle:expr, $haystack:expr, $imm:expr) => { ... };
}
This is supported with target feature sse4.2 only.

Looks for $needle in $haystack and returns a mask showing where the matches were.

This is a fairly flexible operation, and so I apologize in advance.

  • The "needle" is the string you're looking for.
  • The "haystack" is the string you're looking inside of.
  • The lengths of each string can be "explicit" or "implicit".
    • "explicit" is specified with [str, len] pairs.
    • "implicit" just ends at the first \0.
    • Either way a string doesn't go past the end of the register.
  • You need to pick a "char type", which can be any of u8, i8, u16, i16. These operations always operate on m128i registers, but the interpretation of the data is configurable.
  • You need to pick the search operation, which determines how the needle is compared to the haystack:
    • EqAny: Matches when any haystack character equals any needle character, regardless of position.
    • CmpRanges: Interprets consecutive pairs of characters in the needle as (low..=high) ranges to compare each haystack character to.
    • CmpEqEach: Matches when a character position in the needle is equal to the character at the same position in the haystack.
    • CmpEqOrdered: Matches when the complete needle string is a substring somewhere in the haystack. The match is reported at the position where the substring starts.
  • Finally, you need to specify whether you want a BitMask or a UnitMask.
    • With a BitMask, each bit in the output is set if there was a match at that position of the haystack.
    • With a UnitMask, each "unit" in the output is set to all ones if there was a match at that position of the haystack, and to zero otherwise. The size of a unit is set by the "char type" you select (either 8 bits or 16 bits at a time).
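
The length and mask-style rules above can be sketched in plain scalar Rust. This is only an illustration of the rules for the u8 char type, not how the macro is implemented; `implicit_len_u8` and `bit_mask_to_unit_mask_u8` are hypothetical helper names that do not exist in safe_arch.

```rust
/// Length of an "implicit" u8 string: the index of the first 0 byte,
/// or 16 if the register holds no 0 byte at all.
/// (Hypothetical helper, for illustration only.)
fn implicit_len_u8(reg: &[u8; 16]) -> usize {
    reg.iter().position(|&b| b == 0).unwrap_or(16)
}

/// Expands a 16-bit BitMask into the equivalent u8 UnitMask:
/// lane `i` is all ones (-1 as an i8) when bit `i` is set, otherwise 0.
/// (Hypothetical helper, for illustration only.)
fn bit_mask_to_unit_mask_u8(bits: u16) -> [i8; 16] {
    let mut lanes = [0i8; 16];
    for (i, lane) in lanes.iter_mut().enumerate() {
        if (bits >> i) & 1 == 1 {
            *lane = -1;
        }
    }
    lanes
}
```

For instance, `implicit_len_u8(b"e\0______________")` gives 1, and a string with no `\0` in it is treated as using all 16 bytes of the register.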

It's a lot to take in. Hopefully the examples below can help clarify how things work. They all use u8 since Rust string literals are UTF-8, but it's the same with the other character types.

EqAny

let hay: m128i = m128i::from(*b"some test words.");

// explicit needle length
let needle: m128i = m128i::from(*b"e_______________");
let i: u128 =
  string_search_for_mask!([needle, 1], [hay, 16], u8, EqAny, BitMask).into();
assert_eq!(i, 0b0000000001001000);
let i: [i8; 16] =
  string_search_for_mask!([needle, 1], [hay, 16], u8, EqAny, UnitMask).into();
assert_eq!(i, [0, 0, 0, -1, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0]);

// implicit needle length
let needle: m128i = m128i::from(*b"e\0______________");
let i: u128 = string_search_for_mask!(needle, hay, u8, EqAny, BitMask).into();
assert_eq!(i, 0b0000000001001000);

// more than one needle character will match any of them, though we
// don't get info about _which_ needle character matched.
let needle: m128i = m128i::from(*b"et\0_____________");
let i: u128 = string_search_for_mask!(needle, hay, u8, EqAny, BitMask).into();
assert_eq!(i, 0b0000000101101000);
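
As a cross-check on the masks above, here is a minimal scalar sketch of what EqAny computes for u8 characters with a BitMask result. It is a plain-Rust model, not the macro's implementation; `eq_any_bit_mask` is a hypothetical name, and the explicit lengths are expressed as slice lengths.

```rust
/// Scalar model of `EqAny` with a BitMask result for u8 characters.
/// Bit `i` is set when haystack byte `i` equals any needle byte.
/// (Hypothetical helper, for illustration only.)
fn eq_any_bit_mask(needle: &[u8], hay: &[u8]) -> u16 {
    let mut mask = 0u16;
    // A register holds at most 16 u8 characters.
    for (i, h) in hay.iter().take(16).enumerate() {
        if needle.contains(h) {
            mask |= 1u16 << i;
        }
    }
    mask
}
```

Calling `eq_any_bit_mask(b"et", b"some test words.")` reproduces the `0b0000000101101000` mask from the last example.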

CmpRanges

let hay: m128i = m128i::from(*b"some test words.");
// implicit needle length: "am" is read as the single range 'a'..='m'
let needle: m128i = m128i::from(*b"am\0_____________");
let i: u128 =
  string_search_for_mask!(needle, hay, u8, CmpRanges, BitMask).into();
assert_eq!(i, 0b0010000001001100);
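
The pairing of needle characters into ranges can be modelled in scalar Rust as follows. This is a sketch for u8 characters only, with a hypothetical function name; how an unpaired trailing needle character is treated here (it is ignored) is an assumption of the sketch, not documented behavior.

```rust
/// Scalar model of `CmpRanges` with a BitMask result for u8 characters.
/// The needle is read as consecutive (low, high) pairs; bit `i` is set
/// when haystack byte `i` falls inside any `low..=high` range.
/// (Hypothetical helper, for illustration only.)
fn cmp_ranges_bit_mask(needle: &[u8], hay: &[u8]) -> u16 {
    let mut mask = 0u16;
    for (i, &h) in hay.iter().take(16).enumerate() {
        // chunks_exact(2) skips a trailing unpaired byte; that is a
        // simplification made by this sketch.
        if needle.chunks_exact(2).any(|r| (r[0]..=r[1]).contains(&h)) {
            mask |= 1u16 << i;
        }
    }
    mask
}
```

A needle with more than one pair checks several ranges at once: `cmp_ranges_bit_mask(b"09az", b"A1b-")` gives `0b0110`, since only `1` and `b` fall in `0..=9` or `a..=z`.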

CmpEqEach

let hay: m128i = m128i::from(*b"some test words.");
// the needle and haystack only agree at positions 5 through 8 ("test")
let needle: m128i = m128i::from(*b"_____test_______");
let i: u128 =
  string_search_for_mask!(needle, hay, u8, CmpEqEach, BitMask).into();
assert_eq!(i, 0b0000000111100000);
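
A scalar model of the position-by-position comparison might look like the following. It is a sketch with a hypothetical function name, covering u8 characters only. Positions past the end of either string are reported as "no match" here, which is a simplification; the hardware's handling of those positions is not modelled.

```rust
/// Scalar model of `CmpEqEach` with a BitMask result for u8 characters.
/// Bit `i` is set when needle byte `i` equals haystack byte `i`.
/// (Hypothetical helper, for illustration only.)
fn cmp_eq_each_bit_mask(needle: &[u8], hay: &[u8]) -> u16 {
    let mut mask = 0u16;
    // zip stops at the shorter string, so positions past the end of
    // either input never set a bit in this sketch.
    for (i, (n, h)) in needle.iter().zip(hay.iter()).take(16).enumerate() {
        if n == h {
            mask |= 1u16 << i;
        }
    }
    mask
}
```

With the needle and haystack from the example above, this gives the same `0b0000000111100000` mask.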

CmpEqOrdered

let hay: m128i = m128i::from(*b"some test words.");
let needle: m128i = m128i::from(*b"words\0__________");
let i: u128 =
  string_search_for_mask!(needle, hay, u8, CmpEqOrdered, BitMask).into();
assert_eq!(i, 0b0000010000000000); // one bit at the start of the match
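
The substring search can be modelled in scalar Rust as a sliding window over the haystack. This is a simplified sketch with a hypothetical function name, for u8 characters only: it reports only complete occurrences of the needle, treats an empty needle as "no match", and does not model what the hardware reports for a needle that runs off the end of the register.

```rust
/// Scalar model of `CmpEqOrdered` with a BitMask result for u8 characters.
/// Bit `i` is set when the whole needle occurs in the haystack starting
/// at position `i`.
/// (Hypothetical helper, for illustration only.)
fn cmp_eq_ordered_bit_mask(needle: &[u8], hay: &[u8]) -> u16 {
    let mut mask = 0u16;
    // `windows(0)` would panic, so an empty needle is handled up front.
    if needle.is_empty() {
        return mask;
    }
    for (i, window) in hay.windows(needle.len()).take(16).enumerate() {
        if window == needle {
            mask |= 1u16 << i;
        }
    }
    mask
}
```

If the needle occurs more than once, each starting position gets its own bit: `cmp_eq_ordered_bit_mask(b"es", b"yes, yes")` gives `0b1000010`.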