Function count_words 

pub fn count_words(data: &[u8]) -> u64

Counts words using SIMD-accelerated whitespace detection and popcount.

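A word is a maximal run of non-whitespace bytes (the GNU wc definition). A word starts at each transition from whitespace to non-whitespace; the beginning of the input counts as preceded by whitespace, so a leading non-whitespace byte also starts a word.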
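The definition above can be sketched as a byte-at-a-time reference implementation. This is an illustrative sketch, not the crate's code: `is_ws` and `count_words_reference` are hypothetical names, and the whitespace set is taken from the x86_64 description on this page (0x09 through 0x0D, plus 0x20).

```rust
/// Whitespace as described on this page: 0x09..=0x0D (tab, LF, VT, FF, CR) or 0x20 (space).
fn is_ws(b: u8) -> bool {
    (0x09..=0x0D).contains(&b) || b == 0x20
}

/// Hypothetical reference implementation: counts whitespace -> non-whitespace transitions.
fn count_words_reference(data: &[u8]) -> u64 {
    // Treat the start of the input as if it were preceded by whitespace.
    let mut prev_ws = true;
    let mut words = 0u64;
    for &b in data {
        let ws = is_ws(b);
        if prev_ws && !ws {
            words += 1; // this byte begins a new word
        }
        prev_ws = ws;
    }
    words
}
```

For example, `count_words_reference(b"hello  world\n")` returns 2. A slow reference like this is mainly useful as an oracle when testing a vectorized version.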

On x86_64, SSE2 range comparisons detect whitespace in 16-byte vectors: a byte b is whitespace when (0x09 <= b <= 0x0D) || (b == 0x20). Each iteration processes four vectors (64 bytes); the four movemask results are combined into a 64-bit bitmask, and word boundaries are counted from it with popcount.
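One way the SSE2 path described above might look is sketched below. It is not the crate's actual source: `ws_mask16` and `count_words_sketch` are hypothetical names, the carry handling between blocks is an assumption, and a portable `ws_mask16` is included only so the sketch compiles on other targets. Signed byte comparisons are sufficient for the range test because bytes at or above 0x80 are negative as `i8` and therefore fail `b > 0x08`.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

fn is_ws(b: u8) -> bool {
    (0x09..=0x0D).contains(&b) || b == 0x20
}

/// Whitespace bitmask for 16 bytes: bit i is set when byte i is whitespace.
#[cfg(target_arch = "x86_64")]
fn ws_mask16(chunk: &[u8]) -> u64 {
    assert!(chunk.len() >= 16);
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned load
    // reads exactly 16 bytes, which the assertion above guarantees exist.
    unsafe {
        let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
        let gt_08 = _mm_cmpgt_epi8(v, _mm_set1_epi8(0x08)); // b > 0x08
        let lt_0e = _mm_cmpgt_epi8(_mm_set1_epi8(0x0E), v); // b < 0x0E
        let space = _mm_cmpeq_epi8(v, _mm_set1_epi8(0x20)); // b == 0x20
        let ws = _mm_or_si128(_mm_and_si128(gt_08, lt_0e), space);
        (_mm_movemask_epi8(ws) as u32 as u64) & 0xFFFF
    }
}

/// Portable stand-in so the sketch builds on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn ws_mask16(chunk: &[u8]) -> u64 {
    chunk[..16]
        .iter()
        .enumerate()
        .fold(0, |m, (i, &b)| m | ((is_ws(b) as u64) << i))
}

fn count_words_sketch(data: &[u8]) -> u64 {
    let mut words = 0u64;
    let mut prev_ws = 1u64; // start of input behaves like whitespace
    let mut blocks = data.chunks_exact(64);
    for block in &mut blocks {
        // Combine four 16-bit masks into one 64-bit whitespace mask.
        let ws = ws_mask16(&block[0..16])
            | (ws_mask16(&block[16..32]) << 16)
            | (ws_mask16(&block[32..48]) << 32)
            | (ws_mask16(&block[48..64]) << 48);
        // A word starts where a non-whitespace byte follows a whitespace byte.
        let starts = !ws & ((ws << 1) | prev_ws);
        words += u64::from(starts.count_ones());
        prev_ws = ws >> 63; // carry the last byte's class into the next block
    }
    // Scalar handling for the final partial block.
    for &b in blocks.remainder() {
        let ws = is_ws(b) as u64;
        words += (1 - ws) & prev_ws;
        prev_ws = ws;
    }
    words
}
```

The `prev_ws` carry is what keeps a word that straddles a 64-byte boundary from being counted twice.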

Fallback: a scalar implementation that builds a bitmask for each 64-byte block using a table lookup.
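A minimal sketch of what such a scalar fallback could look like follows. The names `WS_TABLE`, `ws_mask64_scalar`, and `count_words_scalar_sketch` are hypothetical, and the table layout (one byte per entry, 1 for whitespace) is an assumption rather than the crate's actual representation.

```rust
/// 256-entry lookup table built at compile time: 1 for whitespace bytes, 0 otherwise.
const WS_TABLE: [u8; 256] = {
    let mut t = [0u8; 256];
    let mut b = 0x09;
    while b <= 0x0D {
        t[b] = 1;
        b += 1;
    }
    t[0x20] = 1;
    t
};

/// Builds a 64-bit whitespace mask for one 64-byte block, one table lookup per byte.
fn ws_mask64_scalar(block: &[u8]) -> u64 {
    let mut mask = 0u64;
    for (i, &b) in block.iter().take(64).enumerate() {
        mask |= u64::from(WS_TABLE[b as usize]) << i;
    }
    mask
}

fn count_words_scalar_sketch(data: &[u8]) -> u64 {
    let mut words = 0u64;
    let mut prev_ws = 1u64; // start of input behaves like whitespace
    let mut blocks = data.chunks_exact(64);
    for block in &mut blocks {
        let ws = ws_mask64_scalar(block);
        // Same bit trick as the vector path: non-whitespace preceded by whitespace.
        let starts = !ws & ((ws << 1) | prev_ws);
        words += u64::from(starts.count_ones());
        prev_ws = ws >> 63;
    }
    // Remaining bytes (fewer than 64) are handled one at a time.
    for &b in blocks.remainder() {
        let ws = u64::from(WS_TABLE[b as usize]);
        words += (1 - ws) & prev_ws;
        prev_ws = ws;
    }
    words
}
```

Sharing the bitmask-and-popcount step with the vector path means only the mask construction differs between the two.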