pub fn count_words(data: &[u8]) -> u64
Counts the words in data using locale-aware three-state logic. UTF-8 is the default locale.
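The summary above does not say what the three states are. The following is a minimal sketch, assuming they correspond to three character classes: whitespace (ends a word), word characters (start or continue a word), and "transparent" characters such as control characters or invalid bytes (leave the in-word state unchanged). The names `Class`, `classify`, and `count_words_sketch` are hypothetical and are not part of the documented API. The sketch handles only UTF-8 and does not model other locales.

```rust
/// Hypothetical character classes for a three-state word counter.
#[derive(Clone, Copy)]
enum Class {
    Space,       // ends the current word
    Word,        // starts or continues a word
    Transparent, // leaves the in-word state unchanged
}

/// Assumed classification: Unicode whitespace splits words, control
/// characters are transparent, everything else is a word character.
fn classify(c: char) -> Class {
    if c.is_whitespace() {
        Class::Space
    } else if c.is_control() {
        Class::Transparent
    } else {
        Class::Word
    }
}

/// Illustrative stand-in for `count_words`, UTF-8 only.
pub fn count_words_sketch(data: &[u8]) -> u64 {
    let mut words = 0u64;
    let mut in_word = false;
    let mut rest = data;

    while !rest.is_empty() {
        // Decode the longest valid UTF-8 prefix; `skip` is the number of
        // invalid bytes that follow it (treated as transparent here).
        let (valid, skip) = match std::str::from_utf8(rest) {
            Ok(s) => (s, 0),
            Err(e) => {
                let good = e.valid_up_to();
                let s = std::str::from_utf8(&rest[..good]).unwrap();
                // `error_len()` is None for a truncated sequence at the
                // end of input; in that case skip the remaining bytes.
                let bad = e.error_len().unwrap_or(rest.len() - good);
                (s, bad)
            }
        };

        for c in valid.chars() {
            match classify(c) {
                Class::Space => in_word = false,
                Class::Word => {
                    if !in_word {
                        words += 1;
                        in_word = true;
                    }
                }
                Class::Transparent => {}
            }
        }

        rest = &rest[valid.len() + skip..];
    }
    words
}
```

Under these assumptions, `count_words_sketch(b"hello world")` counts two words, and a NUL byte inside a word (`b"ab\x00cd"`) does not split it. Whether the real function treats control characters and invalid bytes this way is not stated in the source.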