use-token

Composable tokenization primitives for RustUse.

use-token keeps tokenization explicit and small. It handles whitespace splitting, conservative word tokenization, lightweight sentence boundaries, and character spans without claiming to be a full NLP parser.

Included primitives

  • tokenize_whitespace
  • tokenize_words
  • tokenize_sentences
  • tokenize_chars
  • token_count
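
The main practical difference between the split-level primitives is how they treat punctuation. Below is a minimal sketch of that contrast, assuming each tokenizer takes a &str and returns a vector of tokens, and that tokenize_whitespace splits purely on whitespace while tokenize_words strips surrounding punctuation; the crate's exact signatures and return types may differ.

use use_token::{tokenize_whitespace, tokenize_words};

// Assumption: a pure whitespace split keeps punctuation attached to words...
assert_eq!(tokenize_whitespace("Hello, world!"), ["Hello,", "world!"]);
// ...while conservative word tokenization strips it off.
assert_eq!(tokenize_words("Hello, world!"), ["Hello", "world"]);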

Example

use use_token::{token_count, tokenize_sentences, tokenize_words};

// Punctuation does not create extra tokens: "Hello, world!" counts as two.
assert_eq!(token_count("Hello, world!"), 2);
// The internal apostrophe is preserved, so "don't" remains one token.
assert_eq!(tokenize_words("don't stop").len(), 2);
// Lightweight boundary detection treats "." and "!" as sentence ends.
assert_eq!(tokenize_sentences("One. Two!").len(), 2);
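
The example above doesn't exercise tokenize_chars. As a purely illustrative sketch, assume it yields one span per Unicode character rather than per byte; the actual span type and return value are not documented here and may differ.

use use_token::tokenize_chars;

// Assumption (not confirmed above): one entry per character, so a
// multi-byte character such as "é" still counts once.
assert_eq!(tokenize_chars("héllo").len(), 5);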