tokenmonster 0.1.0

Greedy tiktoken-like tokenizer with embedded vocabulary (cl100k-base approximator)
# tokenmonster

Greedy tiktoken-like tokenizer with an embedded vocabulary, intended for fast, allocation-light tokenization.

## Features
- Greedy tokenization that approximates the cl100k-base vocabulary used by tiktoken
- Zero-copy where possible; minimal allocations
- Optional tiny test vocabulary via the `tiny_vocab` feature
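
To illustrate what greedy tokenization means here, the sketch below shows a minimal longest-match tokenizer over a toy vocabulary. It is not this crate's API: `TOY_VOCAB`, `build_vocab`, and `greedy_tokenize` are hypothetical names, and treating "greedy" as "longest match first" is an assumption. The real embedded vocabulary is far larger and may be stored differently.

```rust
use std::collections::HashMap;

// Hypothetical toy vocabulary; a token's ID is its index in this list.
const TOY_VOCAB: &[&str] = &["h", "e", "l", "o", " ", "he", "ll", "hello"];

/// Build a byte-slice -> ID lookup table and record the longest token length.
fn build_vocab() -> (HashMap<&'static [u8], u32>, usize) {
    let mut map = HashMap::new();
    let mut max_len = 0;
    for (id, tok) in TOY_VOCAB.iter().enumerate() {
        map.insert(tok.as_bytes(), id as u32);
        max_len = max_len.max(tok.len());
    }
    (map, max_len)
}

/// Greedy longest-match: at each position, emit the longest vocabulary
/// entry that is a prefix of the remaining input. Returns `None` if some
/// position cannot be matched by any entry.
fn greedy_tokenize(
    input: &[u8],
    vocab: &HashMap<&'static [u8], u32>,
    max_len: usize,
) -> Option<Vec<u32>> {
    let mut ids = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let longest = max_len.min(input.len() - pos);
        let mut matched = false;
        // Try candidate lengths from longest to shortest.
        for len in (1..=longest).rev() {
            // The lookup borrows a slice of the input; no bytes are copied.
            if let Some(&id) = vocab.get(&input[pos..pos + len]) {
                ids.push(id);
                pos += len;
                matched = true;
                break;
            }
        }
        if !matched {
            return None;
        }
    }
    Some(ids)
}
```

With this toy vocabulary, `greedy_tokenize(b"hello hell", &vocab, max_len)` yields the IDs for `hello`, ` `, `he`, `ll`. The only allocation is the output vector, which is the kind of allocation-light behavior the feature list describes. Note that a greedy match will not always reproduce a BPE tokenizer's merges exactly, hence "approximator".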

License: MIT