Module shared

SharedDictionary: a pre-parsed trie plus the raw dictionary data, shareable across multiple tokenizer instances via Arc. Sharing avoids duplicating the trie (~150 MB) in every tokenizer.

Structs

SharedDictionary
Shared dictionary state that can be cloned cheaply across tokenizers.
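
As a rough illustration of why cloning can be cheap, the sketch below wraps the heavy state in an Arc so that each clone only bumps a reference count. This is not the crate's actual definition: the field names, `DictInner`, `Tokenizer`, `handle_count`, `raw_len`, and `spawn_tokenizers` are all hypothetical stand-ins, and the trie is represented by a plain Vec<u32>.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real SharedDictionary. The actual fields
// are not shown on this page; these are assumptions for illustration.
#[derive(Clone)]
pub struct SharedDictionary {
    inner: Arc<DictInner>,
}

// The expensive state lives behind the Arc, so it is allocated once.
struct DictInner {
    trie: Vec<u32>, // placeholder for the pre-parsed trie
    raw: Vec<u8>,   // placeholder for the raw dictionary bytes
}

impl SharedDictionary {
    pub fn new(trie: Vec<u32>, raw: Vec<u8>) -> Self {
        Self { inner: Arc::new(DictInner { trie, raw }) }
    }

    // Number of live handles pointing at the same underlying data.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }

    pub fn raw_len(&self) -> usize {
        self.inner.raw.len()
    }

    pub fn trie_len(&self) -> usize {
        self.inner.trie.len()
    }
}

// Hypothetical tokenizer that holds a handle instead of its own trie copy.
pub struct Tokenizer {
    dict: SharedDictionary,
}

impl Tokenizer {
    pub fn new(dict: SharedDictionary) -> Self {
        Self { dict }
    }

    pub fn dict_bytes(&self) -> usize {
        self.dict.raw_len()
    }
}

// Build one tokenizer per thread; every clone reuses the same allocation.
pub fn spawn_tokenizers(dict: &SharedDictionary, n: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let tok = Tokenizer::new(dict.clone());
            thread::spawn(move || tok.dict_bytes())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Under these assumptions, `dict.clone()` copies only a pointer and increments a counter, so the per-tokenizer cost stays constant regardless of how large the trie is.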

Enums

DictData
Dictionary data that can be either an owned Vec<u8> (when mutation was needed for connection inhibitions) or a memory-mapped file (zero-copy, OS-managed pages).
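
The owned-versus-mapped split can be sketched as a two-variant enum that exposes both cases as a byte slice. The variant names, `MappedBytes`, `as_bytes`, `is_owned`, and `make_mut` below are assumptions, not the crate's API. `MappedBytes` is a std-only stand-in so the example compiles without dependencies; real code would more likely hold a memory-map handle such as `memmap2::Mmap`. The copy-on-first-mutation behavior is likewise a guess at how the owned case might arise.

```rust
// Std-only stand-in for a read-only memory-mapped file (hypothetical).
pub struct MappedBytes(Box<[u8]>);

impl MappedBytes {
    pub fn new(bytes: Vec<u8>) -> Self {
        MappedBytes(bytes.into_boxed_slice())
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.0
    }
}

// Hypothetical shape of DictData; variant names are assumptions.
pub enum DictData {
    // Heap copy, used once the bytes had to be modified.
    Owned(Vec<u8>),
    // Read-only view backed by OS-managed pages (simulated here).
    Mapped(MappedBytes),
}

impl DictData {
    // Read access is uniform regardless of the backing storage.
    pub fn as_bytes(&self) -> &[u8] {
        match self {
            DictData::Owned(v) => v.as_slice(),
            DictData::Mapped(m) => m.as_slice(),
        }
    }

    pub fn is_owned(&self) -> bool {
        matches!(self, DictData::Owned(_))
    }

    // Mutable access: a mapped dictionary is copied to the heap first,
    // since the mapping is treated as read-only in this sketch.
    pub fn make_mut(&mut self) -> &mut Vec<u8> {
        if let DictData::Mapped(m) = self {
            let copy = m.as_slice().to_vec();
            *self = DictData::Owned(copy);
        }
        match self {
            DictData::Owned(v) => v,
            DictData::Mapped(_) => unreachable!("converted to Owned above"),
        }
    }
}
```

With this shape, readers that never mutate the data stay on the zero-copy path, and only a caller that needs to patch bytes pays for the heap copy.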