pub struct HeuristicEstimator;
Byte / character heuristic tuned for modern LLM tokenizers.
- ASCII bytes are counted as ceil(bytes / 4). This is the OpenAI rule of thumb, accurate within ~20% for English prose under the cl100k_base and o200k_base tokenizers.
- Non-ASCII characters (Unicode scalars outside [0x00, 0x7F]) are counted as ceil(chars / 1.5). This is roughly one token per CJK glyph and two per emoji or Arabic/Cyrillic run, again within ~25% of actual tokenizer output.
The two contributions are summed. Good enough for budget packing; swap in a real tokenizer for exact accounting.
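The rule above can be sketched as a stand-alone function. This is an illustration under stated assumptions, not the crate's actual source: the name estimate_tokens is hypothetical, and the TokenEstimator method that HeuristicEstimator implements is not shown on this page. The sketch assumes each contribution is rounded up separately before summing, as the description suggests.

```rust
/// Hypothetical stand-alone version of the heuristic described above.
/// The function name and signature are assumptions for illustration only.
fn estimate_tokens(text: &str) -> usize {
    // In UTF-8, bytes in 0x00..=0x7F only ever encode ASCII characters,
    // so counting ASCII bytes equals counting ASCII characters.
    let ascii_bytes = text.bytes().filter(|b| b.is_ascii()).count();

    // Every Unicode scalar outside [0x00, 0x7F] counts as one non-ASCII char.
    let non_ascii_chars = text.chars().filter(|c| !c.is_ascii()).count();

    // ceil(bytes / 4) using integer arithmetic.
    let ascii_tokens = (ascii_bytes + 3) / 4;

    // ceil(chars / 1.5) == ceil(2 * chars / 3), again in integer arithmetic.
    let non_ascii_tokens = (2 * non_ascii_chars + 2) / 3;

    // The two contributions are summed.
    ascii_tokens + non_ascii_tokens
}
```

Under these assumptions, an 11-byte ASCII string such as "hello world" yields ceil(11 / 4) = 3, and a run of three CJK characters yields ceil(3 / 1.5) = 2. Integer arithmetic avoids floating-point rounding in the division by 1.5.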
Trait Implementations
impl Clone for HeuristicEstimator
fn clone(&self) -> HeuristicEstimator
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for HeuristicEstimator
impl Default for HeuristicEstimator
fn default() -> HeuristicEstimator
Returns the “default value” for a type.
impl TokenEstimator for HeuristicEstimator
impl Copy for HeuristicEstimator
Auto Trait Implementations
impl Freeze for HeuristicEstimator
impl RefUnwindSafe for HeuristicEstimator
impl Send for HeuristicEstimator
impl Sync for HeuristicEstimator
impl Unpin for HeuristicEstimator
impl UnsafeUnpin for HeuristicEstimator
impl UnwindSafe for HeuristicEstimator
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.