pub struct WordSymbolizer { /* private fields */ }
Symbolizer for text input using word-level frequency binning.
Tokenizes on whitespace, computes word frequencies, and bins words into
num_symbols frequency tiers. This detects semantic redundancy that
byte-level symbolizers miss (e.g., “Home About Blog Contact” has low S_T).
§Example
use epsilon_engine::symbolize::WordSymbolizer;
let text = "the quick brown fox jumps over the lazy dog the fox";
let symbolizer = WordSymbolizer::new(4);
let symbols = symbolizer.symbolize(text).unwrap();
// "the" (3×) and "fox" (2×) map to high-frequency bins
// "quick", "brown", etc. (1×) map to low-frequency bins
Implementations§
impl WordSymbolizer
pub fn symbolize(&self, text: &str) -> Result<Vec<u8>, SymbolizeError>
Symbolize text into a sequence of frequency-binned symbols.
§Algorithm
- Tokenize on whitespace (split by char::is_whitespace)
- Count word frequencies
- Bin words into num_symbols equal-frequency tiers
- Map each word occurrence to its tier symbol
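The four steps above can be sketched in a few lines. This is an illustration only, not the crate's implementation: symbolize_sketch and SketchError are hypothetical names, and the binning rule (rank the distinct frequency values, then spread those ranks evenly over num_symbols tiers) is one assumed reading of "equal-frequency tiers". The real WordSymbolizer may assign tiers differently.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's SymbolizeError (assumption).
#[derive(Debug, PartialEq)]
enum SketchError {
    EmptyInput,
}

/// Illustrative sketch: maps each word to a frequency tier in 0..num_symbols.
/// Assumes num_symbols >= 1; higher tiers mean more frequent words.
fn symbolize_sketch(text: &str, num_symbols: u8) -> Result<Vec<u8>, SketchError> {
    // Step 1: tokenize on whitespace, dropping empty fragments.
    let words: Vec<&str> = text
        .split(char::is_whitespace)
        .filter(|w| !w.is_empty())
        .collect();
    if words.is_empty() {
        return Err(SketchError::EmptyInput);
    }

    // Step 2: count how often each word occurs.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for w in &words {
        *counts.entry(*w).or_insert(0) += 1;
    }

    // Step 3: rank the distinct frequency values (ascending) and spread the
    // ranks over the tiers, so words with equal counts share a tier.
    let mut distinct: Vec<usize> = counts.values().copied().collect();
    distinct.sort_unstable();
    distinct.dedup();
    let tier_of = |freq: usize| -> u8 {
        let rank = distinct
            .binary_search(&freq)
            .expect("frequency was taken from the counts map");
        (rank * num_symbols as usize / distinct.len()) as u8
    };

    // Step 4: replace every word occurrence with its tier symbol.
    Ok(words.iter().map(|w| tier_of(counts[w])).collect())
}
```

Under this assumed rule, calling symbolize_sketch on the example sentence with 4 tiers places "the" in a higher tier than "fox", and "fox" in a higher tier than the words that occur once. Ranking distinct frequency values rather than individual words keeps the output deterministic when several words tie.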
§Errors
Returns SymbolizeError::EmptyInput if text contains no words.
Trait Implementations§
impl Clone for WordSymbolizer
fn clone(&self) -> WordSymbolizer
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for WordSymbolizer
impl RefUnwindSafe for WordSymbolizer
impl Send for WordSymbolizer
impl Sync for WordSymbolizer
impl Unpin for WordSymbolizer
impl UnsafeUnpin for WordSymbolizer
impl UnwindSafe for WordSymbolizer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.