pub struct TokenizerBridge { /* private fields */ }
Thin wrapper around tokenizers::Tokenizer.
On non-WASM targets, delegates to the full HuggingFace tokenizers library.
On WASM targets, all methods return a RuntimeError::Tokenizer error.
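The target split described above is commonly expressed with `cfg` attributes. The following is only a sketch of that pattern, not the crate's actual source: `SketchBridge`, `SketchError`, the `String` payload of its `Tokenizer` variant, and the placeholder decode logic are all hypothetical stand-ins.

```rust
// Hypothetical stand-ins; the real RuntimeError and TokenizerBridge may differ.
#[derive(Debug, PartialEq)]
enum SketchError {
    Tokenizer(String),
}

type SketchResult<T> = Result<T, SketchError>;

struct SketchBridge;

impl SketchBridge {
    // Native targets: the real bridge would delegate to tokenizers::Tokenizer here.
    // This placeholder just joins the IDs so the sketch stays self-contained.
    #[cfg(not(target_arch = "wasm32"))]
    fn decode(&self, ids: &[u32]) -> SketchResult<String> {
        let parts: Vec<String> = ids.iter().map(|id| id.to_string()).collect();
        Ok(parts.join(" "))
    }

    // WASM targets: same signature, but the call always fails with a tokenizer error.
    #[cfg(target_arch = "wasm32")]
    fn decode(&self, _ids: &[u32]) -> SketchResult<String> {
        Err(SketchError::Tokenizer(
            "tokenizer backend is unavailable on WASM".to_string(),
        ))
    }
}
```

Because both variants share one signature, calling code compiles unchanged on either target and only needs to handle the error case.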
Implementations
impl TokenizerBridge
pub fn from_file(path: &str) -> RuntimeResult<Self>
Load a tokenizer from a JSON file.
pub fn decode(&self, ids: &[u32]) -> RuntimeResult<String>
Decode token IDs to text.
pub fn vocab_size(&self) -> usize
Get the vocabulary size.
pub fn new_decode_stream(&self, skip_special_tokens: bool) -> DecodeStreamState
Create a fresh decode-stream state for one generation request.
See DecodeStreamState and Self::step_decode for the streaming
decode protocol. Use this instead of repeatedly calling
Self::decode with single-token slices, which mishandles characters
whose UTF-8 bytes are split across several tokens.
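To illustrate why per-token decoding goes wrong, the sketch below uses only the standard library. It assumes a byte-level vocabulary in which a "token" is a raw byte chunk; `decode_per_token` and `decode_joined` are hypothetical helpers, not part of TokenizerBridge.

```rust
/// Decode each token (here, a raw byte chunk) on its own, the way repeated
/// single-token decode calls would. Incomplete sequences turn into U+FFFD.
fn decode_per_token(tokens: &[&[u8]]) -> String {
    tokens
        .iter()
        .map(|bytes| String::from_utf8_lossy(bytes).into_owned())
        .collect()
}

/// Decode the concatenated bytes in one pass, so multi-byte characters
/// are reassembled before conversion.
fn decode_joined(tokens: &[&[u8]]) -> String {
    let joined: Vec<u8> = tokens.iter().flat_map(|b| b.iter().copied()).collect();
    String::from_utf8_lossy(&joined).into_owned()
}
```

With the tokens `b"caf"`, `[0xC3]`, `[0xA9]` (the two bytes of "é" split across tokens), `decode_joined` yields "café", while `decode_per_token` yields "caf" followed by two replacement characters.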
pub fn step_decode(
    &self,
    state: &mut DecodeStreamState,
    id: u32,
) -> RuntimeResult<Option<String>>
Advance the decode stream by one token.
Returns Ok(Some(text)) only when the buffered bytes form a complete
UTF-8 chunk (which may span several previous tokens, as is common for
CJK text and emoji); returns Ok(None) when more tokens are needed before
any well-formed text can be emitted. Callers must not print an empty
string when Ok(None) is returned; they should wait for the next token.
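The calling pattern can be sketched with standard-library stand-ins. This is an assumption-laden illustration, not the bridge's implementation: `PendingBytes` stands in for DecodeStreamState, `step` stands in for step_decode, and a "token" is a raw byte chunk rather than a u32 ID.

```rust
/// Hypothetical stand-in for DecodeStreamState: bytes received but not yet emitted.
#[derive(Default)]
struct PendingBytes {
    buf: Vec<u8>,
}

/// Hypothetical stand-in for step_decode. Returns Some(text) once the buffer
/// holds complete UTF-8, and None while a character is still incomplete.
fn step(state: &mut PendingBytes, token: &[u8]) -> Option<String> {
    state.buf.extend_from_slice(token);
    match std::str::from_utf8(&state.buf) {
        Ok(text) => {
            let out = text.to_owned();
            state.buf.clear();
            Some(out)
        }
        // error_len() == None means the input ended mid-character: keep buffering.
        Err(e) if e.error_len().is_none() => None,
        // Genuinely invalid bytes: emit a lossy rendering instead of stalling.
        Err(_) => {
            let out = String::from_utf8_lossy(&state.buf).into_owned();
            state.buf.clear();
            Some(out)
        }
    }
}

/// Caller loop: append text only when a step yields some, and skip None.
fn stream_to_string(tokens: &[&[u8]]) -> String {
    let mut state = PendingBytes::default();
    let mut out = String::new();
    for token in tokens {
        if let Some(text) = step(&mut state, token) {
            out.push_str(&text);
        }
    }
    out
}
```

A real caller would use one state per generation request and print each returned chunk as it arrives, rather than collecting into a String.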
Auto Trait Implementations
impl !Freeze for TokenizerBridge
impl RefUnwindSafe for TokenizerBridge
impl Send for TokenizerBridge
impl Sync for TokenizerBridge
impl Unpin for TokenizerBridge
impl UnsafeUnpin for TokenizerBridge
impl UnwindSafe for TokenizerBridge
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.