Struct StreamingDecoder 

pub struct StreamingDecoder<'a> { /* private fields */ }

A streaming decoder that yields well-formed UTF-8 slices as tokens arrive.

The decoder holds a reference to its parent OxiTokenizer so that special-token handling, vocabulary lookup and byte-level decoding remain consistent with OxiTokenizer::decode.

Implementations

impl<'a> StreamingDecoder<'a>

pub fn new(tokenizer: &'a OxiTokenizer) -> Self

Create a fresh decoder tied to tokenizer.

pub fn push_token(&mut self, id: u32) -> Option<String>

Push a single token ID and return the next well-formed UTF-8 slice, if any. Returns None when the token's bytes, appended to any pending bytes, do not complete at least one UTF-8 character.

The returned String contains every character that became complete as a result of this push. It may hold several characters if the token carries several whole code points.
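
The buffering behaviour described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the crate's implementation: `PendingUtf8` and `push_bytes` are hypothetical names, and the sketch takes a token's raw bytes directly instead of resolving a token ID through the tokenizer's vocabulary.

```rust
/// Hypothetical stand-in for the decoder's pending buffer.
/// Not part of the documented API; for illustration only.
#[derive(Default)]
pub struct PendingUtf8 {
    pending: Vec<u8>,
}

impl PendingUtf8 {
    /// Append one token's raw bytes and return the characters that are now complete.
    pub fn push_bytes(&mut self, bytes: &[u8]) -> Option<String> {
        self.pending.extend_from_slice(bytes);

        // Length of the longest prefix of the buffer that is valid UTF-8.
        let valid = match std::str::from_utf8(&self.pending) {
            Ok(s) => s.len(),
            Err(e) => e.valid_up_to(),
        };
        if valid == 0 {
            // Nothing complete yet: keep everything pending.
            return None;
        }

        // Hand out the complete prefix; the incomplete tail stays buffered.
        let ready: Vec<u8> = self.pending.drain(..valid).collect();
        String::from_utf8(ready).ok()
    }

    /// Bytes still waiting for the rest of their UTF-8 sequence.
    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

In this sketch a byte that can never start or continue a valid sequence would stay pending indefinitely; a real decoder has to decide how to handle that case, and the documentation above does not say how StreamingDecoder does.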

pub fn push_tokens(&mut self, ids: &[u32]) -> Option<String>

Push many tokens at once. Equivalent to calling Self::push_token for each ID in order, but returns only once, with all completed characters concatenated.
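
The stated equivalence with repeated single-token pushes can be written as a small fold. This is a sketch, not the crate's code: `push_many` is a hypothetical helper, the `push_one` closure stands in for Self::push_token, and returning None when no character was completed is an assumption based on the Option<String> return type.

```rust
/// Hypothetical helper: apply a per-token push to each ID in order and
/// concatenate whatever text each call produced.
pub fn push_many<F>(ids: &[u32], mut push_one: F) -> Option<String>
where
    F: FnMut(u32) -> Option<String>,
{
    let mut out = String::new();
    for &id in ids {
        // A None result means the token only extended the pending bytes.
        if let Some(chunk) = push_one(id) {
            out.push_str(&chunk);
        }
    }
    // Assumption: an empty result is reported as None, mirroring push_token.
    if out.is_empty() {
        None
    } else {
        Some(out)
    }
}
```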

pub fn finish(self) -> TokenizerResult<String>

Finish the stream and return any remaining bytes as a String.

Returns an error if the pending buffer still contains an incomplete UTF-8 sequence (strict mode). If lossy finishing is desired, use Self::finish_lossy instead.
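
Strict finishing can be approximated with String::from_utf8, which rejects a buffer that ends partway through a sequence. The sketch below is an assumption-laden illustration: `finish_strict` and `FinishError` are hypothetical, and the real method returns the crate's own TokenizerResult type, whose error variants are not shown in this page.

```rust
/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum FinishError {
    /// The buffer ends in the middle of a UTF-8 sequence.
    Incomplete { pending: usize },
    /// The buffer contains a byte that cannot appear at this position.
    Invalid { at: usize },
}

/// Convert leftover bytes into a String, failing if they are not valid UTF-8.
pub fn finish_strict(pending: Vec<u8>) -> Result<String, FinishError> {
    String::from_utf8(pending).map_err(|e| {
        let utf8 = e.utf8_error();
        match utf8.error_len() {
            // `None` means the input ended unexpectedly, i.e. a truncated sequence.
            None => FinishError::Incomplete {
                pending: e.as_bytes().len() - utf8.valid_up_to(),
            },
            // `Some(_)` means an invalid byte sequence was found at `valid_up_to`.
            Some(_) => FinishError::Invalid {
                at: utf8.valid_up_to(),
            },
        }
    })
}
```

Distinguishing the two cases lets a caller tell a stream that was cut off early apart from one that contains bytes that are not UTF-8 at all.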

pub fn finish_lossy(self) -> String

Finish the stream, replacing any trailing invalid bytes with \u{FFFD}. Never fails.
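
The lossy variant corresponds closely to the standard library's String::from_utf8_lossy. A minimal sketch, assuming the leftover bytes are available as a Vec<u8>; `finish_lossy_sketch` is a hypothetical free function, not the method documented here.

```rust
/// Convert leftover bytes into a String, substituting U+FFFD for anything
/// that is not valid UTF-8. This conversion cannot fail.
pub fn finish_lossy_sketch(pending: Vec<u8>) -> String {
    // from_utf8_lossy borrows when the input is already valid and only
    // allocates a new string when a replacement is needed.
    String::from_utf8_lossy(&pending).into_owned()
}
```

Valid input passes through unchanged, so the lossy path only alters the output when the stream was truncated or contained invalid bytes.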

pub fn pending_len(&self) -> usize

Number of bytes currently held in the pending buffer.

A non-zero value after a push_token call indicates that the last token ended mid-UTF-8-sequence.

pub fn reset(&mut self)

Reset the decoder state while keeping the OxiTokenizer reference. Useful when processing multiple independent streams with one decoder.

pub fn total_bytes(&self) -> usize

Total bytes processed since construction or last Self::reset.

pub fn total_tokens(&self) -> usize

Total tokens processed since construction or last Self::reset.
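
The relationship between reset and the two counters can be pictured as follows. This is a hypothetical model, not the crate's struct: `StreamState` and `record` are invented names, and clearing the pending buffer on reset is an assumption inferred from "Reset the decoder state".

```rust
/// Hypothetical model of the per-stream state a decoder might track.
#[derive(Default)]
pub struct StreamState {
    pending: Vec<u8>,
    total_bytes: usize,
    total_tokens: usize,
}

impl StreamState {
    /// Account for one token's bytes (decoding itself is omitted here).
    pub fn record(&mut self, token_bytes: &[u8]) {
        self.pending.extend_from_slice(token_bytes);
        self.total_bytes += token_bytes.len();
        self.total_tokens += 1;
    }

    /// Start a new stream: drop buffered bytes and zero both counters.
    pub fn reset(&mut self) {
        self.pending.clear();
        self.total_bytes = 0;
        self.total_tokens = 0;
    }

    pub fn total_bytes(&self) -> usize {
        self.total_bytes
    }

    pub fn total_tokens(&self) -> usize {
        self.total_tokens
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

Reusing one value across streams this way avoids rebuilding the decoder for every stream while keeping the counters scoped to the current one.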

Auto Trait Implementations

impl<'a> Freeze for StreamingDecoder<'a>

impl<'a> RefUnwindSafe for StreamingDecoder<'a>

impl<'a> Send for StreamingDecoder<'a>

impl<'a> Sync for StreamingDecoder<'a>

impl<'a> Unpin for StreamingDecoder<'a>

impl<'a> UnsafeUnpin for StreamingDecoder<'a>

impl<'a> UnwindSafe for StreamingDecoder<'a>
Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.