Crate sentencepiece_rs

SentencePiece runtime in Rust.

This crate loads existing SentencePiece .model / .spm files and exposes a small processor API for normalization, encoding, and decoding.

Structs§

Normalizer
SentencePiece-compatible normalizer.
Piece
A vocabulary entry from a SentencePiece model.
SentencePieceModel
Loaded SentencePiece model and vocabulary metadata.
SentencePieceProcessor
Main API for loading a SentencePiece model and tokenizing text.

Enums§

Error
Errors returned by model loading, normalization, encoding, and decoding.
ModelType
PieceType

Constants§

DEFAULT_UNKNOWN_SURFACE
Default decoded surface for the <unk> piece.
REPLACEMENT_CHARACTER
Unicode replacement character, U+FFFD.
SPACE_SYMBOL
SentencePiece’s visible whitespace marker, U+2581 LOWER ONE EIGHTH BLOCK.
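
To illustrate what the last two constants stand for, here is a minimal standalone sketch that does not call this crate's API. The helper names `mark_spaces` and `join_pieces` and the local `SPACE_SYMBOL` definition are hypothetical stand-ins. The crate's actual normalizer may do more, depending on the model's normalization settings.

```rust
/// Local stand-in for the whitespace marker U+2581 (assumption: the crate's
/// `SPACE_SYMBOL` constant denotes this same code point).
const SPACE_SYMBOL: char = '\u{2581}';

/// Hypothetical helper: replace each ASCII space with the visible marker,
/// so that whitespace survives tokenization as an ordinary character.
fn mark_spaces(text: &str) -> String {
    text.replace(' ', &SPACE_SYMBOL.to_string())
}

/// Hypothetical helper: concatenate pieces and turn markers back into spaces.
fn join_pieces(pieces: &[&str]) -> String {
    pieces.concat().replace(SPACE_SYMBOL, " ")
}

fn main() {
    // Encoding direction: spaces become U+2581.
    let marked = mark_spaces("Hello world");
    assert_eq!(marked, "Hello\u{2581}world");

    // Decoding direction: pieces are joined, then markers become spaces.
    let pieces = ["Hello", "\u{2581}wor", "ld"];
    assert_eq!(join_pieces(&pieces), "Hello world");

    // U+FFFD is the character the standard library substitutes for
    // invalid UTF-8 when decoding lossily.
    let lossy = String::from_utf8_lossy(&[b'o', b'k', 0xFF]);
    assert_eq!(lossy, "ok\u{FFFD}");
    assert_eq!(char::REPLACEMENT_CHARACTER, '\u{FFFD}');
}
```

Because the marker is an ordinary character inside each piece, decoding reduces to concatenation followed by a single replacement, which is why the round trip above needs no separate record of where the spaces were.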

Type Aliases§

Result
Crate-wide result type.