Trait text_splitter::ChunkSizer
pub trait ChunkSizer {
    // Required method
    fn chunk_size(
        &self,
        chunk: &str,
        capacity: &impl ChunkCapacity
    ) -> ChunkSize;
}
Determines the size of a given chunk.
Required Methods
fn chunk_size(&self, chunk: &str, capacity: &impl ChunkCapacity) -> ChunkSize
Determines the size of a given chunk, used to validate it against the given capacity.
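As an illustration of what a sizer does, here is a minimal, self-contained sketch that measures a chunk by character count and reports how that size compares to a capacity. The names `Capacity`, `Size`, `Sizer`, and `CharCount` are hypothetical stand-ins defined locally; they only mirror the shape of the signature above and are not the crate's `ChunkCapacity`, `ChunkSize`, or `ChunkSizer`.

```rust
use std::cmp::Ordering;

// Hypothetical stand-ins, NOT the text_splitter types: they only mirror
// the shape of the documented `chunk_size` signature.
trait Capacity {
    fn fits(&self, size: usize) -> Ordering;
}

// Assumption for this sketch: a plain number acts as an upper bound.
impl Capacity for usize {
    fn fits(&self, size: usize) -> Ordering {
        if size > *self {
            Ordering::Greater
        } else {
            Ordering::Equal
        }
    }
}

// Stand-in result: the measured size and how it compares to the capacity.
#[derive(Debug, PartialEq)]
struct Size {
    size: usize,
    fits: Ordering,
}

trait Sizer {
    fn chunk_size(&self, chunk: &str, capacity: &impl Capacity) -> Size;
}

// Measures a chunk by its number of Unicode scalar values.
struct CharCount;

impl Sizer for CharCount {
    fn chunk_size(&self, chunk: &str, capacity: &impl Capacity) -> Size {
        let size = chunk.chars().count();
        Size { size, fits: capacity.fits(size) }
    }
}

fn main() {
    let within = CharCount.chunk_size("hello", &10usize);
    let over = CharCount.chunk_size("a longer chunk", &5usize);
    println!("{within:?}");
    println!("{over:?}");
}
```

A tokenizer-backed sizer would follow the same pattern, replacing the character count with a token count.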
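Object Safety
This trait is not object safe: chunk_size takes &impl ChunkCapacity, which makes the method generic, so the trait cannot be used as dyn ChunkSizer.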
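In practice this means a sizer is passed through a generic bound rather than as a trait object. The sketch below shows the pattern with hypothetical local stand-ins (`Capacity`, `Sizer`, `ByteCount`, `describe`) that mirror only the shape of the signature; none of them are the crate's own types.

```rust
// Hypothetical stand-ins, NOT the text_splitter types.
trait Capacity {
    fn max(&self) -> usize;
}

impl Capacity for usize {
    fn max(&self) -> usize {
        *self
    }
}

trait Sizer {
    // `&impl Capacity` is sugar for a generic parameter `C: Capacity`,
    // which is what prevents the trait from being used as `dyn Sizer`.
    fn chunk_size(&self, chunk: &str, capacity: &impl Capacity) -> (usize, bool);
}

// Measures a chunk by its length in bytes.
struct ByteCount;

impl Sizer for ByteCount {
    fn chunk_size(&self, chunk: &str, capacity: &impl Capacity) -> (usize, bool) {
        let size = chunk.len();
        (size, size <= capacity.max())
    }
}

// Accept any sizer through a generic bound instead of a trait object.
fn describe<S: Sizer>(sizer: &S, chunk: &str) -> String {
    let (size, fits) = sizer.chunk_size(chunk, &8usize);
    format!("{size} bytes, fits: {fits}")
}

fn main() {
    // The following line would be rejected by the compiler (error E0038):
    // let boxed: Box<dyn Sizer> = Box::new(ByteCount);
    println!("{}", describe(&ByteCount, "hello"));
    println!("{}", describe(&ByteCount, "hello world"));
}
```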
Implementations on Foreign Types
impl ChunkSizer for &CoreBPE
Available on crate feature tiktoken-rs only.
fn chunk_size(&self, chunk: &str, capacity: &impl ChunkCapacity) -> ChunkSize
Returns the number of tokens in a given text after tokenization.
impl ChunkSizer for &Tokenizer
Available on crate feature tokenizers only.
fn chunk_size(&self, chunk: &str, capacity: &impl ChunkCapacity) -> ChunkSize
Returns the number of tokens in a given text after tokenization.
Panics
Panics if the tokenizer is not byte-level and the splitter encounters text it cannot tokenize.
impl ChunkSizer for CoreBPE
Available on crate feature tiktoken-rs only.
fn chunk_size(&self, chunk: &str, capacity: &impl ChunkCapacity) -> ChunkSize
Returns the number of tokens in a given text after tokenization.
impl ChunkSizer for Tokenizer
Available on crate feature tokenizers only.
fn chunk_size(&self, chunk: &str, capacity: &impl ChunkCapacity) -> ChunkSize
Returns the number of tokens in a given text after tokenization.
Panics
Panics if the tokenizer is not byte-level and the splitter encounters text it cannot tokenize.