Module batch_encoder

Batch text encoding pipeline with chunking and pooling strategies.

Provides deterministic text-to-embedding conversion with configurable pooling, normalization, and similarity computation — all without external ML dependencies.
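As a rough illustration of what a dependency-free, deterministic text-to-embedding path can look like, the sketch below hashes whitespace-separated tokens into fixed-size vectors, mean-pools them, L2-normalises the result, and compares two embeddings by cosine similarity. It does not show this module's implementation: `fnv1a`, `embed_token`, `embed_text`, `l2_normalize`, and `cosine_similarity` are hypothetical names, and the hash-based embedding is an assumption chosen only because it needs nothing beyond the standard library.

```rust
/// FNV-1a style hash with a seed mixed into the offset basis.
/// Hypothetical helper; chosen because it is deterministic across runs.
fn fnv1a(bytes: &[u8], seed: u64) -> u64 {
    let mut hash = 0xcbf29ce484222325u64 ^ seed;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Map a token to a `dim`-dimensional vector with components in [-1, 1].
fn embed_token(token: &str, dim: usize) -> Vec<f32> {
    (0..dim)
        .map(|i| {
            let h = fnv1a(token.as_bytes(), i as u64);
            ((h as f64 / u64::MAX as f64) * 2.0 - 1.0) as f32
        })
        .collect()
}

/// Scale a vector to unit length in place; zero vectors are left unchanged.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
}

/// Tokenise on whitespace, embed each token, mean-pool, then normalise.
/// Empty input yields an all-zero vector.
fn embed_text(text: &str, dim: usize) -> Vec<f32> {
    let mut pooled = vec![0.0f32; dim];
    let mut count = 0usize;
    for token in text.split_whitespace() {
        let lowered = token.to_lowercase();
        for (p, x) in pooled.iter_mut().zip(embed_token(&lowered, dim)) {
            *p += x;
        }
        count += 1;
    }
    if count > 0 {
        pooled.iter_mut().for_each(|p| *p /= count as f32);
    }
    l2_normalize(&mut pooled);
    pooled
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}
```

Calling `embed_text` twice on the same string returns identical vectors, which is the property "deterministic" refers to here. Because the vectors are unit length, cosine similarity reduces to a dot product.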

Structs

BatchEncoder
Batch text encoder: tokenises, embeds, pools, and normalises text.
EncodedBatch
The output of encoding a batch of texts.
EncodingConfig
Configuration for the batch encoder.
TokenizedText
A tokenised representation of a single text string.

Enums

PoolingStrategy
Pooling strategy for aggregating token-level embeddings.
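To make "aggregating token-level embeddings" concrete, here is a minimal sketch of three common pooling choices. The variants of `PoolingStrategy` are not listed on this page, so `PoolingSketch` and its variants (`Mean`, `Max`, `First`) and the `pool` function are hypothetical stand-ins, not the enum's actual definition.

```rust
/// Hypothetical stand-in for a pooling strategy enum; the real
/// `PoolingStrategy` variants may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PoolingSketch {
    /// Element-wise average over all token vectors.
    Mean,
    /// Element-wise maximum over all token vectors.
    Max,
    /// Use the first token's vector unchanged.
    First,
}

/// Collapse a sequence of equal-length token vectors into one vector.
/// Returns `None` when there are no tokens to pool.
fn pool(tokens: &[Vec<f32>], strategy: PoolingSketch) -> Option<Vec<f32>> {
    let first = tokens.first()?;
    match strategy {
        PoolingSketch::First => Some(first.clone()),
        PoolingSketch::Mean => {
            let mut out = vec![0.0f32; first.len()];
            for t in tokens {
                for (o, x) in out.iter_mut().zip(t) {
                    *o += *x;
                }
            }
            let n = tokens.len() as f32;
            out.iter_mut().for_each(|o| *o /= n);
            Some(out)
        }
        PoolingSketch::Max => {
            let mut out = first.clone();
            for t in &tokens[1..] {
                for (o, x) in out.iter_mut().zip(t) {
                    *o = (*o).max(*x);
                }
            }
            Some(out)
        }
    }
}
```

Mean pooling weights every token equally, max pooling keeps the strongest signal per dimension, and first-token pooling assumes the leading position already summarises the sequence.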