Module config

Sampling configuration types.

Structs

ChunkingStrategy
Controls how long text sections are chunked and weighted.
DenoiserConfig
Configuration for the OCR denoiser that filters digit-heavy text.
SamplerConfig
Top-level sampler configuration.
TextRecipe
Defines how to build a text sample from a record.
TripletRecipe
Defines a triplet recipe (anchor/positive/negative selection + weighting).
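
As an illustration of what "filters digit-heavy text" could mean for DenoiserConfig, here is a minimal sketch of a digit-ratio filter. It is not this crate's implementation: the function names `digit_ratio` and `is_digit_heavy` and the `max_digit_ratio` threshold are hypothetical, and the actual fields of DenoiserConfig are not shown on this page.

```rust
/// Hypothetical helper (not part of this crate): fraction of
/// non-whitespace characters in `text` that are ASCII digits.
/// Returns 0.0 for empty or whitespace-only input.
fn digit_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut digits = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        if c.is_ascii_digit() {
            digits += 1;
        }
    }
    if total == 0 {
        0.0
    } else {
        digits as f64 / total as f64
    }
}

/// Hypothetical predicate: true when the digit ratio exceeds the
/// assumed `max_digit_ratio` threshold, so the line would be dropped.
fn is_digit_heavy(text: &str, max_digit_ratio: f64) -> bool {
    digit_ratio(text) > max_digit_ratio
}
```

With an assumed threshold of 0.5, a line such as `0012 4491 88 7730` would be treated as digit-heavy OCR noise, while ordinary prose would pass through unchanged.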

Enums

NegativeStrategy
Strategy for picking the negative record in a triplet.
Selector
Selector for choosing a section or neighboring record.