Module segmentation

Speaker segmentation: a powerset classifier combined with a sliding-window aggregator.

Added in v0.6 (M1).

Structs§

AggregationConfig
Configuration for aggregation.
Aggregator
Aggregator over sliding-window powerset outputs.
FrameLabel
Decoded label for a single audio frame.
PowersetConfig
Tunable parameters for PowersetSegmenter.
PowersetDecoder
Stateless decoder; methods are associated functions because no per-instance configuration is needed.
PowersetSegmenter
ONNX-backed powerset speaker segmenter.
RawSegment
One contiguous segment attributed to a single local speaker index.
WindowOutput
One window’s segmentation output.
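As a rough illustration of what aggregating sliding-window outputs can involve, the sketch below averages per-frame scores across overlapping windows. It is not this crate's `Aggregator`: the `Window` struct, its fields, and the `aggregate` function are hypothetical names, and plain averaging is an assumed strategy chosen for simplicity.

```rust
/// Hypothetical per-window output (not the crate's `WindowOutput`):
/// the index of the first frame the window covers, plus one speech
/// score per frame inside the window.
pub struct Window {
    pub start_frame: usize,
    pub scores: Vec<f32>,
}

/// Average the scores of all windows that cover each frame.
/// Frames covered by no window get a score of 0.0.
pub fn aggregate(windows: &[Window], total_frames: usize) -> Vec<f32> {
    let mut sum = vec![0.0f32; total_frames];
    let mut count = vec![0u32; total_frames];

    for w in windows {
        for (i, &score) in w.scores.iter().enumerate() {
            let frame = w.start_frame + i;
            // Ignore frames that fall past the end of the audio.
            if frame < total_frames {
                sum[frame] += score;
                count[frame] += 1;
            }
        }
    }

    sum.iter()
        .zip(&count)
        .map(|(&s, &c)| if c == 0 { 0.0 } else { s / c as f32 })
        .collect()
}
```

With two four-frame windows starting at frames 0 and 2, frames 2 and 3 are covered twice and receive the mean of both windows' scores. Averaging smooths disagreements where windows overlap; a real aggregator may weight windows differently.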

Enums§

PowersetClass
One of the seven powerset classes, identifying which speakers are active.
SegmentationError
Errors from Segmenter implementations.
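To make the "seven powerset classes" of `PowersetClass` concrete: seven classes are what you get with three local speakers and at most two active at once (silence, three single speakers, three pairs). The sketch below assumes that layout. The `Class` enum, the index ordering, and the function names are hypothetical and may not match the crate's actual variants.

```rust
/// Hypothetical stand-in for a seven-class powerset label.
/// Assumes three local speakers (0, 1, 2) with at most two active at once.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Class {
    Silence,
    Single(u8),
    Pair(u8, u8),
}

/// Map a class index (for example, the argmax of a 7-way output) to a label.
/// The ordering used here is an assumption made for illustration.
pub fn class_from_index(index: usize) -> Option<Class> {
    match index {
        0 => Some(Class::Silence),
        1 => Some(Class::Single(0)),
        2 => Some(Class::Single(1)),
        3 => Some(Class::Single(2)),
        4 => Some(Class::Pair(0, 1)),
        5 => Some(Class::Pair(0, 2)),
        6 => Some(Class::Pair(1, 2)),
        _ => None,
    }
}

/// Expand a label into activity flags for speakers 0, 1 and 2.
/// Speaker indices inside the label must be below 3.
pub fn active_speakers(class: Class) -> [bool; 3] {
    let mut active = [false; 3];
    match class {
        Class::Silence => {}
        Class::Single(a) => active[a as usize] = true,
        Class::Pair(a, b) => {
            active[a as usize] = true;
            active[b as usize] = true;
        }
    }
    active
}
```

Decoding a frame then amounts to taking the highest-scoring class index and expanding it; a `Pair` label is what marks a frame as overlapped speech.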

Constants§

MIN_AUDIO_SAMPLES
Minimum audio length (16 kHz samples) accepted by Segmenter::segment.

Traits§

Segmenter
A speaker segmentation engine that turns raw audio into spans of speech attributed to local speaker indices, with overlap detection.
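The step from per-frame decisions to "spans of speech attributed to local speaker indices" can be sketched as a run-length merge. This is a minimal illustration, not the crate's implementation: `Segment`, `frames_to_segments`, and the caller-supplied `frame_secs` frame duration are all hypothetical.

```rust
/// Hypothetical segment (not the crate's `RawSegment`):
/// start and end in seconds for one local speaker.
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub speaker: usize,
    pub start: f64,
    pub end: f64,
}

/// Merge consecutive active frames of one speaker into contiguous segments.
/// `frame_secs` is the duration of one frame, supplied by the caller.
pub fn frames_to_segments(speaker: usize, active: &[bool], frame_secs: f64) -> Vec<Segment> {
    let mut segments = Vec::new();
    let mut run_start: Option<usize> = None;

    for (i, &on) in active.iter().enumerate() {
        match (on, run_start) {
            // A new run of speech begins at frame i.
            (true, None) => run_start = Some(i),
            // The current run ends just before frame i.
            (false, Some(s)) => {
                segments.push(Segment {
                    speaker,
                    start: s as f64 * frame_secs,
                    end: i as f64 * frame_secs,
                });
                run_start = None;
            }
            _ => {}
        }
    }

    // Close a run that extends to the last frame.
    if let Some(s) = run_start {
        segments.push(Segment {
            speaker,
            start: s as f64 * frame_secs,
            end: active.len() as f64 * frame_secs,
        });
    }
    segments
}
```

Running this once per local speaker yields segments that may overlap in time across speakers, which is how overlapped speech would show up in the output under these assumptions.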