Speaker segmentation: powerset-classifier + sliding-window aggregator.
Added in v0.6 (M1).
Structs§
- AggregationConfig - Configuration for aggregation.
- Aggregator - Aggregator over sliding-window powerset outputs.
- FrameLabel - Decoded label for a single audio frame.
- PowersetConfig - Tunable parameters for PowersetSegmenter.
- PowersetDecoder - Stateless decoder; methods are associated functions because no per-instance configuration is needed.
- PowersetSegmenter - ONNX-backed powerset speaker segmenter.
- RawSegment - One contiguous segment attributed to a single local speaker index.
- WindowOutput - One window’s segmentation output.
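The sliding-window aggregation idea can be illustrated with a minimal sketch. It is not the crate's Aggregator: the Window struct, its fields, and the aggregate function are hypothetical names introduced here, and the sketch assumes that local speaker indices are already aligned across windows and that overlapping frames are combined by a plain average.

```rust
/// Hypothetical per-window output (not the crate's `WindowOutput`): one
/// activity score per frame for a single speaker, plus the offset of the
/// window's first frame within the full recording.
pub struct Window {
    pub start_frame: usize,
    pub scores: Vec<f32>,
}

/// Averages overlapping windows frame by frame.
/// Frames not covered by any window receive a score of 0.0.
pub fn aggregate(windows: &[Window], total_frames: usize) -> Vec<f32> {
    let mut sum = vec![0.0f32; total_frames];
    let mut count = vec![0u32; total_frames];
    for w in windows {
        for (i, &s) in w.scores.iter().enumerate() {
            let frame = w.start_frame + i;
            // Ignore any part of a window that runs past the recording.
            if frame < total_frames {
                sum[frame] += s;
                count[frame] += 1;
            }
        }
    }
    sum.iter()
        .zip(&count)
        .map(|(&s, &c)| if c == 0 { 0.0 } else { s / c as f32 })
        .collect()
}
```

Averaging is only one possible choice; a real aggregator may weight frames near window edges differently or resolve speaker permutations between windows first.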
Enums§
- PowersetClass - One of the seven powerset classes, identifying which speakers are active.
- SegmentationError - Errors from Segmenter implementations.
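As an illustration of what a seven-class powerset can encode, the sketch below assumes three local speakers with at most two active at once, which gives silence, three single-speaker classes, and three two-speaker classes. The Class enum, its variant names, and the index ordering are assumptions made for this example and may differ from the crate's PowersetClass.

```rust
/// Hypothetical stand-in for `PowersetClass`; variants and ordering are
/// assumptions, not the crate's actual definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Class {
    Silence,
    Speaker(u8),
    Overlap(u8, u8),
}

/// Maps a class index in 0..7 to a class.
/// Assumed ordering: silence, the three single speakers, then the three pairs.
pub fn decode_class(index: usize) -> Option<Class> {
    match index {
        0 => Some(Class::Silence),
        1 => Some(Class::Speaker(0)),
        2 => Some(Class::Speaker(1)),
        3 => Some(Class::Speaker(2)),
        4 => Some(Class::Overlap(0, 1)),
        5 => Some(Class::Overlap(0, 2)),
        6 => Some(Class::Overlap(1, 2)),
        _ => None,
    }
}

/// Picks the highest-scoring class for one frame of seven scores.
pub fn argmax_class(scores: &[f32; 7]) -> Class {
    let mut best = 0;
    for (i, &s) in scores.iter().enumerate() {
        if s > scores[best] {
            best = i;
        }
    }
    decode_class(best).expect("index is always below 7")
}

/// Expands a class into a per-speaker activity mask for three local speakers.
pub fn active_mask(class: Class) -> [bool; 3] {
    let mut mask = [false; 3];
    match class {
        Class::Silence => {}
        Class::Speaker(a) => mask[a as usize] = true,
        Class::Overlap(a, b) => {
            mask[a as usize] = true;
            mask[b as usize] = true;
        }
    }
    mask
}
```

Because each frame maps to exactly one class, overlapping speech is a regular classification outcome rather than the result of thresholding several independent per-speaker scores.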
Constants§
- MIN_AUDIO_SAMPLES - Minimum audio length (16 kHz samples) accepted by Segmenter::segment.
Traits§
- Segmenter - A speaker segmentation engine that turns raw audio into spans of speech attributed to local speaker indices, with overlap detection.
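The final step, going from per-frame speaker activity to spans, might look like the following sketch. The Span struct and frames_to_spans function are hypothetical and only loosely modelled on the description of RawSegment; the input is assumed to be one activity mask per frame for three local speakers.

```rust
/// Hypothetical segment type (not the crate's `RawSegment`): a run of
/// consecutive frames in which one local speaker is active.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub speaker: usize,
    pub start_frame: usize,
    /// Exclusive end frame.
    pub end_frame: usize,
}

/// Collapses per-frame activity masks into contiguous per-speaker spans.
/// Spans are emitted speaker by speaker, in frame order within each speaker.
pub fn frames_to_spans(frames: &[[bool; 3]]) -> Vec<Span> {
    let mut spans = Vec::new();
    for speaker in 0..3 {
        // Start frame of the run currently being tracked, if any.
        let mut open: Option<usize> = None;
        for (i, mask) in frames.iter().enumerate() {
            match (mask[speaker], open) {
                (true, None) => open = Some(i),
                (false, Some(start)) => {
                    spans.push(Span { speaker, start_frame: start, end_frame: i });
                    open = None;
                }
                _ => {}
            }
        }
        // Close a run that extends to the last frame.
        if let Some(start) = open {
            spans.push(Span { speaker, start_frame: start, end_frame: frames.len() });
        }
    }
    spans
}
```

In this representation, overlapped speech appears as spans from different speakers whose frame ranges intersect.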