Module chunker

Audio chunking aligned to VAD segments.

This module segments standardized PCM audio into speech-aligned chunks with timing and overlap metadata.

§Architecture

The chunker follows a streaming-first design:

  1. Accept VAD boundaries (SpeechChunk) + raw PCM samples
  2. Generate fixed-duration chunks (default 500ms) aligned to speech boundaries
  3. Attach temporal metadata (AudioTimestamp) for deterministic testing
  4. Attach quality metrics such as energy and speech ratio
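Steps 2 and 4 can be illustrated with a minimal, crate-independent sketch. The names `split_segment` and `rms_energy` are hypothetical and are not part of the `speech_prep` API. The sketch assumes millisecond timestamps, half-open ranges, and RMS as the energy measure; the real chunker may differ on all three.

```rust
/// Hypothetical helper (not part of `speech_prep`): split the half-open span
/// [start_ms, end_ms) into consecutive chunks of at most `chunk_ms`.
/// The final chunk is shorter when the span is not an exact multiple.
pub fn split_segment(start_ms: u64, end_ms: u64, chunk_ms: u64) -> Vec<(u64, u64)> {
    assert!(chunk_ms > 0, "chunk duration must be positive");
    let mut chunks = Vec::new();
    let mut cursor = start_ms;
    while cursor < end_ms {
        // Clamp the chunk end so it never runs past the speech boundary.
        let next = (cursor + chunk_ms).min(end_ms);
        chunks.push((cursor, next));
        cursor = next;
    }
    chunks
}

/// Hypothetical quality metric: root-mean-square energy of a sample slice.
/// Returns 0.0 for an empty slice.
pub fn rms_energy(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}
```

With these definitions, `split_segment(0, 1000, 500)` yields two ranges, `(0, 500)` and `(500, 1000)`, which mirrors the two-chunk result in the module example.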

§Performance Contracts

  • Latency: <15ms total processing per chunk
  • Alignment: ±20ms accuracy to VAD boundaries
  • Coverage: Chunks cover 100% of input duration (no gaps)
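The alignment and coverage contracts lend themselves to simple property checks in tests. The sketch below is illustrative only: `covers_without_gaps` and `within_tolerance` are hypothetical helpers, and chunks are modeled as half-open millisecond ranges sorted by start time, which is an assumption rather than the crate's actual representation.

```rust
/// Hypothetical check: true when `chunks` (half-open ms ranges, sorted by
/// start) cover [0, total_ms) with no gap. Overlapping chunks are allowed.
pub fn covers_without_gaps(chunks: &[(u64, u64)], total_ms: u64) -> bool {
    let mut covered_until = 0u64;
    for &(start, end) in chunks {
        if start > covered_until {
            return false; // uncovered span between covered_until and start
        }
        covered_until = covered_until.max(end);
    }
    covered_until >= total_ms
}

/// Hypothetical check: true when a chunk edge lies within `tolerance_ms`
/// of the VAD boundary it is meant to align with.
pub fn within_tolerance(chunk_edge_ms: i64, vad_edge_ms: i64, tolerance_ms: i64) -> bool {
    (chunk_edge_ms - vad_edge_ms).abs() <= tolerance_ms
}
```

For instance, `within_tolerance(515, 500, 20)` holds, while a chunk edge at 479ms against a VAD boundary at 500ms falls outside the ±20ms window.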

§Example

use speech_prep::{Chunker, ChunkerConfig, SpeechChunk};
use speech_prep::time::{AudioDuration, AudioTimestamp};

// The `?` below needs an enclosing function that returns `Result`.
// Assumption: the chunker's error type implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ChunkerConfig::default(); // 500ms chunks
    let chunker = Chunker::new(config);

    let audio: Vec<f32> = vec![0.0; 16000]; // 1 second @ 16kHz
    // A single VAD segment spanning the full second of audio
    let vad_segments = vec![SpeechChunk {
        start_time:  AudioTimestamp::EPOCH,
        end_time:    AudioTimestamp::EPOCH
            .add_duration(AudioDuration::from_secs(1)),
        confidence:  0.9,
        avg_energy:  0.5,
        frame_count: 50,
    }];

    let chunks = chunker.chunk(&audio, 16000, &vad_segments)?;
    assert_eq!(chunks.len(), 2); // Two 500ms chunks from 1s of speech

    // Overlaps are added automatically between adjacent chunks
    assert!(chunks[0].overlap_next.is_some()); // First chunk carries overlap into the next
    assert!(chunks[1].overlap_prev.is_some()); // Second chunk carries overlap from the previous
    Ok(())
}
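The arithmetic behind the example (16000 samples at 16kHz giving two 500ms chunks) can be sketched as a time-to-sample conversion. `ms_to_samples` and `chunk_sample_range` are hypothetical helpers, not `speech_prep` functions, and overlap samples are left out because the overlap duration is not stated here.

```rust
use std::ops::Range;

/// Hypothetical helper: number of samples spanned by `ms` milliseconds
/// at `sample_rate_hz` (integer arithmetic, rounds down).
pub fn ms_to_samples(ms: u64, sample_rate_hz: u32) -> usize {
    (ms * sample_rate_hz as u64 / 1000) as usize
}

/// Hypothetical helper: sample index range for a chunk spanning
/// [start_ms, end_ms), clamped to the length of the audio buffer.
pub fn chunk_sample_range(
    start_ms: u64,
    end_ms: u64,
    sample_rate_hz: u32,
    total_samples: usize,
) -> Range<usize> {
    let start = ms_to_samples(start_ms, sample_rate_hz).min(total_samples);
    let end = ms_to_samples(end_ms, sample_rate_hz)
        .min(total_samples)
        .max(start); // never produce an inverted range
    start..end
}
```

Under these assumptions a 500ms chunk at 16kHz spans 8000 samples, so the second chunk of the example would map to indices `8000..16000` of `audio`.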

Structs§

Chunker
Audio chunker for segmenting streams into processing units.
ChunkerConfig
Configuration for the audio chunker.
ProcessedChunk
A processed audio chunk with temporal and quality metadata.

Enums§

ChunkBoundary
Type of boundary at chunk edges.