speech_prep::chunker

Struct Chunker

pub struct Chunker { /* private fields */ }

Audio chunker for segmenting streams into processing units.

Combines VAD boundaries with duration heuristics to produce chunks sized for downstream consumers.
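As a rough illustration of combining boundaries with a duration heuristic, the sketch below cuts a sample range at every VAD boundary and then splits any piece longer than a maximum length. This is one plausible reading, not the crate's actual algorithm: `Span`, `split`, and the `max_len` parameter are hypothetical names introduced only for this example.

```rust
/// Half-open sample range [start, end). Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Cut `0..total` at each boundary, then cap every piece at `max_len` samples.
fn split(total: usize, boundaries: &[usize], max_len: usize) -> Vec<Span> {
    assert!(max_len > 0, "max_len must be positive");

    // Cut points: stream start, every in-range boundary, stream end.
    let mut cuts: Vec<usize> = boundaries
        .iter()
        .copied()
        .filter(|&b| b > 0 && b < total)
        .collect();
    cuts.push(0);
    cuts.push(total);
    cuts.sort_unstable();
    cuts.dedup();

    let mut out = Vec::new();
    for pair in cuts.windows(2) {
        let (mut start, end) = (pair[0], pair[1]);
        // Duration heuristic: never emit a piece longer than max_len.
        while start < end {
            let stop = end.min(start + max_len);
            out.push(Span { start, end: stop });
            start = stop;
        }
    }
    out
}

fn main() {
    // 1 s of audio at 16 kHz, speech from sample 4000 to 14000,
    // pieces capped at 8000 samples (500 ms).
    let spans = split(16_000, &[4_000, 14_000], 8_000);
    for s in &spans {
        println!("{:?}", s);
    }
}
```

Cutting at boundaries first keeps speech and silence in separate pieces; the cap then bounds the latency of any single piece.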

Implementations

impl Chunker

pub fn new(config: ChunkerConfig) -> Self

Create a new chunker with the given configuration.

pub fn default() -> Self

Create a chunker with default configuration (500ms chunks).

Alias for Chunker::new(ChunkerConfig::default()).

pub fn chunk(&self, audio: &[f32], sample_rate: u32, vad_segments: &[SpeechChunk]) -> Result<Vec<ProcessedChunk>>

Segment audio into processing chunks aligned to VAD boundaries.

This variant assumes that VAD timestamps are relative to the Unix epoch (e.g., in tests that construct times from AudioTimestamp::EPOCH). For streaming scenarios where the VAD emits wall-clock timestamps (AudioTimestamp::now()), prefer Chunker::chunk_with_stream_start so that the chunker can normalize timestamps against the actual stream start.

Arguments

audio: Raw PCM samples (f32, normalized to [-1.0, 1.0])
sample_rate: Audio sample rate in Hz (must be > 0)
vad_segments: Speech boundaries from VAD analysis

Returns

Vector of ProcessedChunk covering the entire input duration with no gaps.
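A caller who wants to check the "no gaps" property could walk the returned chunks and confirm that each one starts where the previous one ended. The sketch below assumes each chunk can be reduced to a half-open sample range; `Span` and `covers_without_gaps` are hypothetical, since the fields of ProcessedChunk are not shown here.

```rust
/// Half-open sample range [start, end). Hypothetical stand-in for the
/// sample extent of a ProcessedChunk.
#[derive(Debug, Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
}

/// True if the spans tile `0..total_samples` exactly: in order, non-empty,
/// with no gaps and no overlaps.
fn covers_without_gaps(spans: &[Span], total_samples: usize) -> bool {
    let mut cursor = 0;
    for s in spans {
        if s.start != cursor || s.end <= s.start {
            return false;
        }
        cursor = s.end;
    }
    cursor == total_samples
}

fn main() {
    let tiled = [
        Span { start: 0, end: 1_000 },
        Span { start: 1_000, end: 3_840 },
    ];
    let gapped = [
        Span { start: 0, end: 1_000 },
        Span { start: 1_200, end: 3_840 },
    ];
    println!("tiled:  {}", covers_without_gaps(&tiled, 3_840));
    println!("gapped: {}", covers_without_gaps(&gapped, 3_840));
}
```

Note that this check is stricter than the documented guarantee, because it also rejects overlapping spans.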

Errors

Returns Error::InvalidInput if:

sample_rate is zero
audio is empty
VAD segments have invalid timestamps (end < start)
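The three error conditions above can be sketched as a standalone validation pass. The types here are simplified stand-ins: `ChunkError`, `Segment`, and `validate` are hypothetical, and the real Error and SpeechChunk definitions in speech_prep may look different.

```rust
/// Hypothetical error type mirroring the documented InvalidInput variant.
#[derive(Debug, PartialEq)]
enum ChunkError {
    InvalidInput(&'static str),
}

/// Hypothetical segment with millisecond offsets instead of AudioTimestamp.
#[derive(Debug, Clone, Copy)]
struct Segment {
    start_ms: u64,
    end_ms: u64,
}

/// Reject the three documented invalid inputs before any chunking happens.
fn validate(audio: &[f32], sample_rate: u32, segments: &[Segment]) -> Result<(), ChunkError> {
    if sample_rate == 0 {
        return Err(ChunkError::InvalidInput("sample_rate must be > 0"));
    }
    if audio.is_empty() {
        return Err(ChunkError::InvalidInput("audio is empty"));
    }
    if segments.iter().any(|s| s.end_ms < s.start_ms) {
        return Err(ChunkError::InvalidInput("segment ends before it starts"));
    }
    Ok(())
}

fn main() {
    let audio = vec![0.0_f32; 3_840];
    let good = [Segment { start_ms: 0, end_ms: 240 }];
    let reversed = [Segment { start_ms: 240, end_ms: 0 }];

    println!("{:?}", validate(&audio, 16_000, &good));
    println!("{:?}", validate(&audio, 0, &good));
    println!("{:?}", validate(&[], 16_000, &good));
    println!("{:?}", validate(&audio, 16_000, &reversed));
}
```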

Performance

Target: <15ms total processing time per chunk generated.

pub fn chunk_with_stream_start(&self, audio: &[f32], sample_rate: u32, vad_segments: &[SpeechChunk], stream_start_time: AudioTimestamp) -> Result<Vec<ProcessedChunk>>

Segment audio into processing chunks with an explicit stream start time.

Use this variant when the VAD timestamps are absolute (e.g., wall-clock) rather than relative to the Unix epoch.

use speech_prep::{Chunker, ChunkerConfig, SpeechChunk};
use speech_prep::time::{AudioDuration, AudioTimestamp};

let chunker = Chunker::new(ChunkerConfig::streaming());
let stream_start = AudioTimestamp::EPOCH;

// In a live stream the VAD emits wall-clock timestamps; here EPOCH stands in for the stream start
let segments = vec![SpeechChunk {
    start_time:  stream_start,
    end_time:    stream_start.add_duration(AudioDuration::from_millis(240)),
    confidence:  0.92,
    avg_energy:  0.4,
    frame_count: 48,
}];

let audio = vec![0.0; 3840]; // 240ms @ 16kHz
let chunks = chunker.chunk_with_stream_start(&audio, 16_000, &segments, stream_start)?;
assert_eq!(chunks.len(), 1);
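To illustrate what normalizing against the stream start might involve, the sketch below converts an absolute timestamp into a sample index by subtracting the stream start first. It uses std::time::Duration as a stand-in for AudioTimestamp, and `sample_offset` is a hypothetical helper rather than part of speech_prep; the crate's internal arithmetic may differ.

```rust
use std::time::Duration;

/// Convert an absolute timestamp into a sample index relative to the stream
/// start. Returns None if the timestamp precedes the stream start or the
/// result would overflow.
fn sample_offset(absolute: Duration, stream_start: Duration, sample_rate: u32) -> Option<u64> {
    let relative = absolute.checked_sub(stream_start)?;
    let rate = u64::from(sample_rate);

    // Whole seconds and the sub-second remainder are scaled separately so
    // the computation stays in integer arithmetic.
    let whole = relative.as_secs().checked_mul(rate)?;
    let frac = u64::from(relative.subsec_nanos()) * rate / 1_000_000_000;
    whole.checked_add(frac)
}

fn main() {
    // A wall-clock style stream start, expressed as time since the Unix epoch.
    let stream_start = Duration::from_secs(1_700_000_000);
    let segment_end = stream_start + Duration::from_millis(240);

    // 240 ms at 16 kHz corresponds to 3840 samples.
    println!("{:?}", sample_offset(segment_end, stream_start, 16_000));
}
```

Without the subtraction, a wall-clock timestamp would map to a sample index far beyond the end of the audio buffer, which is why the explicit stream start matters in streaming scenarios.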

Trait Implementations

impl Clone for Chunker

fn clone(&self) -> Chunker

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Chunker

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Copy for Chunker

Auto Trait Implementations

impl Freeze for Chunker

impl RefUnwindSafe for Chunker

impl Send for Chunker

impl Sync for Chunker

impl Unpin for Chunker

impl UnsafeUnpin for Chunker

impl UnwindSafe for Chunker

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<F, T> IntoSample<T> for F
where T: FromSample<F>,

fn into_sample(self) -> T

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.