Struct Chunker 

pub struct Chunker { /* private fields */ }

Audio chunker for segmenting streams into processing units.

Combines VAD boundaries with duration heuristics to produce chunks sized for downstream consumers.

Implementations§

impl Chunker

pub fn new(config: ChunkerConfig) -> Self

Create a new chunker with the given configuration.

pub fn default() -> Self

Create a chunker with default configuration (500ms chunks).

Alias for Chunker::new(ChunkerConfig::default()).
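The alias relationship can be pictured with a small standalone sketch. The `SketchConfig` and `SketchChunker` types and the `target_duration` field are hypothetical stand-ins invented for illustration (the real fields are private); only the 500 ms default comes from the description above.

```rust
use std::time::Duration;

/// Hypothetical stand-in for `ChunkerConfig`; the real fields are private
/// and may differ. Only the 500 ms default is taken from the docs.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SketchConfig {
    target_duration: Duration,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self { target_duration: Duration::from_millis(500) }
    }
}

/// Hypothetical stand-in for `Chunker`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SketchChunker {
    config: SketchConfig,
}

impl SketchChunker {
    /// Builds a chunker from an explicit configuration.
    fn new(config: SketchConfig) -> Self {
        Self { config }
    }

    /// Inherent convenience constructor that simply delegates to `new`.
    fn default() -> Self {
        Self::new(SketchConfig::default())
    }
}
```

In this sketch, `SketchChunker::default()` and `SketchChunker::new(SketchConfig::default())` produce equal values, which is the relationship the alias note describes.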

pub fn chunk(&self, audio: &[f32], sample_rate: u32, vad_segments: &[SpeechChunk]) -> Result<Vec<ProcessedChunk>>

Segment audio into processing chunks aligned to VAD boundaries.

This variant assumes that VAD timestamps are relative to the Unix epoch, as in tests that build timestamps from AudioTimestamp::EPOCH. In streaming scenarios where the VAD emits wall-clock timestamps (AudioTimestamp::now()), prefer Chunker::chunk_with_stream_start so that the chunker can normalize them against the actual stream start.
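The distinction comes down to which instant the VAD timestamps are measured from. As a rough illustration only, normalizing an absolute timestamp against a stream start and mapping it to a sample index might look like the sketch below. The `to_sample_index` helper and the use of plain nanosecond counts are assumptions made for this example and are not part of the speech_prep API.

```rust
/// Hypothetical helper, not part of speech_prep: maps an absolute timestamp
/// (in nanoseconds) to a sample index relative to the start of the stream.
/// Returns `None` if the timestamp precedes the stream start or the index
/// does not fit in `usize`.
fn to_sample_index(timestamp_ns: u64, stream_start_ns: u64, sample_rate: u32) -> Option<usize> {
    // Normalize: subtract the stream start so that offset 0 is the first sample.
    let offset_ns = timestamp_ns.checked_sub(stream_start_ns)?;
    // Convert nanoseconds to samples; u128 avoids overflow for large offsets.
    let samples = (offset_ns as u128 * sample_rate as u128) / 1_000_000_000u128;
    if samples > usize::MAX as u128 {
        None
    } else {
        Some(samples as usize)
    }
}
```

With a stream start of 0 (the epoch case), a timestamp 240 ms in maps to sample 3840 at 16 kHz. The same offset measured from a wall-clock start gives the same index once the start is subtracted.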

§Arguments
  • audio: Raw PCM samples (f32, normalized to [-1.0, 1.0])
  • sample_rate: Audio sample rate in Hz (must be > 0)
  • vad_segments: Speech boundaries from VAD analysis

§Returns

Vector of ProcessedChunk covering the entire input duration with no gaps.
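To make the "no gaps" guarantee concrete, here is a minimal standalone sketch of one way to tile an input with contiguous ranges. It is not the crate's algorithm: the `tile` and `covers_without_gaps` functions, the `max_len` parameter, and the use of sample-index ranges instead of ProcessedChunk are all assumptions made for illustration.

```rust
use std::ops::Range;

/// Hypothetical sketch (not speech_prep's implementation): tiles `0..total`
/// with contiguous sample ranges. Speech boundaries become cut points, the
/// silence between them becomes its own range, and any range longer than
/// `max_len` samples is split into smaller pieces.
fn tile(total: usize, speech: &[Range<usize>], max_len: usize) -> Vec<Range<usize>> {
    assert!(max_len > 0, "max_len must be positive");

    // Collect every boundary, clamped to the input length.
    let mut cuts = vec![0, total];
    for s in speech {
        cuts.push(s.start.min(total));
        cuts.push(s.end.min(total));
    }
    cuts.sort_unstable();
    cuts.dedup();

    let mut out = Vec::new();
    for pair in cuts.windows(2) {
        let (mut start, end) = (pair[0], pair[1]);
        // Split overly long ranges into pieces of at most `max_len` samples.
        while start < end {
            let stop = end.min(start.saturating_add(max_len));
            out.push(start..stop);
            start = stop;
        }
    }
    out
}

/// Returns true if `chunks` cover `0..total` contiguously, with no gaps,
/// overlaps, or empty ranges.
fn covers_without_gaps(chunks: &[Range<usize>], total: usize) -> bool {
    let mut cursor = 0;
    for c in chunks {
        if c.start != cursor || c.end <= c.start {
            return false;
        }
        cursor = c.end;
    }
    cursor == total
}
```

For one second of 16 kHz audio with speech at samples 2000..6000 and a cap of 8000 samples, this sketch yields 0..2000, 2000..6000, 6000..14000 and 14000..16000. A check like `covers_without_gaps` is the kind of property test that could be applied to any chunker's output.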

§Errors

Returns Error::InvalidInput if:

  • sample_rate is zero
  • audio is empty
  • VAD segments have invalid timestamps (end < start)
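The three conditions above can be illustrated with a standalone validation sketch. The `InputError` enum, the `SegmentMs` alias and the `validate` function are hypothetical stand-ins; the real SpeechChunk and Error::InvalidInput types live in speech_prep and may differ, and the order in which the checks run here is an assumption.

```rust
/// Hypothetical error type for illustration; speech_prep reports all of
/// these cases as `Error::InvalidInput`.
#[derive(Debug, PartialEq)]
enum InputError {
    ZeroSampleRate,
    EmptyAudio,
    /// Index of the first segment whose end precedes its start.
    InvertedSegment(usize),
}

/// A VAD segment as (start, end) offsets in milliseconds (assumed shape).
type SegmentMs = (u64, u64);

/// Checks the three documented preconditions, returning the first violation.
fn validate(audio: &[f32], sample_rate: u32, segments: &[SegmentMs]) -> Result<(), InputError> {
    if sample_rate == 0 {
        return Err(InputError::ZeroSampleRate);
    }
    if audio.is_empty() {
        return Err(InputError::EmptyAudio);
    }
    if let Some(i) = segments.iter().position(|&(start, end)| end < start) {
        return Err(InputError::InvertedSegment(i));
    }
    Ok(())
}
```

Running checks like these before calling chunk lets a caller report a specific cause instead of a single invalid-input error.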
§Performance

Target: <15ms total processing time per chunk generated.
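One way to check the target in a benchmark or test is to time a call and compare the average time per generated chunk against the budget. The sketch below uses only the standard library. The `time_chunking` and `within_budget` helpers are hypothetical, and reading the target as an average per chunk is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work`, which returns the number of chunks it
/// produced, and reports the elapsed wall-clock time alongside that count.
fn time_chunking<F: FnOnce() -> usize>(work: F) -> (Duration, usize) {
    let started = Instant::now();
    let chunk_count = work();
    (started.elapsed(), chunk_count)
}

/// Returns true if the average time per generated chunk is under `budget`.
/// A run that produced no chunks trivially meets the budget.
fn within_budget(total: Duration, chunk_count: usize, budget: Duration) -> bool {
    chunk_count == 0 || total.as_nanos() < budget.as_nanos() * chunk_count as u128
}
```

In practice the closure passed to `time_chunking` would call the chunker and return the length of the resulting vector, and the budget would be `Duration::from_millis(15)`.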

pub fn chunk_with_stream_start(&self, audio: &[f32], sample_rate: u32, vad_segments: &[SpeechChunk], stream_start_time: AudioTimestamp) -> Result<Vec<ProcessedChunk>>

Segment audio into processing chunks with an explicit stream start time.

Use this variant when the VAD timestamps are absolute (e.g., wall-clock) rather than relative to the Unix epoch.

use speech_prep::{Chunker, ChunkerConfig, SpeechChunk};
use speech_prep::time::{AudioDuration, AudioTimestamp};

let chunker = Chunker::new(ChunkerConfig::streaming());
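// For this example the stream starts at the epoch; a live stream would
// capture its wall-clock start time (e.g. AudioTimestamp::now()) instead.
let stream_start = AudioTimestamp::EPOCH;

// The VAD emits absolute timestamps; here the segment begins at the stream start.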
let segments = vec![SpeechChunk {
    start_time:  stream_start,
    end_time:    stream_start.add_duration(AudioDuration::from_millis(240)),
    confidence:  0.92,
    avg_energy:  0.4,
    frame_count: 48,
}];

let audio = vec![0.0_f32; 3840]; // 240 ms at 16 kHz = 3840 samples
let chunks = chunker.chunk_with_stream_start(&audio, 16_000, &segments, stream_start)?;
assert_eq!(chunks.len(), 1);

Trait Implementations§

impl Clone for Chunker

fn clone(&self) -> Chunker

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0.

impl Debug for Chunker

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Copy for Chunker

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<F, T> IntoSample<T> for F
where T: FromSample<F>,

fn into_sample(self) -> T

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.