pub struct LocalVadDetector { /* private fields */ }
Local voice activity detector.
Provides voice activity detection using local processing without
external API calls. Uses the voice_activity_detector crate for
speech detection and analysis.
Implementations§
impl LocalVadDetector
pub async fn detect_speech(&self, audio_path: &Path) -> Result<VadResult>
Detect speech activity in an audio file.
Processes the entire audio file to identify speech segments with timestamps and confidence scores.
§Arguments
audio_path: Path to the audio file to analyze
§Returns
VAD analysis results including speech segments and metadata
§Errors
Returns an error if:
- Audio file cannot be loaded
- VAD processing fails
- Audio format is unsupported
pub fn calculate_chunk_size(&self, sample_rate: u32) -> usize
Calculates the VAD chunk size for a given audio sample rate.
This function selects a chunk size (in samples) that is compatible with the VAD model’s requirements.
For 8000 Hz and 16000 Hz, it uses 512 samples by default, which is one of the recommended sizes
(512, 768, or 1024). For other sample rates, it starts from a 30 ms window as the baseline, with a
minimum of 1024 samples. In every case the result satisfies the model’s constraint:
sample_rate <= 31.25 * chunk_size.
§Arguments
sample_rate: The audio sample rate in Hz (e.g., 16000 for 16kHz audio)
§Returns
The chunk size in number of samples, selected for optimal model compatibility.
§Examples
Basic usage:
use subx_cli::services::vad::LocalVadDetector;
let detector = LocalVadDetector::new(Default::default()).unwrap();
let chunk_size = detector.calculate_chunk_size(16000);
assert_eq!(chunk_size, 512);
§Model Constraint
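The returned chunk size always satisfies: sample_rate <= 31.25 * chunk_size.
Equivalently, chunk_size >= sample_rate / 31.25, so each chunk spans at least 32 ms of audio.
For example, 16000 Hz requires at least 512 samples and 48000 Hz requires at least 1536.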
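The selection rule described for calculate_chunk_size can be sketched as a standalone function. This is a minimal illustration written from the prose above, not the crate's actual implementation: the names sketch_chunk_size and satisfies_model_constraint are hypothetical, and the rounding choices (integer truncation for the 30 ms window, rounding up for the model minimum) are assumptions.

```rust
/// Hypothetical sketch of the documented selection rule.
/// Not taken from the crate; rounding behavior is an assumption.
pub fn sketch_chunk_size(sample_rate: u32) -> usize {
    let sr = sample_rate as usize;

    let baseline = match sample_rate {
        // Documented default for the two common rates.
        8000 | 16000 => 512,
        // 30 ms window (truncated), with a floor of 1024 samples.
        _ => (sr * 30 / 1000).max(1024),
    };

    // Smallest chunk size with sample_rate <= 31.25 * chunk_size.
    // Since 1 / 31.25 = 4 / 125, this is ceil(sr * 4 / 125).
    let model_minimum = (sr * 4 + 124) / 125;

    baseline.max(model_minimum)
}

/// Checks sample_rate <= 31.25 * chunk_size using integer arithmetic
/// (both sides multiplied by 4 to avoid floating point).
pub fn satisfies_model_constraint(sample_rate: u32, chunk_size: usize) -> bool {
    (sample_rate as usize) * 4 <= 125 * chunk_size
}
```

Under these assumptions the sketch returns 512 for 16000 Hz, matching the documented example. At 48000 Hz the 30 ms baseline is 1440 samples, which falls short of the constraint, so the sketch raises it to 1536. The real method may pick different values for rates other than 8000 Hz and 16000 Hz.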