Struct SttEngine 

pub struct SttEngine { /* private fields */ }

Speech-to-text engine optimized for speed and ease of use.

This is the main entry point for transcription. Create an engine, warm it up, and start transcribing audio samples.

§Example

use memo_stt::SttEngine;

// Create engine with default model
let mut engine = SttEngine::new_default(16000)?;

// Warm up GPU (reduces first-transcription latency)
engine.warmup()?;

// Transcribe audio samples (16kHz, mono, i16 PCM)
let samples: Vec<i16> = vec![]; // Your audio data here
let text = engine.transcribe(&samples)?;
println!("Transcribed: {}", text);

§Performance

  • First transcription: ~500ms-1s (after warmup)
  • Subsequent transcriptions: ~200-500ms
  • GPU acceleration is automatic on supported platforms
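
The latency figures above depend on hardware and model, so it can be worth measuring them on your own machine. The following is a minimal sketch, not part of memo_stt: `timed` is a hypothetical helper that wraps any closure with `std::time::Instant`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of memo_stt): runs `f` once and returns
/// its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}
```

Assuming an engine and samples are already in scope, a call such as `let (result, elapsed) = timed(|| engine.transcribe(&samples));` would report how long a single transcription took.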

Implementations§

impl SttEngine

pub fn new_default(input_sample_rate: u32) -> Result<Self>

Create a new engine with the default model.

The model will be automatically downloaded to the cache directory on first use. For custom model paths, use new.

§Arguments
  • input_sample_rate - Sample rate of input audio (e.g., 16000, 48000)
§Example
use memo_stt::SttEngine;
let engine = SttEngine::new_default(16000)?;
Examples found in repository:

examples/gui_integration.rs
fn setup_realtime_transcription() -> Result<SttEngine, Box<dyn std::error::Error>> {
    let engine = SttEngine::new_default(16000)?;
    engine.warmup()?;
    Ok(engine)
}

examples/custom_vocabulary.rs
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut engine = SttEngine::new_default(16000)?;

    engine.set_prompt(Some(
        "Rust programming language, cargo, crates.io, GitHub, \
         async await, tokio, serde, clippy, rustfmt"
            .to_string(),
    ));

    engine.warmup()?;

    println!("Engine ready with custom vocabulary.");
    println!("Pass audio samples to engine.transcribe(&samples) to use it.");

    Ok(())
}

examples/basic.rs
fn main() -> Result<(), Box<dyn std::error::Error>> {
    println!("Creating STT engine (this may download the model on first run)...");
    let mut engine = SttEngine::new_default(16000)?;

    println!("Warming up...");
    engine.warmup()?;

    println!("Engine ready.");
    println!();
    println!("To transcribe, pass 16-bit mono PCM samples:");
    println!("    let samples: Vec<i16> = /* your audio */;");
    println!("    let text = engine.transcribe(&samples)?;");

    // Example with one second of silence (just to demonstrate the call shape).
    let samples = vec![0i16; 16_000];
    match engine.transcribe(&samples) {
        Ok(text) => println!("Transcribed (silence): {:?}", text),
        Err(e) => println!("Transcribe error: {}", e),
    }

    Ok(())
}

pub fn new(model_path: impl AsRef<Path>, input_sample_rate: u32) -> Result<Self>

Create a new engine with a custom model path.

If model_path does not exist but is a known model name, the engine attempts to download that model automatically. Otherwise, provide the full path to an existing model file.

§Arguments
  • model_path - Path to a GGML speech model, or model name
  • input_sample_rate - Sample rate of input audio (e.g., 16000, 48000)
§Example
use memo_stt::SttEngine;
// Use default model (auto-downloads if needed)
let engine = SttEngine::new_default(16000)?;

// Or specify a custom path
let engine = SttEngine::new("models/ggml-small.en-q5_1.bin", 16000)?;

Example model files:
  • ggml-small.en-q5_1.bin (~500MB) - Best balance of speed and accuracy
  • ggml-distil-large-v3-q5_1.bin (~500MB) - Higher accuracy
  • ggml-distil-large-v3-q8_0.bin (~800MB) - Highest accuracy

Models are downloaded from: https://huggingface.co/ggerganov/whisper.cpp
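
When several local model locations are possible, it may help to pick the first one that actually exists before calling new. The sketch below is an illustration only: `first_existing_model` is a hypothetical helper, not part of memo_stt, and it makes no assumption about how the crate itself resolves model names.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of memo_stt): returns the first candidate
/// that exists as a regular file, or `None` if none of them do.
fn first_existing_model<P: AsRef<Path>>(candidates: &[P]) -> Option<PathBuf> {
    for candidate in candidates {
        let path: &Path = candidate.as_ref();
        if path.is_file() {
            return Some(path.to_path_buf());
        }
    }
    None
}
```

If the helper returns `Some(path)`, that path can be passed to `SttEngine::new(path, 16000)`. If it returns `None`, falling back to `SttEngine::new_default(16000)` is one option.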

pub fn transcribe(&mut self, samples: &[i16]) -> Result<String>

Transcribe audio samples to text.

Takes PCM audio samples (16-bit signed integers) and returns transcribed text.
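
Audio capture libraries often deliver normalized f32 samples or interleaved stereo rather than mono i16. The following sketch shows one way to convert such data before calling transcribe. Both functions are hypothetical helpers, not part of memo_stt, and the stereo layout (L, R, L, R, ...) is an assumption about the input.

```rust
/// Hypothetical helper: converts normalized `f32` samples (nominally in
/// [-1.0, 1.0]) to `i16` PCM. Out-of-range values are clamped first.
fn f32_to_i16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| (s.clamp(-1.0, 1.0) * i16::MAX as f32).round() as i16)
        .collect()
}

/// Hypothetical helper: downmixes interleaved stereo frames (L, R, L, R, ...)
/// to mono by averaging each pair. A trailing unpaired sample is dropped.
fn stereo_to_mono(interleaved: &[i16]) -> Vec<i16> {
    interleaved
        .chunks_exact(2)
        // Widen to i32 so the sum of two i16 values cannot overflow.
        .map(|frame| ((frame[0] as i32 + frame[1] as i32) / 2) as i16)
        .collect()
}
```

Averaging is the simplest downmix. It can attenuate content that is out of phase between channels, so picking a single channel may be preferable for some recordings.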

§Arguments
  • samples - Audio samples as i16 PCM data at the sample rate specified when creating the engine
§Returns

Transcribed text as a String. Returns an empty string if no speech is detected.

§Example
use memo_stt::SttEngine;

let mut engine = SttEngine::new_default(16000)?;
engine.warmup()?;

// Your audio samples (16kHz, mono, i16 PCM)
let samples: Vec<i16> = vec![]; // Replace with actual audio
let text = engine.transcribe(&samples)?;
println!("{}", text);
§Audio Format Requirements
  • Format: 16-bit signed integer PCM (i16)
  • Channels: Mono
  • Sample rate: Must match the input_sample_rate provided to new() or new_default()
  • Minimum length: 1 second (16000 samples at 16kHz)
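
Short clips can be padded with silence to meet the minimum length listed above. This is a minimal sketch under two assumptions: that the one-second minimum scales with the input sample rate, and that trailing silence is acceptable for your audio. `pad_to_one_second` is a hypothetical helper, not part of memo_stt.

```rust
/// Hypothetical helper (not part of memo_stt): pads `samples` with trailing
/// silence so the buffer spans at least one second at `sample_rate`.
/// Buffers that are already long enough are returned unchanged.
fn pad_to_one_second(samples: &[i16], sample_rate: u32) -> Vec<i16> {
    // One second of audio contains `sample_rate` mono samples.
    let min_len = sample_rate as usize;
    let mut padded = samples.to_vec();
    if padded.len() < min_len {
        padded.resize(min_len, 0);
    }
    padded
}
```

The result can be passed directly as `engine.transcribe(&padded)`.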
Examples found in repository:

examples/gui_integration.rs
fn handle_record_button_click(
    engine: &mut SttEngine,
) -> Result<String, Box<dyn std::error::Error>> {
    let samples = capture_audio()?;
    let text = engine.transcribe(&samples)?;
    Ok(text)
}

pub fn set_prompt(&mut self, prompt: Option<String>)

Set initial prompt for custom vocabulary or context.

Useful for improving accuracy with domain-specific terms, names, or technical vocabulary.

§Example
use memo_stt::SttEngine;

let mut engine = SttEngine::new_default(16000)?;
engine.set_prompt(Some("Rust programming language, cargo, crates.io".to_string()));
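
When the vocabulary comes from a list (for example, a user's contact names), the prompt string can be assembled programmatically. The sketch below is illustrative: `build_prompt` is a hypothetical helper, not part of memo_stt, and the comma-separated format simply mirrors the example above.

```rust
/// Hypothetical helper (not part of memo_stt): joins non-empty vocabulary
/// terms into a comma-separated prompt. Returns `None` when no usable
/// terms remain, which matches the `Option<String>` taken by `set_prompt`.
fn build_prompt(terms: &[&str]) -> Option<String> {
    let cleaned: Vec<&str> = terms
        .iter()
        .map(|t| t.trim())
        .filter(|t| !t.is_empty())
        .collect();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned.join(", "))
    }
}
```

The result can be passed straight through as `engine.set_prompt(build_prompt(&terms))`. That passing `None` clears a previously set prompt is an assumption based on the signature, not something this page states.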

pub fn warmup(&self) -> Result<()>

Warm up the GPU to reduce first-transcription latency.

Call this after creating the engine to pre-initialize GPU resources. The first transcription after warmup will be faster.

§Example
use memo_stt::SttEngine;

let engine = SttEngine::new_default(16000)?; // warmup takes &self, so no `mut` is needed here
engine.warmup()?; // Pre-initialize GPU
// Now transcriptions will be faster
