# loqa-voice-dsp Integration Guide
**Version:** 0.1.0
**Date:** 2025-11-07
**For:** VoiceFind Mobile App Team
---
## Overview
This guide explains how to integrate the `loqa-voice-dsp` library into the VoiceFind mobile application. iOS integration is covered step by step; Android JNI support is opt-in and currently deferred (see the Android section).
## Package Contents
```
loqa-voice-dsp/
├── src/
│   ├── lib.rs              # Main library exports
│   ├── pitch.rs            # Pitch detection (YIN algorithm)
│   ├── formants.rs         # Formant extraction (LPC)
│   ├── fft.rs              # FFT utilities
│   ├── spectral.rs         # Spectral analysis
│   └── ffi/
│       ├── mod.rs          # FFI module
│       ├── ios.rs          # iOS C exports
│       └── android.rs      # Android JNI (opt-in)
├── benches/
│   └── dsp_benchmarks.rs   # Performance benchmarks
├── Cargo.toml              # Rust package manifest
├── README.md               # Quick start guide
└── INTEGRATION_GUIDE.md    # This file
```
---
## iOS Integration (Swift)
### Step 1: Build the Static Library
```bash
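# Install the iOS targets once (skip if already installed)
rustup target add aarch64-apple-ios x86_64-apple-ios aarch64-apple-ios-sim
# From the loqa-voice-dsp directory: build for device
cargo build --release --target aarch64-apple-ios
# For simulator (Intel Macs)
cargo build --release --target x86_64-apple-ios
# For simulator (Apple Silicon)
cargo build --release --target aarch64-apple-ios-sim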
```
The compiled libraries will be in:
- `target/aarch64-apple-ios/release/libloqa_voice_dsp.a` (device)
- `target/x86_64-apple-ios/release/libloqa_voice_dsp.a` (Intel simulator)
- `target/aarch64-apple-ios-sim/release/libloqa_voice_dsp.a` (M-series simulator)
### Step 2: Create XCFramework (Recommended)
This step packages the C header, so create `include/loqa_voice_dsp.h` (Step 3) before running these commands. The `-headers` option takes the directory that contains the header.
```bash
# Create universal binary for simulators
lipo -create \
  target/x86_64-apple-ios/release/libloqa_voice_dsp.a \
  target/aarch64-apple-ios-sim/release/libloqa_voice_dsp.a \
  -output libloqa_voice_dsp_sim.a
# Create XCFramework
xcodebuild -create-xcframework \
  -library target/aarch64-apple-ios/release/libloqa_voice_dsp.a \
  -headers include/ \
  -library libloqa_voice_dsp_sim.a \
  -headers include/ \
  -output LoqaVoiceDSP.xcframework
```
### Step 3: Generate C Header
Create `include/loqa_voice_dsp.h`:
```c
#ifndef LOQA_VOICE_DSP_H
#define LOQA_VOICE_DSP_H
#include <stddef.h>   // size_t
#include <stdint.h>
#include <stdbool.h>
// Pitch detection result
typedef struct {
    bool success;
    float frequency;
    float confidence;
    bool is_voiced;
} PitchResultFFI;
// Formant extraction result
typedef struct {
    bool success;
    float f1;
    float f2;
    float f3;
    float confidence;
} FormantResultFFI;
// FFT result
typedef struct {
    bool success;
    float* magnitudes_ptr;
    float* frequencies_ptr;
    size_t length;
    uint32_t sample_rate;
} FFTResultFFI;
// Spectral features result
typedef struct {
    bool success;
    float centroid;
    float tilt;
    float rolloff_95;
} SpectralFeaturesFFI;
// Function exports
PitchResultFFI loqa_detect_pitch(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    float min_frequency,
    float max_frequency
);
FormantResultFFI loqa_extract_formants(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    size_t lpc_order
);
FFTResultFFI loqa_compute_fft(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    size_t fft_size
);
SpectralFeaturesFFI loqa_analyze_spectrum(
    const FFTResultFFI* fft_result
);
// Memory management
void loqa_free_fft_result(FFTResultFFI* result);
#endif // LOQA_VOICE_DSP_H
```
### Step 4: Create Swift Wrapper (Recommended)
```swift
import Foundation
/// Swift wrapper for loqa-voice-dsp
public class LoqaVoiceDSP {
    // MARK: - Pitch Detection
    public struct PitchResult {
        public let frequency: Float
        public let confidence: Float
        public let isVoiced: Bool
    }
    public static func detectPitch(
        samples: [Float],
        sampleRate: Int = 16000,
        minFrequency: Float = 80.0,
        maxFrequency: Float = 400.0
    ) -> PitchResult? {
        let result = samples.withUnsafeBufferPointer { buffer in
            loqa_detect_pitch(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                minFrequency,
                maxFrequency
            )
        }
        guard result.success else { return nil }
        return PitchResult(
            frequency: result.frequency,
            confidence: result.confidence,
            isVoiced: result.is_voiced
        )
    }
    // MARK: - Formant Extraction
    public struct FormantResult {
        public let f1: Float
        public let f2: Float
        public let f3: Float
        public let confidence: Float
    }
    public static func extractFormants(
        samples: [Float],
        sampleRate: Int = 16000,
        lpcOrder: Int = 14
    ) -> FormantResult? {
        let result = samples.withUnsafeBufferPointer { buffer in
            loqa_extract_formants(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                lpcOrder
            )
        }
        guard result.success else { return nil }
        return FormantResult(
            f1: result.f1,
            f2: result.f2,
            f3: result.f3,
            confidence: result.confidence
        )
    }
    // MARK: - FFT & Spectral Analysis
    public struct SpectralFeatures {
        public let centroid: Float
        public let tilt: Float
        public let rolloff95: Float
    }
    public static func analyzeSpectrum(
        samples: [Float],
        sampleRate: Int = 16000,
        fftSize: Int = 2048
    ) -> SpectralFeatures? {
        // Step 1: Compute FFT
        var fftResult = samples.withUnsafeBufferPointer { buffer in
            loqa_compute_fft(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                fftSize
            )
        }
        guard fftResult.success else { return nil }
        // IMPORTANT: Always free FFT result when done
        defer { loqa_free_fft_result(&fftResult) }
        // Step 2: Analyze spectrum using FFT result
        // CORRECT: Pass pointer to FFTResultFFI struct
        let spectralResult = loqa_analyze_spectrum(&fftResult)
        guard spectralResult.success else { return nil }
        return SpectralFeatures(
            centroid: spectralResult.centroid,
            tilt: spectralResult.tilt,
            rolloff95: spectralResult.rolloff_95
        )
    }
}
```
### Step 5: Xcode Project Setup
1. **Add XCFramework to Xcode:**
   - Drag `LoqaVoiceDSP.xcframework` into your project
   - In Build Settings → Framework Search Paths, add the framework location
2. **Configure Bridging Header:**
   - Create `VoiceFind-Bridging-Header.h`
   - Add: `#import "loqa_voice_dsp.h"`
   - In Build Settings → Objective-C Bridging Header, set the path to this file
3. **Link Binary:**
   - Go to Build Phases → Link Binary With Libraries
   - Add `LoqaVoiceDSP.xcframework`
### Step 6: Usage Example (Swift)
```swift
import AVFoundation
// Example: Analyze voice sample
func analyzeVoice(audioBuffer: AVAudioPCMBuffer) {
    guard let floatChannelData = audioBuffer.floatChannelData else { return }
    // Use channel 0 only; the library expects mono input
    let samples = Array(UnsafeBufferPointer(start: floatChannelData[0],
                                            count: Int(audioBuffer.frameLength)))
    // Pass the buffer's actual sample rate instead of relying on the 16 kHz default
    let sampleRate = Int(audioBuffer.format.sampleRate)
    // Detect pitch
    if let pitch = LoqaVoiceDSP.detectPitch(samples: samples, sampleRate: sampleRate) {
        print("Pitch: \(pitch.frequency) Hz, Confidence: \(pitch.confidence)")
    }
    // Extract formants
    if let formants = LoqaVoiceDSP.extractFormants(samples: samples, sampleRate: sampleRate) {
        print("F1: \(formants.f1) Hz, F2: \(formants.f2) Hz")
    }
    // Spectral analysis
    if let spectral = LoqaVoiceDSP.analyzeSpectrum(samples: samples, sampleRate: sampleRate) {
        print("Centroid: \(spectral.centroid) Hz")
    }
}
```
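Audio callbacks rarely deliver buffers that match the analysis window exactly, so incoming samples usually need to be collected until a full window is available. The following is a minimal sketch of one way to do that; `WindowAccumulator` is a hypothetical helper, not part of `loqa-voice-dsp`, and it emits non-overlapping windows.

```swift
import Foundation

/// Hypothetical helper (not part of loqa-voice-dsp): collects incoming
/// samples and hands back fixed-size, non-overlapping analysis windows.
struct WindowAccumulator {
    let windowSize: Int
    private var pending: [Float] = []

    init(windowSize: Int) {
        precondition(windowSize > 0, "windowSize must be positive")
        self.windowSize = windowSize
    }

    /// Appends samples and returns every complete window now available.
    /// Leftover samples stay buffered for the next call.
    mutating func append(_ samples: [Float]) -> [[Float]] {
        pending.append(contentsOf: samples)
        var windows: [[Float]] = []
        while pending.count >= windowSize {
            windows.append(Array(pending.prefix(windowSize)))
            pending.removeFirst(windowSize)
        }
        return windows
    }
}
```

With a 16 kHz stream, `WindowAccumulator(windowSize: 1600)` would yield one 100ms window per 1600 samples received; each returned window could then be passed to the wrapper functions from Step 4.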
---
## Android Integration (Java/Kotlin)
**Note:** Android JNI support is currently **opt-in** and tracked in the backlog. For MVP (iOS-first launch), Android integration can be completed later.
### When Android Support is Needed:
1. **Enable JNI Feature:**
   ```bash
   cargo build --release --target aarch64-linux-android --features android-jni
   ```
2. **Add JNI Dependency:**
   - Update `Cargo.toml` to make `jni = "0.21"` a default dependency
   - Remove `#[cfg(feature = "jni")]` gates from `src/ffi/android.rs`
   - The build command above names the feature `android-jni`, while the gates use `feature = "jni"`; check `Cargo.toml` for the feature name that is actually declared
3. **Build for Android Targets:**
   ```bash
   cargo build --release --target aarch64-linux-android
   cargo build --release --target armv7-linux-androideabi
   cargo build --release --target x86_64-linux-android
   ```
4. **See `src/ffi/android.rs` for Java bridge code examples**
**Current Status:** Android JNI deferred to backlog (see `docs/backlog.md` in Loqa repository)
---
## Performance Characteristics
### Validated Benchmarks (Apple M-series Silicon)
| Operation | Measured Time | vs. Target |
|---|---|---|
| Pitch detection (100ms) | 0.125ms | ✅ 160x faster than 20ms target |
| Formant extraction (500ms) | 0.134ms | ✅ 373x faster than 50ms target |
| FFT (2048 points) | 0.020ms | ✅ 500x faster than 10ms target |
| Spectral analysis | 0.003ms | ✅ 1667x faster than 5ms target |
**Conclusion:** All operations easily meet real-time requirements for voice training applications.
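The speedup figures in the table are simply target time divided by measured time. The sketch below reproduces that arithmetic and adds a rough estimate of the combined cost per 100ms window. The `Benchmark` type is hypothetical, and the total is only indicative because the rows were measured on different window lengths and on desktop-class Apple Silicon rather than on a phone.

```swift
import Foundation

/// One row of the benchmark table above (hypothetical type for illustration).
struct Benchmark {
    let name: String
    let measuredMs: Double
    let targetMs: Double
    /// How many times faster than the target the measured time is.
    var speedup: Double { return targetMs / measuredMs }
}

let benchmarks = [
    Benchmark(name: "Pitch detection (100ms)", measuredMs: 0.125, targetMs: 20),
    Benchmark(name: "Formant extraction (500ms)", measuredMs: 0.134, targetMs: 50),
    Benchmark(name: "FFT (2048 points)", measuredMs: 0.020, targetMs: 10),
    Benchmark(name: "Spectral analysis", measuredMs: 0.003, targetMs: 5),
]

for b in benchmarks {
    print("\(b.name): \(Int(b.speedup.rounded()))x faster than target")
}

// Rough cost of running all four operations once per 100 ms of audio.
let totalMs = benchmarks.reduce(0.0) { $0 + $1.measuredMs }
let budgetUsed = totalMs / 100.0
print(String(format: "Total: %.3f ms (%.2f%% of a 100 ms window)",
             totalMs, budgetUsed * 100))
```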
---
## Audio Format Requirements
### Input Format
- **Sample Type:** `f32` (32-bit floating point)
- **Range:** -1.0 to 1.0 (normalized)
- **Sample Rate:** 16000 Hz recommended (8000-48000 Hz supported)
- **Channels:** Mono (single channel)
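If the capture pipeline produces 16-bit integer PCM or interleaved stereo, the samples need to be converted before they reach the library. The following is a minimal sketch of both conversions; the function names are hypothetical and not part of `loqa-voice-dsp`.

```swift
import Foundation

/// Converts signed 16-bit PCM samples to normalized floats.
/// Dividing by 32768 maps Int16.min to exactly -1.0 and keeps
/// Int16.max just below 1.0.
func normalizePCM16(_ pcm: [Int16]) -> [Float] {
    return pcm.map { Float($0) / 32768.0 }
}

/// Downmixes interleaved stereo (L, R, L, R, ...) to mono by averaging
/// each left/right pair. A trailing unpaired sample, if any, is dropped.
func downmixStereoToMono(_ interleaved: [Float]) -> [Float] {
    let frameCount = interleaved.count / 2
    return (0..<frameCount).map { i in
        (interleaved[2 * i] + interleaved[2 * i + 1]) * 0.5
    }
}
```

Averaging the two channels keeps the result inside the -1.0 to 1.0 range as long as both inputs are already normalized.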
### Recommended Audio Windows
- **Pitch Detection:** 100ms minimum (1600 samples at 16kHz)
- **Formant Extraction:** 500ms recommended (8000 samples at 16kHz)
- **FFT Analysis:** Power-of-2 sizes (512, 1024, 2048, 4096)
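The sample counts above follow from `sampleRate * milliseconds / 1000`, and the FFT size determines the spacing between frequency bins (`sampleRate / fftSize` for a standard DFT). The helpers below are a small sketch of these calculations; the function names are hypothetical, and it is an assumption that the library's `frequencies_ptr` array uses this standard bin spacing.

```swift
import Foundation

/// Number of samples in a window of `milliseconds` at `sampleRate` Hz.
func sampleCount(forMilliseconds milliseconds: Int, sampleRate: Int) -> Int {
    return sampleRate * milliseconds / 1000
}

/// True when `n` is a positive power of two (512, 1024, 2048, ...).
func isPowerOfTwo(_ n: Int) -> Bool {
    return n > 0 && (n & (n - 1)) == 0
}

/// Spacing between adjacent DFT bins in Hz.
func binResolution(sampleRate: Int, fftSize: Int) -> Double {
    return Double(sampleRate) / Double(fftSize)
}
```

At 16kHz, a 2048-point FFT gives bins 7.8125 Hz apart, while a 512-point FFT gives 31.25 Hz: smaller sizes are faster but resolve frequency more coarsely.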
### Converting iOS AVAudioPCMBuffer to f32
```swift
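import AVFoundation
extension AVAudioPCMBuffer {
    /// Returns channel 0 as a `[Float]`, or nil if the buffer does not hold float samples.
    func toFloatArray() -> [Float]? {
        guard let floatData = floatChannelData else { return nil }
        return Array(UnsafeBufferPointer(start: floatData[0],
                                         count: Int(frameLength)))
    }
}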
```
---
## API Reference
### Pitch Detection
```c
PitchResultFFI loqa_detect_pitch(
    const float* audio_ptr,   // Audio samples (-1.0 to 1.0)
    size_t audio_len,         // Number of samples (min 100)
    uint32_t sample_rate,     // Sample rate in Hz
    float min_frequency,      // Min expected pitch (e.g., 80.0 Hz)
    float max_frequency       // Max expected pitch (e.g., 400.0 Hz)
);
```
**Returns:**
- `success`: true if detection succeeded
- `frequency`: Detected pitch in Hz (0.0 if unvoiced)
- `confidence`: Confidence score 0.0-1.0
- `is_voiced`: true if periodic signal detected
**Recommended Ranges:**
- Male voice: 80-250 Hz
- Female voice: 165-400 Hz
- Extended range: 80-800 Hz
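One way to apply these ranges is to keep them as presets and discard results that are unvoiced, low-confidence, or outside the chosen range. The sketch below assumes a confidence threshold of 0.5, which is an illustrative value and not a library recommendation; `VoiceRange`, `PitchEstimate`, and `acceptedFrequency` are hypothetical names, with `PitchEstimate` mirroring the fields of the wrapper's `PitchResult`.

```swift
import Foundation

/// Presets for the ranges listed above (hypothetical type).
enum VoiceRange {
    case male, female, extended

    var bounds: (min: Float, max: Float) {
        switch self {
        case .male: return (80.0, 250.0)
        case .female: return (165.0, 400.0)
        case .extended: return (80.0, 800.0)
        }
    }
}

/// Stand-in that mirrors the fields of `LoqaVoiceDSP.PitchResult`.
struct PitchEstimate {
    let frequency: Float
    let confidence: Float
    let isVoiced: Bool
}

/// Returns the frequency only if the estimate is voiced, confident enough,
/// and inside the selected range; otherwise nil.
/// `minConfidence` defaults to 0.5, an assumed value for illustration.
func acceptedFrequency(_ estimate: PitchEstimate,
                       in range: VoiceRange,
                       minConfidence: Float = 0.5) -> Float? {
    let b = range.bounds
    guard estimate.isVoiced,
          estimate.confidence >= minConfidence,
          estimate.frequency >= b.min,
          estimate.frequency <= b.max else { return nil }
    return estimate.frequency
}
```

The preset's `bounds` can also be passed as `minFrequency` and `maxFrequency` when calling `detectPitch`, so the search range and the acceptance range stay in sync.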
### Formant Extraction
```c
FormantResultFFI loqa_extract_formants(
    const float* audio_ptr,   // Audio samples
    size_t audio_len,         // Number of samples (min 256)
    uint32_t sample_rate,     // Sample rate in Hz
    size_t lpc_order          // LPC order (12-16 recommended)
);
```
**Returns:**
- `success`: true if extraction succeeded
- `f1`: First formant frequency (Hz)
- `f2`: Second formant frequency (Hz)
- `f3`: Third formant frequency (Hz)
- `confidence`: 0.0-1.0 based on formant validity
**LPC Order Guidelines:**
- 12: Fast, lower accuracy
- 14: Balanced (recommended)
- 16: High accuracy, slower
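Before using formant values, it can help to check that they are physically plausible: formants are ordered by definition (F1 < F2 < F3) and cannot exceed the Nyquist frequency (half the sample rate). The following sketch applies those checks plus a confidence threshold; the 0.5 threshold is an assumption for illustration, and `FormantEstimate` and `isPlausible` are hypothetical names mirroring the wrapper's `FormantResult`.

```swift
import Foundation

/// Stand-in that mirrors the fields of `LoqaVoiceDSP.FormantResult`.
struct FormantEstimate {
    let f1: Float
    let f2: Float
    let f3: Float
    let confidence: Float
}

/// Basic sanity check: confident enough, strictly increasing formants,
/// and everything below the Nyquist frequency.
/// `minConfidence` defaults to 0.5, an assumed value for illustration.
func isPlausible(_ f: FormantEstimate,
                 sampleRate: Int,
                 minConfidence: Float = 0.5) -> Bool {
    let nyquist = Float(sampleRate) / 2.0
    return f.confidence >= minConfidence
        && f.f1 > 0
        && f.f1 < f.f2
        && f.f2 < f.f3
        && f.f3 < nyquist
}
```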
---
## Troubleshooting
### Common FFI Issues
**Error:** Stack overflow crash (SIGBUS) in `loqa_analyze_spectrum`
- **Symptom:** App crashes with `EXC_BAD_ACCESS (SIGBUS)` when calling spectral analysis
- **Cause:** API signature mismatch between Swift and Rust
- **Solution:** Ensure you're using version 0.1.1+ which has the corrected FFI signature
- **Correct Usage:**
```swift
// CORRECT (v0.1.1+)
var fftResult = loqa_compute_fft(...)
defer { loqa_free_fft_result(&fftResult) }
// Pass pointer to FFTResultFFI struct
let spectralFeatures = loqa_analyze_spectrum(&fftResult)
if spectralFeatures.success {
    print("Centroid: \(spectralFeatures.centroid) Hz")
}
```
**Best Practice:** Memory Management
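- Register `defer { loqa_free_fft_result(&fftResult) }` right after a successful `loqa_compute_fft` call, so the result is freed when the enclosing scope exits, on every return path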
- Never store or pass FFT result pointers beyond their scope
- Ensure FFT computation succeeds before analyzing spectrum
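If FFT data is needed after the scope ends, the safe pattern is to copy it into a Swift array before the native buffer is freed. The sketch below demonstrates this with a stand-in allocation so it runs without the library; `MockFFTResult`, `mockComputeFFT`, and `mockFree` are hypothetical placeholders for `FFTResultFFI`, `loqa_compute_fft`, and `loqa_free_fft_result`.

```swift
import Foundation

/// Stand-in for FFTResultFFI, for illustration only.
struct MockFFTResult {
    var magnitudes: UnsafeMutablePointer<Float>?
    var length: Int
}

/// Stand-in for loqa_compute_fft: allocates a native buffer.
func mockComputeFFT(length: Int) -> MockFFTResult {
    let ptr = UnsafeMutablePointer<Float>.allocate(capacity: length)
    ptr.initialize(repeating: 0.25, count: length)
    return MockFFTResult(magnitudes: ptr, length: length)
}

/// Stand-in for loqa_free_fft_result: releases the buffer and clears the pointer.
func mockFree(_ result: inout MockFFTResult) {
    result.magnitudes?.deallocate()
    result.magnitudes = nil
    result.length = 0
}

/// Copies the magnitudes out before the native buffer is released.
func copyMagnitudes(length: Int) -> [Float] {
    var result = mockComputeFFT(length: length)
    // Runs when this function returns, after the return value is built.
    defer { mockFree(&result) }
    guard let ptr = result.magnitudes else { return [] }
    // The Array initializer copies the elements, so the returned array
    // does not reference the native buffer.
    return Array(UnsafeBufferPointer(start: ptr, count: result.length))
}
```

The returned `[Float]` owns its own storage, so it remains valid after the `defer` block has freed the original buffer.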
### iOS Build Errors
**Error:** `library not found for -lloqa_voice_dsp`
- **Solution:** Ensure the directory containing the static library is listed in Library Search Paths, or link `LoqaVoiceDSP.xcframework` as described in Step 5
- **Check:** Build Settings → Library Search Paths → Add library location
**Error:** `Undefined symbols for architecture arm64`
- **Solution:** Verify you built for the correct target (aarch64-apple-ios)
- **Check:** Run `lipo -info target/aarch64-apple-ios/release/libloqa_voice_dsp.a` and confirm it reports `arm64`
### Performance Issues
**Symptom:** Slower than expected performance
- **Check 1:** Ensure you're using `--release` builds (not debug)
- **Check 2:** Verify sample rate isn't excessively high (16kHz recommended)
- **Check 3:** Reduce FFT size for faster processing (512 vs 4096)
### Audio Quality Issues
**Symptom:** Poor pitch detection accuracy
- **Check 1:** Audio is normalized (-1.0 to 1.0 range)
- **Check 2:** Window size is sufficient (≥100ms for pitch)
- **Check 3:** Sample rate matches expected rate (16kHz default)
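The first two checks can be automated before samples are handed to the library. The following is a minimal diagnostic sketch; `InputIssue` and `diagnose` are hypothetical names, and the near-silence threshold of 0.001 is an assumed value chosen for illustration.

```swift
import Foundation

/// Problems that commonly degrade pitch detection (hypothetical type).
enum InputIssue: Equatable {
    case tooShort(needed: Int, got: Int)
    case containsNonFinite
    case outOfRange(peak: Float)
    case nearSilent(peak: Float)
}

/// Checks window length and sample range for one analysis window.
/// `minWindowMs` defaults to the 100ms pitch window from this guide.
func diagnose(samples: [Float],
              sampleRate: Int,
              minWindowMs: Int = 100) -> [InputIssue] {
    var issues: [InputIssue] = []
    let needed = sampleRate * minWindowMs / 1000
    if samples.count < needed {
        issues.append(.tooShort(needed: needed, got: samples.count))
    }
    // NaN or infinity makes the peak check meaningless, so stop here.
    if samples.contains(where: { !$0.isFinite }) {
        issues.append(.containsNonFinite)
        return issues
    }
    let peak = samples.reduce(Float(0)) { max($0, abs($1)) }
    if peak > 1.0 {
        issues.append(.outOfRange(peak: peak))   // not normalized to -1.0...1.0
    } else if peak < 0.001 {
        issues.append(.nearSilent(peak: peak))   // assumed threshold
    }
    return issues
}
```

An empty result means the window passed these basic checks; it does not guarantee that the signal is voiced or that the sample rate passed to the library matches the recording.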
---
## Support & Contact
For integration questions or issues:
- **GitHub Issues:** [loqa repository](https://github.com/loqalabs/loqa)
- **Technical Contact:** Anna (Loqa Team)
- **Documentation:** See `README.md` in crate root
---
## Version History
- **0.1.0** (2025-11-07): Initial release
  - Pitch detection (YIN algorithm)
  - Formant extraction (LPC)
  - FFT utilities
  - Spectral analysis
  - iOS FFI layer (complete)
  - Android JNI layer (deferred to backlog)
  - Comprehensive benchmarks
  - 35 unit tests passing
---
## License
MIT License - See LICENSE file in crate root