loqa-voice-dsp 0.5.0

# loqa-voice-dsp Integration Guide

**Version:** 0.1.0
**Date:** 2025-11-07
**For:** VoiceFind Mobile App Team

---

## Overview

This guide explains how to integrate the `loqa-voice-dsp` library into the VoiceFind mobile application. iOS integration is covered in full; Android JNI support is opt-in and currently deferred (see Android Integration below).

## Package Contents

```
loqa-voice-dsp/
├── src/
│   ├── lib.rs              # Main library exports
│   ├── pitch.rs            # Pitch detection (YIN algorithm)
│   ├── formants.rs         # Formant extraction (LPC)
│   ├── fft.rs              # FFT utilities
│   ├── spectral.rs         # Spectral analysis
│   └── ffi/
│       ├── mod.rs          # FFI module
│       ├── ios.rs          # iOS C exports
│       └── android.rs      # Android JNI (opt-in)
├── benches/
│   └── dsp_benchmarks.rs   # Performance benchmarks
├── Cargo.toml              # Rust package manifest
├── README.md               # Quick start guide
└── INTEGRATION_GUIDE.md    # This file
```

---

## iOS Integration (Swift)

### Step 1: Build the Static Library

```bash
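# One-time setup: install the iOS targets used in this step
rustup target add aarch64-apple-ios x86_64-apple-ios aarch64-apple-ios-sim

# From the loqa-voice-dsp directory (device build)
cargo build --release --target aarch64-apple-ios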

# For simulator (Intel Macs)
cargo build --release --target x86_64-apple-ios

# For simulator (Apple Silicon)
cargo build --release --target aarch64-apple-ios-sim
```

The compiled libraries will be in:

- `target/aarch64-apple-ios/release/libloqa_voice_dsp.a` (device)
- `target/x86_64-apple-ios/release/libloqa_voice_dsp.a` (Intel simulator)
- `target/aarch64-apple-ios-sim/release/libloqa_voice_dsp.a` (M-series simulator)

### Step 2: Create XCFramework (Recommended)

```bash
# Create universal binary for simulators
lipo -create \
  target/x86_64-apple-ios/release/libloqa_voice_dsp.a \
  target/aarch64-apple-ios-sim/release/libloqa_voice_dsp.a \
  -output libloqa_voice_dsp_sim.a

# Create XCFramework
# -headers takes a directory; include/ must already contain
# loqa_voice_dsp.h (see Step 3)
xcodebuild -create-xcframework \
  -library target/aarch64-apple-ios/release/libloqa_voice_dsp.a \
  -headers include \
  -library libloqa_voice_dsp_sim.a \
  -headers include \
  -output LoqaVoiceDSP.xcframework
```

### Step 3: Generate C Header

Create `include/loqa_voice_dsp.h`. The `xcodebuild` command in Step 2 reads this header, so create it before running that command:

```c
#ifndef LOQA_VOICE_DSP_H
#define LOQA_VOICE_DSP_H

#include <stddef.h>   // size_t
#include <stdint.h>   // uint32_t
#include <stdbool.h>  // bool

// Pitch detection result
typedef struct {
    bool success;
    float frequency;
    float confidence;
    bool is_voiced;
} PitchResultFFI;

// Formant extraction result
typedef struct {
    bool success;
    float f1;
    float f2;
    float f3;
    float confidence;
} FormantResultFFI;

// FFT result
typedef struct {
    bool success;
    float* magnitudes_ptr;
    float* frequencies_ptr;
    size_t length;
    uint32_t sample_rate;
} FFTResultFFI;

// Spectral features result
typedef struct {
    bool success;
    float centroid;
    float tilt;
    float rolloff_95;
} SpectralFeaturesFFI;

// Function exports
PitchResultFFI loqa_detect_pitch(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    float min_frequency,
    float max_frequency
);

FormantResultFFI loqa_extract_formants(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    size_t lpc_order
);

FFTResultFFI loqa_compute_fft(
    const float* audio_ptr,
    size_t audio_len,
    uint32_t sample_rate,
    size_t fft_size
);

SpectralFeaturesFFI loqa_analyze_spectrum(
    const FFTResultFFI* fft_result
);

// Memory management
void loqa_free_fft_result(FFTResultFFI* result);

#endif // LOQA_VOICE_DSP_H
```
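
One way to sanity-check that the Rust side and the header agree is to compare struct size and alignment. The sketch below mirrors two of the header's structs with `#[repr(C)]`. The Rust definitions and the `layout_report` helper are illustrative assumptions based on the header above, not the crate's actual source.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical Rust mirror of `PitchResultFFI` from the C header.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PitchResultFFI {
    pub success: bool,
    pub frequency: f32,
    pub confidence: f32,
    pub is_voiced: bool,
}

/// Hypothetical Rust mirror of `FormantResultFFI` from the C header.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FormantResultFFI {
    pub success: bool,
    pub f1: f32,
    pub f2: f32,
    pub f3: f32,
    pub confidence: f32,
}

/// Returns (size, alignment) in bytes for each mirrored struct.
pub fn layout_report() -> [(usize, usize); 2] {
    [
        (size_of::<PitchResultFFI>(), align_of::<PitchResultFFI>()),
        (size_of::<FormantResultFFI>(), align_of::<FormantResultFFI>()),
    ]
}
```

With C layout rules, each `bool` is padded to the 4-byte alignment of the neighbouring `float` fields, so on common targets the two structs should occupy 16 and 20 bytes. Comparing these numbers with `sizeof` on the C side is a quick way to catch a header that has drifted from the library.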

### Step 4: Create Swift Wrapper (Recommended)

```swift
import Foundation

/// Swift wrapper for loqa-voice-dsp
public class LoqaVoiceDSP {

    // MARK: - Pitch Detection

    public struct PitchResult {
        public let frequency: Float
        public let confidence: Float
        public let isVoiced: Bool
    }

    public static func detectPitch(
        samples: [Float],
        sampleRate: Int = 16000,
        minFrequency: Float = 80.0,
        maxFrequency: Float = 400.0
    ) -> PitchResult? {
        let result = samples.withUnsafeBufferPointer { buffer in
            loqa_detect_pitch(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                minFrequency,
                maxFrequency
            )
        }

        guard result.success else { return nil }

        return PitchResult(
            frequency: result.frequency,
            confidence: result.confidence,
            isVoiced: result.is_voiced
        )
    }

    // MARK: - Formant Extraction

    public struct FormantResult {
        public let f1: Float
        public let f2: Float
        public let f3: Float
        public let confidence: Float
    }

    public static func extractFormants(
        samples: [Float],
        sampleRate: Int = 16000,
        lpcOrder: Int = 14
    ) -> FormantResult? {
        let result = samples.withUnsafeBufferPointer { buffer in
            loqa_extract_formants(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                lpcOrder
            )
        }

        guard result.success else { return nil }

        return FormantResult(
            f1: result.f1,
            f2: result.f2,
            f3: result.f3,
            confidence: result.confidence
        )
    }

    // MARK: - FFT & Spectral Analysis

    public struct SpectralFeatures {
        public let centroid: Float
        public let tilt: Float
        public let rolloff95: Float
    }

    public static func analyzeSpectrum(
        samples: [Float],
        sampleRate: Int = 16000,
        fftSize: Int = 2048
    ) -> SpectralFeatures? {
        // Step 1: Compute FFT
        var fftResult = samples.withUnsafeBufferPointer { buffer in
            loqa_compute_fft(
                buffer.baseAddress,
                buffer.count,
                UInt32(sampleRate),
                fftSize
            )
        }

        guard fftResult.success else { return nil }

        // IMPORTANT: Always free FFT result when done
        defer { loqa_free_fft_result(&fftResult) }

        // Step 2: Analyze spectrum using FFT result
        // CORRECT: Pass pointer to FFTResultFFI struct
        let spectralResult = loqa_analyze_spectrum(&fftResult)

        guard spectralResult.success else { return nil }

        return SpectralFeatures(
            centroid: spectralResult.centroid,
            tilt: spectralResult.tilt,
            rolloff95: spectralResult.rolloff_95
        )
    }
}
```

### Step 5: Xcode Project Setup

1. **Add XCFramework to Xcode:**

   - Drag `LoqaVoiceDSP.xcframework` into your project
   - In Build Settings → Framework Search Paths, add the framework location

2. **Configure Bridging Header:**

   - Create `VoiceFind-Bridging-Header.h`
   - Add: `#import "loqa_voice_dsp.h"`
   - Set Build Settings → Objective-C Bridging Header to the path of this file

3. **Link Binary:**
   - Go to Build Phases → Link Binary With Libraries
   - Add `LoqaVoiceDSP.xcframework`

### Step 6: Usage Example (Swift)

```swift
import AVFoundation

// Example: Analyze voice sample
func analyzeVoice(audioBuffer: AVAudioPCMBuffer) {
    guard let floatChannelData = audioBuffer.floatChannelData else { return }
    // Only the first channel is read; the library expects mono input
    let samples = Array(UnsafeBufferPointer(start: floatChannelData[0],
                                             count: Int(audioBuffer.frameLength)))
    // Pass the buffer's actual rate; the wrapper defaults assume 16 kHz
    let sampleRate = Int(audioBuffer.format.sampleRate)

    // Detect pitch
    if let pitch = LoqaVoiceDSP.detectPitch(samples: samples, sampleRate: sampleRate) {
        print("Pitch: \(pitch.frequency) Hz, Confidence: \(pitch.confidence)")
    }

    // Extract formants
    if let formants = LoqaVoiceDSP.extractFormants(samples: samples, sampleRate: sampleRate) {
        print("F1: \(formants.f1) Hz, F2: \(formants.f2) Hz")
    }

    // Spectral analysis
    if let spectral = LoqaVoiceDSP.analyzeSpectrum(samples: samples, sampleRate: sampleRate) {
        print("Centroid: \(spectral.centroid) Hz")
    }
}
```
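
To help interpret the `centroid` and `rolloff95` values returned by spectral analysis, the sketch below computes the textbook versions of both features from a list of bin frequencies and magnitudes. This is an assumption about the general definitions; the library's exact formulas (for example, whether it weights by magnitude or by power) are not confirmed here, and the function names are hypothetical.

```rust
/// Magnitude-weighted mean frequency: sum(f * m) / sum(m).
/// Returns None for mismatched inputs or a silent spectrum.
pub fn spectral_centroid(freqs: &[f32], mags: &[f32]) -> Option<f32> {
    let total: f32 = mags.iter().sum();
    if freqs.len() != mags.len() || total <= 0.0 {
        return None;
    }
    let weighted: f32 = freqs.iter().zip(mags).map(|(f, m)| *f * *m).sum();
    Some(weighted / total)
}

/// Lowest bin frequency at which the cumulative magnitude reaches
/// `fraction` of the total (0.95 corresponds to a 95% rolloff).
pub fn spectral_rolloff(freqs: &[f32], mags: &[f32], fraction: f32) -> Option<f32> {
    let total: f32 = mags.iter().sum();
    if freqs.len() != mags.len() || total <= 0.0 {
        return None;
    }
    let threshold = fraction * total;
    let mut cumulative = 0.0_f32;
    for (f, m) in freqs.iter().zip(mags) {
        cumulative += *m;
        if cumulative >= threshold {
            return Some(*f);
        }
    }
    // Only reachable through rounding; fall back to the highest bin.
    freqs.last().copied()
}
```

A higher centroid generally indicates more energy in the upper part of the spectrum, which is why it is often read as a rough brightness measure.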

---

## Android Integration (Java/Kotlin)

**Note:** Android JNI support is currently **opt-in** and tracked in the backlog. For MVP (iOS-first launch), Android integration can be completed later.

### When Android Support is Needed

1. **Enable JNI Feature:**

   ```bash
   cargo build --release --target aarch64-linux-android --features android-jni
   ```

2. **Add JNI Dependency:**

   - Update `Cargo.toml` to make `jni = "0.21"` a default dependency
   - Remove `#[cfg(feature = "jni")]` gates from `src/ffi/android.rs`

3. **Build for Android Targets:**

   ```bash
   # ARM64 (most devices)
   cargo build --release --target aarch64-linux-android

   # ARMv7 (older devices)
   cargo build --release --target armv7-linux-androideabi

   # x86_64 (emulator)
   cargo build --release --target x86_64-linux-android
   ```

4. **See `src/ffi/android.rs` for Java bridge code examples**

**Current Status:** Android JNI deferred to backlog (see `docs/backlog.md` in Loqa repository)

---

## Performance Characteristics

### Validated Benchmarks (Apple M-series Silicon)

| Operation                  | Latency | Real-time Capability            |
| -------------------------- | ------- | ------------------------------- |
| Pitch detection (100ms)    | 0.125ms | ✅ 160x faster than 20ms target |
| Formant extraction (500ms) | 0.134ms | ✅ 373x faster than 50ms target |
| FFT (2048 points)          | 0.020ms | ✅ 500x faster than 10ms target |
| Spectral analysis          | 0.003ms | ✅ 1667x faster than 5ms target |

**Conclusion:** All operations easily meet real-time requirements for voice training applications.
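
The multipliers in the last column follow from dividing each latency target by the measured latency. A minimal sketch of that arithmetic, using the figures from the table (the function names are illustrative):

```rust
/// Ratio between a latency budget and a measured latency (both in ms).
pub fn speedup(target_ms: f64, measured_ms: f64) -> f64 {
    target_ms / measured_ms
}

/// Recomputes the table's multipliers, rounded to the nearest integer.
pub fn table_speedups() -> Vec<(&'static str, u64)> {
    // (operation, target in ms, measured latency in ms)
    let rows = [
        ("Pitch detection", 20.0, 0.125),
        ("Formant extraction", 50.0, 0.134),
        ("FFT (2048 points)", 10.0, 0.020),
        ("Spectral analysis", 5.0, 0.003),
    ];
    rows.iter()
        .map(|&(name, target, measured)| (name, speedup(target, measured).round() as u64))
        .collect()
}
```

The same function can be reused with latencies measured on your own target devices, since the table reflects Apple M-series hardware only.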

---

## Audio Format Requirements

### Input Format

- **Sample Type:** `f32` (32-bit floating point)
- **Range:** -1.0 to 1.0 (normalized)
- **Sample Rate:** 16000 Hz recommended (8000-48000 Hz supported)
- **Channels:** Mono (single channel)
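
If your capture pipeline produces 16-bit integer PCM or interleaved stereo, the samples need converting before they match the format above. The following is a minimal sketch under those assumptions; both helper names are hypothetical and not part of the library.

```rust
/// Converts signed 16-bit PCM to f32 in the range -1.0 to just under 1.0
/// by dividing by 32768 (the magnitude of i16::MIN).
pub fn pcm16_to_f32(pcm: &[i16]) -> Vec<f32> {
    pcm.iter().map(|&s| f32::from(s) / 32768.0).collect()
}

/// Averages interleaved multi-channel frames down to a single channel.
/// Trailing samples that do not form a complete frame are ignored.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels == 0 {
        return Vec::new();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}
```

Averaging the channels keeps the result inside the normalized range, whereas summing them could push peaks beyond 1.0.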

### Recommended Audio Windows

- **Pitch Detection:** 100ms minimum (1600 samples at 16kHz)
- **Formant Extraction:** 500ms recommended (8000 samples at 16kHz)
- **FFT Analysis:** Power-of-2 sizes (512, 1024, 2048, 4096)
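
The sample counts above come from multiplying the window duration by the sample rate. The sketch below shows that conversion along with a power-of-2 check for FFT sizes. The helper names are illustrative, and how the library handles an `fft_size` that differs from `audio_len` is not specified here.

```rust
/// Number of samples covering `duration_ms` milliseconds at `sample_rate` Hz.
pub fn window_samples(duration_ms: u32, sample_rate: u32) -> usize {
    (u64::from(duration_ms) * u64::from(sample_rate) / 1000) as usize
}

/// True when `fft_size` is one of the power-of-2 sizes the guide recommends.
pub fn is_valid_fft_size(fft_size: usize) -> bool {
    fft_size.is_power_of_two()
}

/// Smallest power of 2 that is at least `len` samples long.
pub fn next_fft_size(len: usize) -> usize {
    len.next_power_of_two()
}
```

For example, a 100 ms window at 16 kHz is 1600 samples, which is not a power of 2; the next valid FFT size above it is 2048.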

### Converting iOS AVAudioPCMBuffer to f32

```swift
extension AVAudioPCMBuffer {
    func toFloatArray() -> [Float]? {
        guard let floatData = floatChannelData else { return nil }
        return Array(UnsafeBufferPointer(start: floatData[0],
                                          count: Int(frameLength)))
    }
}
```

---

## API Reference

### Pitch Detection

```c
PitchResultFFI loqa_detect_pitch(
    const float* audio_ptr,      // Audio samples (-1.0 to 1.0)
    size_t audio_len,             // Number of samples (min 100)
    uint32_t sample_rate,         // Sample rate in Hz
    float min_frequency,          // Min expected pitch (e.g., 80.0 Hz)
    float max_frequency           // Max expected pitch (e.g., 400.0 Hz)
);
```

**Returns:**

- `success`: true if detection succeeded
- `frequency`: Detected pitch in Hz (0.0 if unvoiced)
- `confidence`: Confidence score 0.0-1.0
- `is_voiced`: true if periodic signal detected

**Recommended Ranges:**

- Male voice: 80-250 Hz
- Female voice: 165-400 Hz
- Extended range: 80-800 Hz
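
The frequency range also determines how much audio a lag-based detector needs. As a rough sketch, assuming the general rule that a detector such as YIN needs about two periods of the lowest expected pitch (this is not a statement about the library's internals, and the function names are hypothetical):

```rust
/// Converts a pitch search range in Hz to a lag range in samples.
/// Returns None if the range is empty or not positive.
pub fn lag_range(sample_rate: u32, min_hz: f32, max_hz: f32) -> Option<(usize, usize)> {
    if !(min_hz > 0.0 && max_hz > min_hz) {
        return None;
    }
    let sr = sample_rate as f32;
    let min_lag = (sr / max_hz).floor() as usize; // shortest period searched
    let max_lag = (sr / min_hz).ceil() as usize; // longest period searched
    Some((min_lag, max_lag))
}

/// Rule-of-thumb minimum window: two periods of the lowest frequency.
pub fn min_window_samples(sample_rate: u32, min_hz: f32, max_hz: f32) -> Option<usize> {
    lag_range(sample_rate, min_hz, max_hz).map(|(_, max_lag)| 2 * max_lag)
}
```

At 16 kHz with the default 80-400 Hz range, the lags span 40 to 200 samples, so the rule of thumb gives 400 samples. The 100 ms (1600-sample) window recommended earlier sits comfortably above that.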

### Formant Extraction

```c
FormantResultFFI loqa_extract_formants(
    const float* audio_ptr,      // Audio samples
    size_t audio_len,             // Number of samples (min 256)
    uint32_t sample_rate,         // Sample rate in Hz
    size_t lpc_order              // LPC order (12-16 recommended)
);
```

**Returns:**

- `f1`: First formant frequency (Hz)
- `f2`: Second formant frequency (Hz)
- `f3`: Third formant frequency (Hz)
- `confidence`: 0.0-1.0 based on formant validity

**LPC Order Guidelines:**

- 12: Fast, lower accuracy
- 14: Balanced (recommended)
- 16: High accuracy, slower
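
Alongside the `confidence` score, callers may want their own plausibility check on the returned formants. The sketch below applies two generic constraints: formants must be in ascending order and below the Nyquist frequency. It is an illustrative assumption and does not reproduce the library's own validity rules.

```rust
/// Returns true when the formants are positive, strictly ascending,
/// and below half the sample rate (the Nyquist frequency).
pub fn formants_plausible(f1: f32, f2: f32, f3: f32, sample_rate: u32) -> bool {
    let nyquist = sample_rate as f32 / 2.0;
    f1 > 0.0 && f1 < f2 && f2 < f3 && f3 < nyquist
}
```

A result that fails this check is a reasonable candidate to discard or retry with a longer window, regardless of the reported confidence.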

---

## Troubleshooting

### Common FFI Issues

**Error:** Stack overflow crash (SIGBUS) in `loqa_analyze_spectrum`

- **Symptom:** App crashes with `EXC_BAD_ACCESS (SIGBUS)` when calling spectral analysis
- **Cause:** API signature mismatch between Swift and Rust
- **Solution:** Ensure you're using version 0.1.1+ which has the corrected FFI signature
- **Correct Usage:**

  ```swift
  // CORRECT (v0.1.1+)
  var fftResult = loqa_compute_fft(...)
  defer { loqa_free_fft_result(&fftResult) }

  // Pass pointer to FFTResultFFI struct
  let spectralFeatures = loqa_analyze_spectrum(&fftResult)

  if spectralFeatures.success {
      print("Centroid: \(spectralFeatures.centroid) Hz")
  }
  ```

**Best Practice:** Memory Management

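- Place `defer { loqa_free_fft_result(&fftResult) }` directly after a successful `loqa_compute_fft` call so the result is freed when the enclosing scope exits
- Never store or pass FFT result pointers beyond that scope; `magnitudes_ptr` and `frequencies_ptr` are invalid once the result has been freed
- Check `success` on the FFT result before calling `loqa_analyze_spectrum`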
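
The reason a dedicated free function exists is that buffers allocated by Rust must be released by Rust's allocator, not by the caller. The sketch below illustrates that ownership hand-off with a simplified buffer type. `SketchBuffer`, `sketch_alloc`, and `sketch_free` are hypothetical names for illustration and are not the library's implementation.

```rust
use std::ptr;

/// Simplified stand-in for an FFI result that owns a heap buffer.
#[repr(C)]
pub struct SketchBuffer {
    pub ptr: *mut f32,
    pub len: usize,
}

/// Moves a Vec onto the heap and hands ownership to the caller as a raw pointer.
pub fn sketch_alloc(values: Vec<f32>) -> SketchBuffer {
    let boxed: Box<[f32]> = values.into_boxed_slice();
    let len = boxed.len();
    let ptr = Box::into_raw(boxed) as *mut f32;
    SketchBuffer { ptr, len }
}

/// Reclaims the buffer and nulls the pointer so a second call is a no-op.
pub fn sketch_free(buf: &mut SketchBuffer) {
    if buf.ptr.is_null() {
        return;
    }
    // SAFETY: ptr and len came from Box::into_raw in sketch_alloc and
    // have not been freed yet (the pointer is nulled after freeing).
    unsafe {
        drop(Box::from_raw(ptr::slice_from_raw_parts_mut(buf.ptr, buf.len)));
    }
    buf.ptr = ptr::null_mut();
    buf.len = 0;
}
```

Whether the real `loqa_free_fft_result` nulls its pointers after freeing is not documented in this guide, so treat a result as unusable after the first free rather than relying on repeated calls being safe.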

### iOS Build Errors

**Error:** `library not found for -lloqa_voice_dsp`

- **Solution:** Ensure the directory containing `libloqa_voice_dsp.a` is listed in Library Search Paths (Framework Search Paths applies to frameworks, not to a bare `.a` file)
- **Check:** Build Settings → Library Search Paths → Add library location

**Error:** `Undefined symbols for architecture arm64`

- **Solution:** Verify you built for the correct target (`aarch64-apple-ios` for devices)
- **Check:** Run `lipo -info target/aarch64-apple-ios/release/libloqa_voice_dsp.a` and confirm it reports `arm64`

### Performance Issues

**Symptom:** Slower than expected performance

- **Check 1:** Ensure you're using `--release` builds (not debug)
- **Check 2:** Verify sample rate isn't excessively high (16kHz recommended)
- **Check 3:** Reduce FFT size for faster processing (512 vs 4096)
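
Reducing the FFT size trades frequency resolution for speed, because each bin covers `sample_rate / fft_size` Hz. A small sketch of that relationship (helper names are illustrative):

```rust
/// Width of one FFT bin in Hz.
pub fn bin_resolution_hz(sample_rate: u32, fft_size: usize) -> f32 {
    sample_rate as f32 / fft_size as f32
}

/// Centre frequency of bin `bin` in Hz.
pub fn bin_frequency_hz(bin: usize, sample_rate: u32, fft_size: usize) -> f32 {
    bin as f32 * bin_resolution_hz(sample_rate, fft_size)
}
```

At 16 kHz, a 512-point FFT resolves 31.25 Hz per bin, while a 4096-point FFT resolves about 3.9 Hz per bin. Pick the smallest size whose resolution is still adequate for the feature you need.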

### Audio Quality Issues

**Symptom:** Poor pitch detection accuracy

- **Check 1:** Audio is normalized (-1.0 to 1.0 range)
- **Check 2:** Window size is sufficient (≥100ms for pitch)
- **Check 3:** Sample rate matches expected rate (16kHz default)
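
The first two checks can be automated before samples are handed to the library. The sketch below is a hypothetical pre-flight helper; the thresholds come from this guide's recommendations, and the function names and messages are assumptions.

```rust
/// Largest absolute sample value (0.0 for an empty slice).
pub fn peak_amplitude(samples: &[f32]) -> f32 {
    samples.iter().fold(0.0_f32, |peak, &s| peak.max(s.abs()))
}

/// Lists problems that commonly degrade pitch detection accuracy.
pub fn check_input(samples: &[f32], sample_rate: u32) -> Vec<&'static str> {
    let mut problems = Vec::new();
    if samples.iter().any(|s| !s.is_finite()) {
        problems.push("contains NaN or infinite samples");
    }
    if peak_amplitude(samples) > 1.0 {
        problems.push("not normalized to -1.0..1.0");
    }
    // 100 ms at `sample_rate` Hz is sample_rate / 10 samples.
    if (samples.len() as u64) * 10 < u64::from(sample_rate) {
        problems.push("window shorter than 100 ms");
    }
    problems
}
```

An empty list means the input passes these basic checks; it does not guarantee that the sample rate passed to the library matches the rate the audio was actually recorded at, which still needs verifying separately.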

---

## Support & Contact

For integration questions or issues:

- **GitHub Issues:** [loqa repository](https://github.com/loqalabs/loqa)
- **Technical Contact:** Anna (Loqa Team)
- **Documentation:** See `README.md` in crate root

---

## Version History

- **0.1.0** (2025-11-07): Initial release
  - Pitch detection (YIN algorithm)
  - Formant extraction (LPC)
  - FFT utilities
  - Spectral analysis
  - iOS FFI layer (complete)
  - Android JNI layer (deferred to backlog)
  - Comprehensive benchmarks
  - 35 unit tests passing

---

## License

MIT License - See LICENSE file in crate root