herolib-ai 0.3.7

AI client with multi-provider support (Groq, OpenRouter, SambaNova) and automatic failover.

Overview

This crate provides a unified AI client with:

  • Multi-provider support: Automatically tries providers in order of preference
  • OpenAI-compatible API: Works with any OpenAI-compatible endpoint
  • Automatic failover: Falls back to alternative providers on failure
  • Verification support: Retry with feedback until response passes validation
  • Model abstraction: Use our model names, mapped to provider-specific IDs

Installation

Add to your Cargo.toml:

[dependencies]
herolib-ai = "0.3.7"

Environment Variables

Set API keys using environment variables:

export GROQ_API_KEY="your-groq-key"
export OPENROUTER_API_KEY="your-openrouter-key"
export SAMBANOVA_API_KEY="your-sambanova-key"

Usage

Simple Chat

use herolib_ai::{AiClient, Model, PromptBuilderExt};

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Llama3_3_70B)
    .system("You are a helpful coding assistant")
    .user("Write a hello world in Rust")
    .execute_content()
    .unwrap();

println!("{}", response);

With Verification

use herolib_ai::{AiClient, Model, PromptBuilderExt};

/// Verifies that the response is valid JSON (requires the `serde_json` crate).
/// Returns Ok(()) if valid, or Err with feedback for the AI to retry.
fn verify_json(content: &str) -> Result<(), String> {
    match serde_json::from_str::<serde_json::Value>(content) {
        Ok(_) => Ok(()),
        Err(e) => Err(format!("Invalid JSON: {}. Please output only valid JSON.", e)),
    }
}

let client = AiClient::from_env();

let response = client
    .prompt()
    .model(Model::Qwen2_5Coder32B)
    .system("You are a JSON generator. Only output valid JSON.")
    .user("Generate a JSON object with name and age fields")
    .verify(verify_json)
    .max_retries(3)
    .execute_verified()
    .unwrap();

Manual Provider Configuration

use herolib_ai::{AiClient, Provider, ProviderConfig, Model};

let client = AiClient::new()
    .with_provider(ProviderConfig::new(Provider::Groq, "your-api-key"))
    .with_provider(ProviderConfig::new(Provider::OpenRouter, "your-api-key"))
    .with_default_temperature(0.7)
    .with_default_max_tokens(2000);

Available Models

Model              | Description                           | Providers
Llama3_3_70B       | Fast, capable model for general tasks | Groq, SambaNova, OpenRouter
Llama3_1_70B       | Versatile model for various tasks     | Groq, SambaNova, OpenRouter
Llama3_1_8B        | Small, fast model for simple tasks    | Groq, SambaNova, OpenRouter
Qwen2_5Coder32B    | Specialized for code generation       | Groq, SambaNova, OpenRouter
DeepSeekCoderV2_5  | Advanced coding model                 | OpenRouter, SambaNova
DeepSeekV3         | Latest DeepSeek model                 | OpenRouter, SambaNova
Llama3_1_405B      | Largest Llama model for complex tasks | SambaNova, OpenRouter
Mixtral8x7B        | Efficient mixture of experts model    | Groq, OpenRouter
Llama3_2_90BVision | Multimodal model with vision          | Groq, OpenRouter
Llama3_2_11BVision | Smaller vision model                  | Groq, SambaNova, OpenRouter
NemotronNano30B    | NVIDIA MoE model with reasoning       | OpenRouter

Embedding Models

Embedding models convert text into vector representations for semantic search, RAG, and similarity tasks.

Model               | Description                      | Dimensions | Context (tokens) | Provider
TextEmbedding3Small | Fast, efficient OpenAI embedding | 1536       | 8,191            | OpenRouter
Qwen3Embedding8B    | Multilingual embedding model     | -          | 32,768           | OpenRouter

Embedding Usage

use herolib_ai::{AiClient, EmbeddingModel};

let client = AiClient::from_env();

// Single text embedding
let response = client
    .embed(EmbeddingModel::Qwen3Embedding8B, "Hello, world!")
    .unwrap();

println!("Vector dimensions: {}", response.embedding().unwrap().len());

// Batch embedding
let texts = vec!["Hello".to_string(), "World".to_string()];
let response = client
    .embed_batch(EmbeddingModel::Qwen3Embedding8B, texts)
    .unwrap();

for embedding in response.embeddings() {
    println!("Embedding length: {}", embedding.len());
}

Transcription Models

Speech-to-text transcription using Whisper models via Groq's ultra-fast inference.

Model               | Description                     | Speed   | Translation | Provider
WhisperLargeV3Turbo | Fast multilingual transcription | 216x RT | No          | Groq
WhisperLargeV3      | High accuracy transcription     | 189x RT | Yes         | Groq

Supported audio formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm (max 25MB).

Transcription Usage

use herolib_ai::{AiClient, TranscriptionModel, TranscriptionOptions};
use std::path::Path;

let client = AiClient::from_env();

// Simple transcription from file
let response = client
    .transcribe_file(TranscriptionModel::WhisperLargeV3Turbo, Path::new("audio.mp3"))
    .unwrap();

println!("Transcription: {}", response.text);

// With options (language hint, temperature)
let options = TranscriptionOptions::new()
    .with_language("en")
    .with_temperature(0.0);

let response = client
    .transcribe_file_with_options(
        TranscriptionModel::WhisperLargeV3Turbo,
        Path::new("audio.mp3"),
        options,
    )
    .unwrap();

// Verbose response with timestamps
let audio_bytes = std::fs::read("audio.mp3").unwrap();
let options = TranscriptionOptions::new().with_language("en");
let response = client
    .transcribe_bytes_verbose(
        TranscriptionModel::WhisperLargeV3,
        &audio_bytes,
        "audio.mp3",
        options,
    )
    .unwrap();

println!("Duration: {:?}s", response.duration);
for segment in response.segments.unwrap_or_default() {
    println!("[{:.2}s - {:.2}s] {}", segment.start, segment.end, segment.text);
}

Providers

Groq

  • Fast inference provider
  • API: https://api.groq.com/openai/v1/chat/completions
  • Env: GROQ_API_KEY

OpenRouter

  • Unified API for multiple models
  • API: https://openrouter.ai/api/v1/chat/completions
  • Env: OPENROUTER_API_KEY

SambaNova

  • High-performance AI inference
  • API: https://api.sambanova.ai/v1/chat/completions
  • Env: SAMBANOVA_API_KEY

Model Test Utility

The modeltest binary tests model availability across all configured providers.

What it does

  1. Queries provider model lists - Fetches available models from each provider's API
  2. Validates model mappings - Checks if our configured model IDs exist on each provider
  3. Tests each model - Sends a simple "whoami" query to verify the model works
  4. Generates a report - Shows success/failure status for each model on each provider

Running the test

# Build and run
cargo run --bin modeltest

# Or after building
./target/debug/modeltest

Example output

herolib-ai Model Test Utility
Testing model availability across all providers

Configured providers:
  - Groq
  - OpenRouter

======================================================================
Phase 1: Querying Provider Model Lists
======================================================================
Querying Groq... OK (42 models)
Querying OpenRouter... OK (256 models)

======================================================================
Phase 2: Validating Model Mappings
======================================================================

Llama 3.3 70B (Llama 3.3 70B - Fast, capable model for general tasks):
  Groq llama-3.3-70b-versatile -> OK
  OpenRouter meta-llama/llama-3.3-70b-instruct -> OK

======================================================================
Phase 3: Testing Models with 'whoami' Query
======================================================================

--------------------------------------------------
Testing: Llama 3.3 70B
--------------------------------------------------
  Groq (llama-3.3-70b-versatile)... OK (523ms)
    Response: I'm LLaMA, a large language model trained by Meta AI.

======================================================================
Final Report
======================================================================

Test Summary:
  Total tests: 20
  Successful:  20
  Failed:      0
  Success rate: 100.0%

All tests passed!

Building

./build.sh

Testing

./run.sh

License

Apache-2.0