Rust library for Doubao (ByteDance Volcengine) APIs.
This crate provides clients for various Doubao APIs:
- TTS: Text-to-Speech API
- ASR: Automatic Speech Recognition API
- Chat: Chat completion API (OpenAI-compatible)
- Embeddings: Text and multimodal embeddings API
- Images: Image generation API
- Tokenization: Text tokenization API
§Creating a client
use novel_doubao::Client;
use novel_doubao::config::DoubaoConfig;
// Create a client with default configuration from environment variables.
let client = Client::new();
// Or create with custom configuration.
let config = DoubaoConfig::new()
    .with_app_id("your-app-id")
    .with_api_key("your-api-key")
    .with_access_token("your-access-token")
    .with_resource_id("seed-tts-2.0");
let client = Client::with_config(config);
§Making TTS requests
use novel_doubao::Client;
use novel_doubao::spec::tts::CreateSpeechRequestArgs;
let client = Client::new();
let request = CreateSpeechRequestArgs::default()
    .text("Hello, world!")
    .speaker("zh_female_cancan_mars_bigtts")
    .sample_rate(24000u32)
    .build()?;
let response = client.tts().speech().create(request).await?;
// Save to file
response.save("output.mp3").await?;
println!("Generated {} bytes of audio", response.bytes.len());
§Making ASR requests
§Flash recognition (fastest, single request)
use novel_doubao::Client;
let client = Client::new();
// Recognize from URL
let result = client
    .asr()
    .recognition()
    .flash_url("https://example.com/audio.wav", "user-id")
    .await?;
println!("Recognized: {}", result.result.text);
§Standard recognition (for long audio files)
use novel_doubao::Client;
let client = Client::new();
// Submit and wait for result
let result = client
    .asr()
    .recognition()
    .recognize_url("https://example.com/audio.wav", "user-id")
    .await?;
println!("Recognized: {}", result.result.text);
§Streaming recognition (real-time)
use bytes::Bytes;
use novel_doubao::Client;
use novel_doubao::spec::asr::StreamingAsrConfigArgs;
let client = Client::new();
let config = StreamingAsrConfigArgs::default().rate(16000u32).build()?;
let mut session = client.asr().streaming().create_session(config).await?;
// Send audio data
session
    .send_audio(Bytes::from_static(b"audio data..."))
    .await?;
// Receive results
while let Some(result) = session.recv().await {
    println!(
        "Partial: {} (final: {})",
        result.result.text, result.is_final
    );
}
§Environment Variables
The client reads these environment variables by default:
- DOUBAO_APP_ID: Application ID
- DOUBAO_API_KEY: API key
- DOUBAO_ACCESS_TOKEN: Access token
- DOUBAO_RESOURCE_ID: Resource ID (default: "seed-tts-2.0")
- DOUBAO_HTTP_BASE: HTTP base URL (default: "https://ark.cn-beijing.volces.com/api/v3")
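The fallback behaviour listed above can be illustrated with a small standalone sketch that uses only the standard library. It is not the crate's implementation: EnvSettings, settings_from, and settings_from_env are hypothetical names introduced here, and the real DoubaoConfig may store or validate these values differently. The sketch only shows the pattern of reading each variable and falling back to the two documented defaults.

```rust
use std::env;

// Defaults copied from the list of environment variables.
const DEFAULT_RESOURCE_ID: &str = "seed-tts-2.0";
const DEFAULT_HTTP_BASE: &str = "https://ark.cn-beijing.volces.com/api/v3";

/// Hypothetical settings holder (an assumption for illustration only);
/// the crate's actual configuration type is DoubaoConfig.
#[derive(Debug, Clone, PartialEq)]
struct EnvSettings {
    app_id: Option<String>,
    api_key: Option<String>,
    access_token: Option<String>,
    resource_id: String,
    http_base: String,
}

/// Builds settings from any lookup function, so the fallback logic can be
/// exercised without modifying the process environment.
fn settings_from(lookup: impl Fn(&str) -> Option<String>) -> EnvSettings {
    EnvSettings {
        // Credentials have no documented default, so they stay optional.
        app_id: lookup("DOUBAO_APP_ID"),
        api_key: lookup("DOUBAO_API_KEY"),
        access_token: lookup("DOUBAO_ACCESS_TOKEN"),
        // These two fall back to the documented defaults when unset.
        resource_id: lookup("DOUBAO_RESOURCE_ID")
            .unwrap_or_else(|| DEFAULT_RESOURCE_ID.to_string()),
        http_base: lookup("DOUBAO_HTTP_BASE")
            .unwrap_or_else(|| DEFAULT_HTTP_BASE.to_string()),
    }
}

/// Reads the real process environment.
fn settings_from_env() -> EnvSettings {
    settings_from(|name| env::var(name).ok())
}

fn main() {
    let settings = settings_from_env();
    // Report only whether credentials are present, never their values.
    println!("app id set:       {}", settings.app_id.is_some());
    println!("api key set:      {}", settings.api_key.is_some());
    println!("access token set: {}", settings.access_token.is_some());
    println!("resource id:      {}", settings.resource_id);
    println!("http base:        {}", settings.http_base);
}
```

Passing the lookup as a closure keeps the fallback logic separate from the process environment, which makes it easy to check in isolation.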
§Features
- tts: Enable TTS (Text-to-Speech) API
- asr: Enable ASR (Automatic Speech Recognition) API
- chat: Enable Chat completion API
- embeddings: Enable Embeddings API
- images: Enable Image generation API
- tokenization: Enable Tokenization API
- full: Enable all features
- rustls: Use rustls for TLS (default)
- native-tls: Use native-tls for TLS
Modules§
- config: Configuration for Doubao API client.
- error: Error types for Doubao API client.
- spec: Type definitions for Doubao APIs.
Structs§
- Asr (feature asr): ASR (Automatic Speech Recognition) API group.
- Chat (feature chat): Chat completions API.
- Client: Doubao API client.
- Embeddings (feature embeddings): Embeddings API.
- Images (feature images): Image generation API.
- Recognition (feature asr): File-based speech recognition API.
- Speech (feature tts): Speech synthesis API.
- Streaming (feature asr): Streaming speech recognition API.
- StreamingSession (feature asr): A streaming recognition session.
- Tokenization (feature tokenization): Tokenization API.
- Tts (feature tts): TTS (Text-to-Speech) API group.