# wme-client
A robust, production-ready HTTP client for the [Wikimedia Enterprise API](https://enterprise.wikimedia.com/docs/).
## Features
- **Complete API Coverage**: Access all Wikimedia Enterprise endpoints including metadata, on-demand, snapshots, and realtime streaming
- **Authentication Management**: Automatic token refresh and secure credential handling
- **Resilient by Design**: Built-in retry logic with exponential backoff, jitter, and circuit breaker patterns
- **Streaming Support**: Efficiently handle large snapshot downloads and realtime SSE streams
- **Type-Safe**: Full Rust type definitions for all API responses
- **Async/Await**: Built on tokio for high-performance asynchronous operations
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
wme-client = "0.1.3"
tokio = { version = "1", features = ["full"] }
```
## Quick Start
```rust
use wme_client::WmeClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a client with authentication
    let client = WmeClient::builder()
        .credentials("username", "password")
        .build()
        .await?;

    // List available projects
    let projects = client.metadata().list_projects().await?;
    println!("Available projects: {:?}", projects);

    Ok(())
}
```
## Authentication
The client supports username/password authentication with automatic token management:
```rust
let client = WmeClient::builder()
    .credentials("your_username", "your_password")
    .build()
    .await?;
```
Tokens are automatically refreshed before expiration. You can also manually revoke tokens:
```rust
// Get the token manager
if let Some(token_manager) = client.token_manager() {
    token_manager.revoke_token().await?;
}
```
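
The "refresh before expiration" behaviour described above can be pictured as a simple time check. The following is a minimal sketch under stated assumptions, not the crate's actual implementation: `CachedToken`, `needs_refresh`, and the safety margin are hypothetical names introduced only for illustration.

```rust
use std::time::{Duration, Instant};

/// Hypothetical cached token; the field name is illustrative only.
struct CachedToken {
    expires_at: Instant,
}

/// Returns true when the token has expired or will expire within `margin`.
fn needs_refresh(token: &CachedToken, now: Instant, margin: Duration) -> bool {
    // Refresh as soon as `now + margin` reaches the expiry time.
    now + margin >= token.expires_at
}
```

A margin of roughly a minute would leave room for the refresh request itself to complete before in-flight calls start failing; the value the crate really uses is not stated here.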
## API Clients
### Metadata Client
Discover available projects, languages, and namespaces:
```rust
let metadata = client.metadata();
// List all projects
let projects = metadata.list_projects().await?;
// Get specific project info
let wikipedia = metadata.get_project("en.wikipedia").await?;
// List languages
let languages = metadata.list_languages().await?;
// List namespaces
let namespaces = metadata.list_namespaces().await?;
```
### On-Demand Client
Fetch individual articles:
```rust
let on_demand = client.on_demand();
// Get a single article
let articles = on_demand.get_article("Rust (programming language)", None).await?;
// Get multiple articles efficiently
let articles = on_demand.get_articles(&["NASA", "SpaceX"], None).await?;
// Get structured article data (BETA)
let structured = on_demand.get_structured_article("Python (programming language)", None).await?;
```
### Snapshot Client
Download bulk data snapshots:
```rust
use futures::StreamExt;

let snapshot = client.snapshot();

// List available snapshots
let snapshots = snapshot.list_snapshots().await?;

// `snapshot_id` and `chunk_id` below are placeholders for identifiers
// taken from the listings.

// Get snapshot metadata
let info = snapshot.get_snapshot_info(&snapshot_id).await?;

// Download a snapshot as a stream (this example buffers it in memory)
let mut stream = snapshot.download_snapshot(&snapshot_id, None).await?;
let mut data = Vec::new();
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    data.extend_from_slice(&chunk);
}

// Download specific chunks
let chunks = snapshot.list_chunks(&snapshot_id).await?;
let mut stream = snapshot.download_chunk(&snapshot_id, &chunk_id, None).await?;
```
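
The download loop above collects every chunk into a `Vec`, which may be impractical for large snapshots. A minimal sketch of writing chunks straight to any `std::io::Write` target is given here. It uses a synchronous iterator as a stand-in for the async stream, and `write_chunks` is a hypothetical helper that is not part of wme-client.

```rust
use std::io::{self, Write};

/// Drains `chunks` into `out` and returns the total number of bytes written.
/// `chunks` is a synchronous stand-in for the async byte stream.
fn write_chunks<W, I>(out: &mut W, chunks: I) -> io::Result<u64>
where
    W: Write,
    I: IntoIterator<Item = io::Result<Vec<u8>>>,
{
    let mut total = 0u64;
    for chunk in chunks {
        let chunk = chunk?; // stop at the first failed chunk
        out.write_all(&chunk)?;
        total += chunk.len() as u64;
    }
    out.flush()?;
    Ok(total)
}
```

With the real stream, the same idea would apply inside the `while let` loop: write each chunk to a buffered file handle instead of extending an in-memory buffer.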
### Realtime Client
Stream article updates in real time:
```rust
use wme_client::RealtimeConnectOptions;
use chrono::{Duration, Utc};
use futures::StreamExt;

let realtime = client.realtime();

// Connect to the live stream, starting from one hour ago
let options = RealtimeConnectOptions::since(Utc::now() - Duration::hours(1));
let mut stream = realtime.connect(&options, None).await?;

while let Some(result) = stream.next().await {
    match result {
        Ok(update) => println!("Updated: {}", update.article.name),
        Err(e) => eprintln!("Error: {}", e),
    }
}
```
#### Realtime Batches
For historical realtime data, use batches:
```rust
// List available batches
let batches = realtime.list_batches("2024-01-15", "12").await?;

// Stream a batch as parsed articles
let mut stream = realtime.stream_batch("2024-01-15", "12", "batch_001").await?;
while let Some(result) = stream.next().await {
    match result {
        Ok(article) => println!("Article: {}", article.name),
        Err(e) => eprintln!("Parse error: {}", e),
    }
}
```
## Configuration
### Retry Configuration
Customize retry behavior for your use case:
```rust
use wme_client::RetryConfig;
use std::time::Duration;

// Production-grade configuration
let retry = RetryConfig::production();

// Development configuration (faster retries)
let retry = RetryConfig::development();

// Batch processing (more retries, longer delays)
let retry = RetryConfig::batch_processing();

// Custom configuration
let retry = RetryConfig::new()
    .with_max_retries(5)
    .with_base_delay(Duration::from_secs(1))
    .with_max_delay(Duration::from_secs(60))
    .with_jitter(0.25);

let client = WmeClient::builder()
    .credentials("username", "password")
    .retry(retry)
    .build()
    .await?;
```
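
To make the interaction between base delay, maximum delay, and jitter concrete, here is a minimal sketch of one common way to compute an exponential backoff delay. It assumes the jitter value is a symmetric fraction (0.25 meaning plus or minus 25%), which is an assumption about how the crate interprets it. `backoff_delay` is a hypothetical function, and the random input is passed in explicitly so the result is reproducible.

```rust
use std::time::Duration;

/// Computes the delay before retry number `attempt` (0-based).
/// `jitter` is a fraction such as 0.25; `unit_random` must lie in [0, 1).
fn backoff_delay(
    base: Duration,
    max: Duration,
    jitter: f64,
    attempt: u32,
    unit_random: f64,
) -> Duration {
    // Exponential growth: base * 2^attempt, capped at `max`.
    let factor = 2f64.powi(attempt.min(32) as i32);
    let capped = (base.as_secs_f64() * factor).min(max.as_secs_f64());
    // Map unit_random from [0, 1) to [-jitter, +jitter) and scale the delay.
    let spread = 1.0 + jitter * (2.0 * unit_random - 1.0);
    Duration::from_secs_f64((capped * spread).max(0.0))
}
```

Under these assumptions, a 1 s base and a 60 s cap give nominal delays of 1 s, 2 s, 4 s, 8 s and so on until the cap is reached, each shifted by up to a quarter in either direction.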
### Custom Base URLs
For testing or private deployments:
```rust
let client = WmeClient::builder()
    .api_url("https://api.example.com")
    .auth_url("https://auth.example.com")
    .realtime_url("https://realtime.example.com")
    .credentials("username", "password")
    .build()
    .await?;
```
### Timeout Configuration
```rust
use std::time::Duration;

let client = WmeClient::builder()
    .credentials("username", "password")
    .timeout(Duration::from_secs(120))
    .build()
    .await?;
```
### Disable Retry Logic
For debugging or when you want full control:
```rust
let client = WmeClient::builder()
    .credentials("username", "password")
    .disable_retry()
    .build()
    .await?;
```
## Error Handling
Errors are reported through the `ClientError` enum, which has dedicated variants for common failures:
```rust
use wme_client::ClientError;

match result {
    Ok(data) => println!("Success: {:?}", data),
    Err(ClientError::Auth(msg)) => eprintln!("Authentication failed: {}", msg),
    Err(ClientError::RateLimited { retry_after }) => {
        eprintln!("Rate limited! Retry after: {:?} seconds", retry_after);
    }
    Err(ClientError::SnapshotNotFound { id }) => {
        eprintln!("Snapshot not found: {}", id);
    }
    Err(ClientError::ArticleNotFound { name }) => {
        eprintln!("Article not found: {}", name);
    }
    Err(e) => eprintln!("Error: {}", e),
}
```
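
When retries are disabled or exhausted, callers often need to decide for themselves whether an error is worth retrying. The sketch below illustrates one way to do that. `SketchError` is a local stand-in that mirrors a few of the variants matched above; it is not the crate's real type, and treating `retry_after` as an optional number of seconds is an assumption.

```rust
use std::time::Duration;

/// Local stand-in for some of the variants matched above; not the crate's type.
#[allow(dead_code)]
#[derive(Debug)]
enum SketchError {
    Auth(String),
    RateLimited { retry_after: Option<u64> },
    ArticleNotFound { name: String },
    Other(String),
}

/// Returns how long to wait before retrying, or `None` if a retry is pointless.
fn retry_wait(err: &SketchError, fallback: Duration) -> Option<Duration> {
    match err {
        // Honor the server hint when present.
        SketchError::RateLimited { retry_after: Some(secs) } => {
            Some(Duration::from_secs(*secs))
        }
        // No hint: fall back to the caller's default wait.
        SketchError::RateLimited { retry_after: None } => Some(fallback),
        // Bad credentials and missing articles will not fix themselves.
        SketchError::Auth(_) | SketchError::ArticleNotFound { .. } => None,
        // Treat anything else as possibly transient.
        SketchError::Other(_) => Some(fallback),
    }
}
```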
## Advanced Usage
### Request Parameters
Most endpoints support filtering and field selection:
```rust
use wme_models::RequestParams;

let params = RequestParams {
    filters: Some(vec![
        ("project".to_string(), "en.wikipedia".to_string()),
    ]),
    fields: Some(vec![
        "name".to_string(),
        "url".to_string(),
    ]),
    limit: Some(100),
    offset: Some(0),
};

let projects = client.metadata().list_projects_with_params(Some(&params)).await?;
```
### Resume Realtime Stream
Resume from where you left off using timestamps:
```rust
use std::collections::HashMap;
// Per-partition resume (recommended for production)
let mut since_per_partition = HashMap::new();
since_per_partition.insert("0".to_string(), last_seen_timestamp);
since_per_partition.insert("1".to_string(), last_seen_timestamp);
let options = RealtimeConnectOptions::since_per_partition(since_per_partition);
let stream = client.realtime().connect(&options, None).await?;
```
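
Per-partition resume only works if the application records the newest timestamp it has processed for each partition. A minimal sketch of that bookkeeping follows. `ResumeState` is a hypothetical helper, and timestamps are modelled as plain epoch milliseconds, which may differ from the type the crate expects.

```rust
use std::collections::HashMap;

/// Records the newest timestamp seen per partition.
/// Timestamps are plain epoch milliseconds here, which is an assumption.
#[derive(Default)]
struct ResumeState {
    latest: HashMap<String, i64>,
}

impl ResumeState {
    /// Keeps the maximum so out-of-order events never move the cursor back.
    fn observe(&mut self, partition: &str, timestamp_ms: i64) {
        let entry = self
            .latest
            .entry(partition.to_string())
            .or_insert(timestamp_ms);
        if timestamp_ms > *entry {
            *entry = timestamp_ms;
        }
    }

    /// Copy of the cursors, suitable for building a per-partition map on reconnect.
    fn since_per_partition(&self) -> HashMap<String, i64> {
        self.latest.clone()
    }
}
```

Persisting this map periodically (for example to disk) would let a restarted process resume close to where it stopped, at the cost of possibly reprocessing a few events.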
### Circuit Breaker
The retry transport includes a circuit breaker that opens after consecutive failures:
- **Closed**: Normal operation
- **Open**: Requests fail fast to prevent cascading failures
- **HalfOpen**: Testing whether the service has recovered
Configure thresholds:
```rust
use std::time::Duration;
use wme_client::RetryConfig;

let retry = RetryConfig::new()
    .with_circuit_threshold(10) // Open after 10 consecutive failures
    .with_circuit_timeout(Duration::from_secs(120)); // Try again after 2 minutes
```
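
The three states and the two thresholds can be read as a small state machine. The following is a generic sketch of the circuit breaker pattern, assuming a failed probe in the half-open state reopens the circuit. It is not the crate's internal code, and `CircuitBreaker`, `State`, and the method names are hypothetical.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed,
    Open { since: Instant },
    HalfOpen,
}

struct CircuitBreaker {
    state: State,
    failures: u32,
    threshold: u32,
    timeout: Duration,
}

impl CircuitBreaker {
    fn new(threshold: u32, timeout: Duration) -> Self {
        Self { state: State::Closed, failures: 0, threshold, timeout }
    }

    /// Returns true if a request may be sent at time `now`.
    fn allow(&mut self, now: Instant) -> bool {
        if let State::Open { since } = self.state {
            if now.saturating_duration_since(since) < self.timeout {
                return false; // still open: fail fast
            }
            self.state = State::HalfOpen; // timeout elapsed: allow a probe
        }
        true
    }

    /// A success resets the failure count and closes the circuit.
    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// A failed probe, or too many consecutive failures, opens the circuit.
    fn record_failure(&mut self, now: Instant) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.threshold {
            self.state = State::Open { since: now };
        }
    }
}
```

Passing `now` in explicitly keeps the sketch deterministic; a real implementation would typically read the clock itself and guard the state with a mutex or atomics.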
## License
This project is licensed under the MIT License.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.