wme-client
A robust, production-ready HTTP client for the Wikimedia Enterprise API.
Features
- Complete API Coverage: Access all Wikimedia Enterprise endpoints including metadata, on-demand, snapshots, and realtime streaming
- Authentication Management: Automatic token refresh and secure credential handling
- Resilient by Design: Built-in retry logic with exponential backoff, jitter, and circuit breaker patterns
- Streaming Support: Efficiently handle large snapshot downloads and realtime SSE streams
- Type-Safe: Full Rust type definitions for all API responses
- Async/Await: Built on tokio for high-performance asynchronous operations
Installation
Add this to your Cargo.toml:
[dependencies]
wme-client = "0.1.3"
tokio = { version = "1", features = ["full"] }
Quick Start
use wme_client::WmeClient;
async
Authentication
The client supports username/password authentication with automatic token management:
let client = WmeClient::builder()
    .credentials("username", "password")
    .build()
    .await?;
Tokens are automatically refreshed before expiration. You can also manually revoke tokens:
// Get the token manager
if let Some = client.token_manager
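Refreshing a token "before expiration" usually comes down to comparing the remaining lifetime against a safety margin. The following is a minimal, standard-library-only sketch of that check. The `Token` struct, the `needs_refresh` function, and the margin value are illustrative assumptions and not part of the wme-client API.

```rust
use std::time::{Duration, Instant};

/// Hypothetical cached token; not a wme-client type.
pub struct Token {
    pub value: String,
    pub expires_at: Instant,
}

/// Returns true when the token expires within `margin` of `now`
/// (or has already expired), so it should be refreshed before the next request.
pub fn needs_refresh(token: &Token, now: Instant, margin: Duration) -> bool {
    match token.expires_at.checked_duration_since(now) {
        Some(remaining) => remaining <= margin,
        None => true, // `now` is already past the expiry time
    }
}
```

Passing `now` in explicitly keeps the check deterministic and easy to test. A real client would typically call it with `Instant::now()` before each request.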
API Clients
Metadata Client
Discover available projects, languages, and namespaces:
let metadata = client.metadata;
// List all projects
let projects = metadata.list_projects.await?;
// Get specific project info
let wikipedia = metadata.get_project.await?;
// List languages
let languages = metadata.list_languages.await?;
// List namespaces for a project
let namespaces = metadata.list_namespaces.await?;
On-Demand Client
Fetch individual articles:
let on_demand = client.on_demand;
// Get a single article
let articles = on_demand.get_article.await?;
// Get multiple articles efficiently
let articles = on_demand.get_articles.await?;
// Get structured article data (BETA)
let structured = on_demand.get_structured_article.await?;
Snapshot Client
Download bulk data snapshots:
use StreamExt;
let snapshot = client.snapshot;
// List available snapshots
let snapshots = snapshot.list_snapshots.await?;
// Get snapshot metadata
let info = snapshot.get_snapshot_info.await?;
// Download a snapshot as a stream
let mut stream = snapshot.download_snapshot.await?;
let mut data = Vec::new();
while let Some = stream.next.await
// Download specific chunks
let chunks = snapshot.list_chunks.await?;
let mut stream = snapshot.download_chunk.await?;
Realtime Client
Stream article updates in real time:
use RealtimeConnectOptions;
use chrono::Utc;
use StreamExt;
let realtime = client.realtime;
// Connect to live stream
let options = since;
let mut stream = realtime.connect.await?;
while let Some = stream.next.await
Realtime Batches
For historical realtime data, use batches:
// List available batches
let batches = realtime.list_batches.await?;
// Stream a batch as parsed articles
let mut stream = realtime.stream_batch.await?;
while let Some = stream.next.await
Configuration
Retry Configuration
Customize retry behavior for your use case:
use wme_client::RetryConfig;
use std::time::Duration;
// Production-grade configuration
let retry = RetryConfig::production();
// Development configuration (faster retries)
let retry = RetryConfig::development();
// Batch processing (more retries, longer delays)
let retry = RetryConfig::batch_processing();
// Custom configuration
let retry = new
.with_max_retries
.with_base_delay
.with_max_delay
.with_jitter;
let client = WmeClient::builder()
    .credentials("username", "password")
    .retry(retry)
    .build()
    .await?;
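To make the base delay, max delay, and jitter settings concrete, here is a small sketch of how a capped exponential backoff with "full jitter" can be computed. It uses only the standard library. The function names are hypothetical, the random factor `r` is supplied by the caller, and nothing here is taken from wme-client's internals, which may use a different formula.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn capped_exponential(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing when `attempt` is large.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Full jitter: scale the delay by a factor `r` in [0, 1].
/// `r` is expected to come from a random number generator chosen by the caller.
pub fn with_full_jitter(delay: Duration, r: f64) -> Duration {
    // Treat NaN as 0 and clamp everything else into [0, 1].
    let r = if r.is_nan() { 0.0 } else { r.clamp(0.0, 1.0) };
    delay.mul_f64(r)
}
```

With a 100 ms base and a 5 s cap, attempts 0 through 3 give 100 ms, 200 ms, 400 ms, and 800 ms before jitter, and attempt 6 onward is held at 5 s. Jitter spreads retries from many clients so they do not hit the API at the same instant.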
Custom Base URLs
For testing or private deployments:
let client = builder
.api_url
.auth_url
.realtime_url
.credentials
.build
.await?;
Timeout Configuration
let client = builder
.credentials
.timeout
.build
.await?;
Disable Retry Logic
For debugging or when you want full control:
let client = WmeClient::builder()
    .credentials("username", "password")
    .disable_retry()
    .build()
    .await?;
Error Handling
The client uses a comprehensive error type:
use ClientError;
match result
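A common reason to match on the error type is deciding whether a failed call is worth retrying. The sketch below shows that pattern with a stand-in enum. `DemoError` and its variants are hypothetical and chosen only for illustration; the real `ClientError` variants may be named and shaped differently, so check the crate documentation before matching on them.

```rust
use std::fmt;

/// Hypothetical error enum for illustration; not the real `ClientError`.
#[derive(Debug)]
pub enum DemoError {
    Auth(String),
    RateLimited { retry_after_secs: u64 },
    Http { status: u16 },
    Timeout,
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::Auth(msg) => write!(f, "authentication failed: {}", msg),
            DemoError::RateLimited { retry_after_secs } => {
                write!(f, "rate limited, retry after {}s", retry_after_secs)
            }
            DemoError::Http { status } => write!(f, "HTTP status {}", status),
            DemoError::Timeout => write!(f, "request timed out"),
        }
    }
}

impl std::error::Error for DemoError {}

/// Transient failures (rate limits, timeouts, 5xx) are retryable;
/// authentication failures and other HTTP statuses are not.
pub fn is_retryable(err: &DemoError) -> bool {
    match err {
        DemoError::RateLimited { .. } | DemoError::Timeout => true,
        DemoError::Http { status } => *status >= 500,
        DemoError::Auth(_) => false,
    }
}
```

Keeping the classification in one function means the retry policy lives in a single place rather than being repeated at every call site.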
Advanced Usage
Request Parameters
Most endpoints support filtering and field selection:
use RequestParams;
let params = RequestParams ;
let projects = client.metadata.list_projects_with_params.await?;
Resume Realtime Stream
Resume from where you left off using timestamps:
use std::collections::HashMap;
// Per-partition resume (recommended for production)
let mut since_per_partition = HashMap::new();
since_per_partition.insert;
since_per_partition.insert;
let options = since_per_partition;
let stream = client.realtime.connect.await?;
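Per-partition resume depends on the consumer remembering the newest timestamp it has processed for each partition. One way to keep that bookkeeping is sketched below. `ResumeTracker` is a hypothetical helper, and the integer partition IDs and millisecond timestamps are assumptions made for this example, not types defined by wme-client.

```rust
use std::collections::HashMap;

/// Tracks the newest event timestamp seen per partition (hypothetical helper).
#[derive(Default)]
pub struct ResumeTracker {
    latest: HashMap<u32, i64>,
}

impl ResumeTracker {
    /// Records an event, keeping only the largest timestamp for its partition
    /// so that out-of-order events never move the checkpoint backwards.
    pub fn record(&mut self, partition: u32, timestamp_ms: i64) {
        let entry = self.latest.entry(partition).or_insert(timestamp_ms);
        if timestamp_ms > *entry {
            *entry = timestamp_ms;
        }
    }

    /// Returns a copy of the current per-partition positions, suitable for
    /// persisting and later converting into resume options.
    pub fn checkpoint(&self) -> HashMap<u32, i64> {
        self.latest.clone()
    }
}
```

Persisting the checkpoint periodically, rather than after every event, trades a small amount of reprocessing after a restart for far fewer writes.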
Circuit Breaker
The retry transport includes a circuit breaker that opens after consecutive failures:
- Closed: Normal operation
- Open: Requests fail fast to prevent cascading failures
- HalfOpen: Testing if service has recovered
Configure thresholds:
let retry = RetryConfig::new()
    .with_circuit_threshold(10) // Open after 10 consecutive failures
    .with_circuit_timeout(Duration::from_secs(120)); // Try again after 2 minutes
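The three states and the two thresholds can be pictured as a small state machine. The sketch below is a simplified, standard-library-only model of that behavior. The `CircuitBreaker` type and its methods are hypothetical and do not reflect how wme-client implements its breaker internally. In particular, this version lets every request through while half-open, where a real implementation might allow only a single probe.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,
    Open,
    HalfOpen,
}

/// Simplified circuit breaker model (hypothetical, for illustration only).
pub struct CircuitBreaker {
    threshold: u32,
    timeout: Duration,
    failures: u32,
    state: CircuitState,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(threshold: u32, timeout: Duration) -> Self {
        Self { threshold, timeout, failures: 0, state: CircuitState::Closed, opened_at: None }
    }

    pub fn state(&self) -> CircuitState {
        self.state
    }

    /// Returns whether a request may be attempted at `now`.
    /// An open circuit moves to half-open once `timeout` has elapsed.
    pub fn allow(&mut self, now: Instant) -> bool {
        if self.state == CircuitState::Open {
            let waited = self.opened_at.map(|t| now.saturating_duration_since(t));
            if waited.map_or(false, |w| w >= self.timeout) {
                self.state = CircuitState::HalfOpen;
            }
        }
        self.state != CircuitState::Open
    }

    /// A success closes the circuit and resets the failure count.
    pub fn on_success(&mut self) {
        self.failures = 0;
        self.state = CircuitState::Closed;
        self.opened_at = None;
    }

    /// A failure while half-open, or reaching the threshold, opens the circuit.
    pub fn on_failure(&mut self, now: Instant) {
        self.failures = self.failures.saturating_add(1);
        if self.state == CircuitState::HalfOpen || self.failures >= self.threshold {
            self.state = CircuitState::Open;
            self.opened_at = Some(now);
        }
    }
}
```

A caller would check `allow` before each request and report the outcome with `on_success` or `on_failure`. Failing fast while the circuit is open avoids piling more load onto a service that is already struggling.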
License
This project is licensed under the MIT License.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.