wme-models 0.1.0

Type definitions for the Wikimedia Enterprise API

This crate provides Rust type definitions for all Wikimedia Enterprise API endpoints, covering the On-demand, Snapshot, and Realtime APIs. All types implement Serialize and Deserialize, so they can be read from and written to JSON or NDJSON with serde.

Overview

Wikimedia Enterprise provides three main API services:

  • On-demand API - Query individual articles by name across all projects
  • Snapshot API - Download complete project dumps as compressed tarballs
  • Realtime API - Stream article updates via SSE or download hourly batches

Features

  • Complete type coverage - All API response types including articles, versions, events, and metadata
  • Consistent schema - Same Article type works across all three APIs
  • Forward compatibility - Schema envelope for handling API evolution
  • Structured content - Parsed infoboxes, sections, and tables (BETA)
  • Zero-copy parsing - Optional borrowed feature for Cow<str> support

Quick Start

use wme_models::{Article, SnapshotIdentifier};

fn main() -> Result<(), serde_json::Error> {
    // Parse article from NDJSON (Snapshot/Realtime)
    let json = r#"{
        "name": "Squirrel",
        "identifier": 28492,
        "url": "https://en.wikipedia.org/wiki/Squirrel",
        "date_created": "2001-01-15T00:00:00Z",
        "date_modified": "2024-01-15T12:00:00Z",
        "in_language": {"identifier": "en", "name": "English"},
        "is_part_of": {"identifier": "enwiki"},
        "namespace": {"identifier": 0, "name": ""},
        "license": [{"name": "CC BY-SA 4.0", "url": "https://creativecommons.org/licenses/by-sa/4.0/"}],
        "version": {
            "identifier": 1182847293,
            "editor": {"identifier": 12345, "name": "SomeUser"}
        }
    }"#;
    let article: Article = serde_json::from_str(json)?;

    // Create snapshot identifier
    let id = SnapshotIdentifier::new("en", "wiki", 0);
    assert_eq!(id.to_string(), "enwiki_namespace_0");
    Ok(())
}
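The Quick Start parses a single JSON document, whereas Snapshot and Realtime payloads arrive as NDJSON, with one document per line. The following is a minimal, std-only sketch of splitting such a stream into records. The helper name `ndjson_records` is hypothetical and not part of this crate, and skipping blank lines is an assumption about how tolerant a reader should be.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not part of wme-models): collects the non-empty
/// lines of an NDJSON stream. Each returned string holds one JSON document.
fn ndjson_records<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut records = Vec::new();
    for line in reader.lines() {
        // Propagate I/O errors (including invalid UTF-8) to the caller.
        let line = line?;
        let trimmed = line.trim();
        // Assumption: blank lines carry no record and can be skipped.
        if !trimmed.is_empty() {
            records.push(trimmed.to_owned());
        }
    }
    Ok(records)
}
```

Each record could then be handed to serde_json::from_str::<Article>, as in the Quick Start. For multi-gigabyte snapshots it is probably better to parse each line as it is read rather than collect them all first.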

Core Types

Article Types

Version & Edit Information

  • Version - Revision metadata with credibility signals
  • Editor - Editor information including groups and rights
  • Scores - Quality scores (revert risk, reference risk, reference need)

Event Types (Realtime API)

Request Parameters

  • RequestParams - Build API requests with fields, filters, and limits
  • Filter - Field-based filtering using dot notation

Metadata Types

API-Specific Notes

On-demand API

Returns a single article or an array of articles. Use RequestParams to select fields and filter by language or project:

use wme_models::RequestParams;

let params = RequestParams::new()
    .field("name")
    .field("url")
    .filter("in_language.identifier", "en")
    .limit(3);

Snapshot API

Returns NDJSON packaged in tar.gz files. A snapshot may contain duplicate articles (< 1%); when it does, keep the copy with the highest version.identifier:

use wme_models::Article;

/// Returns true when `incoming` is a newer revision than `existing` and should replace it.
fn keep_latest(existing: &Article, incoming: &Article) -> bool {
    incoming.version.identifier > existing.version.identifier
}
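The comparison above decides between two copies; applied over a whole snapshot it might look like the sketch below. `ArticleStub` is a hypothetical stand-in for `Article` that carries only the fields the rule needs, so the example compiles without the crate. Keying duplicates on the article identifier is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `Article`: only the fields the rule needs.
/// `version_identifier` plays the role of `version.identifier`.
#[derive(Debug, Clone, PartialEq)]
struct ArticleStub {
    identifier: u64,
    version_identifier: u64,
    name: String,
}

/// Keeps, for each article identifier, the record with the highest
/// version identifier. Assumes duplicates share the same `identifier`.
fn dedup_latest(
    articles: impl IntoIterator<Item = ArticleStub>,
) -> HashMap<u64, ArticleStub> {
    let mut latest: HashMap<u64, ArticleStub> = HashMap::new();
    for incoming in articles {
        // Same test as keep_latest: replace only on a strictly newer version.
        let is_newer = latest
            .get(&incoming.identifier)
            .map_or(true, |existing| {
                incoming.version_identifier > existing.version_identifier
            });
        if is_newer {
            latest.insert(incoming.identifier, incoming);
        }
    }
    latest
}
```

Because the map holds one entry per article, memory grows with the number of distinct articles, not with the number of lines read.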

Realtime API

The streaming endpoint returns SSE or NDJSON. Each event carries a partition and an offset, so a consumer that disconnects can resume from the last position it processed:

use wme_models::Article;

fn example(article: Article) {
    if let Some(event) = &article.event {
        println!(
            "Event {} at partition {}, offset {}",
            event.identifier,
            event.partition.unwrap_or(0),
            event.offset.unwrap_or(0)
        );
    }
}
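To resume, a consumer needs to remember the last offset it handled on each partition. The sketch below shows one way to track that with std only. `ResumePositions` is a hypothetical helper, not a type from this crate, and the integer widths (u32 partition, u64 offset) are assumptions. Unlike the unwrap_or(0) used for printing above, it ignores events that lack either value, so a missing position is not mistaken for partition 0.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of wme-models): tracks the highest
/// offset seen on each partition.
#[derive(Debug, Default)]
struct ResumePositions {
    offsets: HashMap<u32, u64>,
}

impl ResumePositions {
    /// Records an event position. Events missing a partition or an
    /// offset are ignored rather than defaulted to 0.
    fn record(&mut self, partition: Option<u32>, offset: Option<u64>) {
        if let (Some(p), Some(o)) = (partition, offset) {
            let entry = self.offsets.entry(p).or_insert(o);
            // Never move backwards if events arrive out of order.
            if o > *entry {
                *entry = o;
            }
        }
    }

    /// Returns the last recorded offset for a partition, if any.
    fn last_offset(&self, partition: u32) -> Option<u64> {
        self.offsets.get(&partition).copied()
    }
}
```

A consumer would call record(event.partition, event.offset) after handling each event and persist the map periodically. How the stored positions are passed back to the API on reconnect is not covered here.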

Data Dictionary Compliance

All types follow the Wikimedia Enterprise Data Dictionary:

  • Required fields are non-optional in structs
  • Optional fields use Option<T>
  • Fields marked omitempty use Option<T> with skip_serializing_if where appropriate, so absent values are left out of serialized output
  • Credibility signals (scores, editor info, maintenance tags) are marked as such in the type documentation

Feature Flags

  • borrowed - Enable borrowed deserialization with Cow<str> for zero-copy parsing (reduces allocations when processing large NDJSON files)
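
The allocation saving behind Cow<str> can be illustrated without the crate. The sketch below is a deliberately simplified unescaper, not the crate's implementation: it handles only the `\"` and `\\` escapes, and the function name `unescape` is hypothetical. It borrows the input when nothing needs rewriting and allocates only when an escape is present.

```rust
use std::borrow::Cow;

/// Simplified illustration (not the crate's code): resolves `\"` and `\\`
/// in a raw JSON string body. Other JSON escapes are NOT handled.
fn unescape(raw: &str) -> Cow<'_, str> {
    // Fast path: no escapes, so the input slice can be reused as-is.
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }
    // Slow path: build an owned copy with the backslashes removed.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // Keep the character that follows the backslash.
            if let Some(next) = chars.next() {
                out.push(next);
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}
```

Most article names and identifiers contain no escapes, which is presumably where borrowed deserialization saves the most when processing large NDJSON files.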

License

This project is licensed under the terms of the workspace license.