Crate wme_models

§wme-models

Type definitions for the Wikimedia Enterprise API.

This crate provides complete Rust type definitions for all Wikimedia Enterprise API endpoints including On-demand, Snapshot, and Realtime APIs. All types implement Serialize and Deserialize for seamless JSON/NDJSON handling.

§Overview

Wikimedia Enterprise provides three main API services:

  • On-demand API - Query individual articles by name across all projects
  • Snapshot API - Download complete project dumps as compressed tarballs
  • Realtime API - Stream article updates via SSE or download hourly batches

§Features

  • Complete type coverage - All API response types including articles, versions, events, and metadata
  • Consistent schema - Same Article type works across all three APIs
  • Forward compatibility - Schema envelope for handling API evolution
  • Structured content - Parsed infoboxes, sections, and tables (BETA)
  • Zero-copy parsing - Optional borrowed feature for Cow<str> support

§Quick Start

use wme_models::{Article, SnapshotIdentifier};

fn main() -> Result<(), serde_json::Error> {
    // Parse article from NDJSON (Snapshot/Realtime)
    let json = r#"{
        "name": "Squirrel",
        "identifier": 28492,
        "url": "https://en.wikipedia.org/wiki/Squirrel",
        "date_created": "2001-01-15T00:00:00Z",
        "date_modified": "2024-01-15T12:00:00Z",
        "in_language": {"identifier": "en", "name": "English"},
        "is_part_of": {"identifier": "enwiki"},
        "namespace": {"identifier": 0, "name": ""},
        "license": [{"name": "CC BY-SA 4.0", "url": "https://creativecommons.org/licenses/by-sa/4.0/"}],
        "version": {
            "identifier": 1182847293,
            "editor": {"identifier": 12345, "name": "SomeUser"}
        }
    }"#;
    let article: Article = serde_json::from_str(json)?;

    // Create snapshot identifier
    let id = SnapshotIdentifier::new("en", "wiki", 0);
    assert_eq!(id.to_string(), "enwiki_namespace_0");
    Ok(())
}
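
The identifier string in the example above follows the layout {language}{project}_namespace_{number}. As a rough, self-contained sketch of that layout (not the crate's implementation), the stand-in type below formats and parses such strings. `SnapshotId`, its field names, and `parse_snapshot_id` are hypothetical names introduced only for illustration.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's `SnapshotIdentifier`.
/// The field names and types are assumptions made for this sketch.
struct SnapshotId {
    language: String,
    project: String,
    namespace: u32,
}

impl fmt::Display for SnapshotId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Layout: {language}{project}_namespace_{number}
        write!(f, "{}{}_namespace_{}", self.language, self.project, self.namespace)
    }
}

/// Splits an identifier such as "enwiki_namespace_0" into its project code
/// ("enwiki") and namespace number. The language/project boundary cannot be
/// recovered from the string alone, so both are returned as one code.
fn parse_snapshot_id(s: &str) -> Option<(&str, u32)> {
    let (code, number) = s.split_once("_namespace_")?;
    if code.is_empty() {
        return None;
    }
    Some((code, number.parse().ok()?))
}
```

Parsing returns `None` for strings that lack the `_namespace_` separator or whose namespace part is not a number.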

§Core Types

§Article Types

§Version & Edit Information

  • Version - Revision metadata with credibility signals
  • Editor - Editor information including groups and rights
  • Scores - Quality scores (revert risk, reference risk, reference need)

§Event Types (Realtime API)

§Request Parameters

  • RequestParams - Build API requests with fields, filters, and limits
  • Filter - Field-based filtering using dot notation

§Metadata Types

§API-Specific Notes

§On-demand API

Returns a single article or an array of articles. Use RequestParams to select fields and filter by language or project:

use wme_models::{RequestParams, FilterValue};

let params = RequestParams::new()
    .field("name")
    .field("url")
    .filter("in_language.identifier", "en")
    .limit(3);
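
The filter key "in_language.identifier" uses dot notation to address a nested field. Filtering itself is performed by the API, but the addressing scheme can be illustrated locally. The sketch below resolves a dotted path against a small hand-rolled value tree; the `Value` enum and `lookup` function are hypothetical and are not part of wme_models.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal JSON-like value used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Number(i64),
    Object(BTreeMap<String, Value>),
}

/// Resolves a dot-notation path such as "in_language.identifier".
/// Returns `None` if any segment is missing or a non-object is traversed.
fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    path.split('.').try_fold(root, |current, key| match current {
        Value::Object(fields) => fields.get(key),
        _ => None,
    })
}
```

Each path segment descends one level, so "in_language.identifier" selects the `identifier` field inside the `in_language` object rather than the article's own top-level `identifier`.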

§Snapshot API

Returns NDJSON packaged in tar.gz archives. A snapshot may contain duplicate articles (fewer than 1%); when it does, keep the copy with the highest version.identifier:

use wme_models::Article;

fn keep_latest(existing: &Article, incoming: &Article) -> bool {
    incoming.version.identifier > existing.version.identifier
}
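
The comparison above can be extended to a whole snapshot by keeping one record per article in a map. This is a minimal sketch using a stand-in struct rather than the real `Article` type. `ArticleStub`, its flattened `version_identifier` field, and the choice of the article `identifier` as the deduplication key are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in holding only the fields needed for deduplication.
#[derive(Debug, Clone, PartialEq)]
struct ArticleStub {
    identifier: u64,         // article identifier (assumed deduplication key)
    version_identifier: u64, // mirrors `version.identifier`
    name: String,
}

/// Keeps, for each article identifier, the record with the highest
/// version identifier. On a tie the first record seen is kept.
fn dedup_latest(articles: impl IntoIterator<Item = ArticleStub>) -> HashMap<u64, ArticleStub> {
    let mut latest: HashMap<u64, ArticleStub> = HashMap::new();
    for incoming in articles {
        // Replace only when nothing is stored yet or the incoming version is newer.
        let replace = latest
            .get(&incoming.identifier)
            .map_or(true, |existing| incoming.version_identifier > existing.version_identifier);
        if replace {
            latest.insert(incoming.identifier, incoming);
        }
    }
    latest
}
```

The map holds one entry per distinct article, so memory use grows with the number of articles in the snapshot rather than the number of lines read.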

§Realtime API

The streaming endpoint returns SSE or NDJSON. Each event carries a partition and an offset so that a consumer can resume where it left off:

use wme_models::Article;

fn example(article: Article) {
    if let Some(event) = &article.event {
        println!("Event {} at partition {}, offset {}", 
            event.identifier,
            event.partition.unwrap_or(0),
            event.offset.unwrap_or(0)
        );
    }
}
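
One way a consumer might use partition and offset for resumption is to remember the highest offset seen per partition. The sketch below shows that bookkeeping only. `ResumeState` is a hypothetical helper, the `u32`/`u64` types are assumptions about the event fields, and how checkpoints are sent back to the API is outside its scope.

```rust
use std::collections::HashMap;

/// Hypothetical helper that tracks the highest offset seen per partition.
#[derive(Debug, Default)]
struct ResumeState {
    offsets: HashMap<u32, u64>,
}

impl ResumeState {
    /// Records an event position. Events that lack a partition or an
    /// offset are ignored, since they cannot be used as a checkpoint.
    fn record(&mut self, partition: Option<u32>, offset: Option<u64>) {
        if let (Some(p), Some(o)) = (partition, offset) {
            let entry = self.offsets.entry(p).or_insert(o);
            if o > *entry {
                *entry = o;
            }
        }
    }

    /// Returns (partition, highest offset) pairs sorted by partition.
    fn checkpoints(&self) -> Vec<(u32, u64)> {
        let mut pairs: Vec<(u32, u64)> = self.offsets.iter().map(|(&p, &o)| (p, o)).collect();
        pairs.sort_unstable();
        pairs
    }
}
```

Unlike the `unwrap_or(0)` calls in the printing example, this helper skips events with missing positions so that a default of 0 is never mistaken for a real offset.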

§Data Dictionary Compliance

All types follow the Wikimedia Enterprise Data Dictionary:

  • Required fields are non-optional in structs
  • Optional fields use Option<T>
  • Omitempty fields use Option<T> with skip_serializing_if where appropriate
  • Credibility signals marked in documentation (scores, editor info, maintenance tags)

§Feature Flags

  • borrowed - Enable borrowed deserialization with Cow<str> for zero-copy parsing (reduces allocations when processing large NDJSON files)
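
To illustrate why Cow<str> reduces allocations, the sketch below borrows the input when no rewriting is needed and allocates only when an escape sequence must be decoded. It is a simplified standalone example, not the crate's deserialization code. `unescape` is a hypothetical function that handles only a few escapes and is not a compliant JSON unescaper (for example, it does not decode \u sequences).

```rust
use std::borrow::Cow;

/// Simplified unescaper: decodes `\n` and `\t`, and maps any other escaped
/// character to itself (so `\"` becomes `"` and `\\` becomes `\`).
/// Input without a backslash is returned borrowed, with no allocation.
fn unescape(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some(other) => out.push(other),
                // A trailing lone backslash is kept as-is.
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}
```

Most field values in a large NDJSON file contain no escapes, so the borrowed path is the common case and the owned path is taken only when needed.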

§License

This project is licensed under the terms of the workspace license.

Re-exports§

pub use article::Article;
pub use article::ProjectRef;
pub use article::StructuredArticle;
pub use article::Visibility;
pub use content::ArticleBody;
pub use content::Image;
pub use content::License;
pub use envelope::ArticleEnvelope;
pub use error::ModelError;
pub use metadata::BatchInfo;
pub use metadata::ChunkInfo;
pub use metadata::EventMetadata;
pub use metadata::EventType;
pub use metadata::Language;
pub use metadata::Namespace;
pub use metadata::Project;
pub use metadata::ProjectInfo;
pub use metadata::ProjectType;
pub use metadata::RealtimeBatchInfo;
pub use metadata::RealtimeProject;
pub use metadata::SimplifiedLanguage;
pub use metadata::SimplifiedNamespace;
pub use metadata::Size;
pub use metadata::SnapshotInfo;
pub use reference::Citation;
pub use reference::Reference;
pub use request::Filter;
pub use request::FilterValue;
pub use request::RequestParams;
pub use structured::Infobox;
pub use structured::Section;
pub use structured::Table;
pub use structured::TableReference;
pub use version::ArticleSize;
pub use version::Editor;
pub use version::MaintenanceTags;
pub use version::PreviousVersion;
pub use version::Protection;
pub use version::ReferenceNeed;
pub use version::ReferenceRisk;
pub use version::RevertRisk;
pub use version::Scores;
pub use version::Version;

Modules§

article
Article types for the Wikimedia Enterprise API, including Article and StructuredArticle.
content
Content types for the Wikimedia Enterprise API, including ArticleBody, Image, and License.
envelope
Schema envelope for version compatibility.
error
Error types for model operations.
metadata
Metadata types for the Wikimedia Enterprise API, including EventMetadata, EventType, and identifiers.
reference
Reference and citation types.
request
Request parameter types for API queries.
structured
Structured content types (BETA), including Infobox, Section, and Table.
version
Version and editor information types.

Structs§

Category
Category reference.
Link
Link within content.
RealtimeWikidataEntity
Wikidata entity reference with aspects (used in Realtime API).
Redirect
Redirect reference.
SnapshotIdentifier
Snapshot identifier: {language}{project}_namespace_{number}
Template
Template reference.
WikidataEntity
Wikidata entity reference.
WikidataEntityUsage
Wikidata entity usage information.