wme-models 0.1.3

Type definitions for the Wikimedia Enterprise API
# wme-models

Type definitions for the [Wikimedia Enterprise API](https://enterprise.wikimedia.com/docs/).

This crate provides Rust type definitions for all Wikimedia Enterprise API endpoints, covering the On-demand, Snapshot, and Realtime APIs. All types implement `Serialize` and `Deserialize`, so they can be read from and written to JSON and NDJSON with `serde_json`.

## Overview

Wikimedia Enterprise provides three main API services:

- **On-demand API** - Query individual articles by name across all projects
- **Snapshot API** - Download complete project dumps as compressed tarballs
- **Realtime API** - Stream article updates via SSE or download hourly batches

## Features

- **Complete type coverage** - All API response types including articles, versions, events, and metadata
- **Consistent schema** - Same `Article` type works across all three APIs
- **Forward compatibility** - Schema envelope for handling API evolution
- **Structured content** - Parsed infoboxes, sections, and tables (BETA)
- **Zero-copy parsing** - Optional `borrowed` feature for `Cow<str>` support

## Quick Start

```rust
use wme_models::{Article, SnapshotIdentifier};

fn main() -> Result<(), serde_json::Error> {
    // Parse article from NDJSON (Snapshot/Realtime)
    let json = r#"{
        "name": "Squirrel",
        "identifier": 28492,
        "url": "https://en.wikipedia.org/wiki/Squirrel",
        "date_created": "2001-01-15T00:00:00Z",
        "date_modified": "2024-01-15T12:00:00Z",
        "in_language": {"identifier": "en", "name": "English"},
        "is_part_of": {"identifier": "enwiki"},
        "namespace": {"identifier": 0, "name": ""},
        "license": [{"name": "CC BY-SA 4.0", "url": "https://creativecommons.org/licenses/by-sa/4.0/"}],
        "version": {
            "identifier": 1182847293,
            "editor": {"identifier": 12345, "name": "SomeUser"}
        }
    }"#;
    let article: Article = serde_json::from_str(json)?;

    // Create snapshot identifier
    let id = SnapshotIdentifier::new("en", "wiki", 0);
    assert_eq!(id.to_string(), "enwiki_namespace_0");
    Ok(())
}
```
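The assertion in the Quick Start suggests that a snapshot identifier renders as the language code, the project name, and the namespace number joined by `_namespace_`. The following is a minimal sketch of that format using only the standard library. `SnapshotId` is a hypothetical stand-in, and its field names and integer type are assumptions; it is not the crate's `SnapshotIdentifier` implementation.

```rust
use std::fmt;

/// Hypothetical stand-in for `wme_models::SnapshotIdentifier`.
/// Field names and types are assumptions made for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SnapshotId {
    language: String,
    project: String,
    namespace: u32,
}

impl SnapshotId {
    fn new(language: &str, project: &str, namespace: u32) -> Self {
        Self {
            language: language.to_string(),
            project: project.to_string(),
            namespace,
        }
    }
}

impl fmt::Display for SnapshotId {
    // Renders "<language><project>_namespace_<n>", the shape asserted in the Quick Start.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}{}_namespace_{}", self.language, self.project, self.namespace)
    }
}
```

Implementing `Display` rather than a bespoke method means `to_string()` and `format!` both work, which is consistent with how the Quick Start calls `id.to_string()`.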

## Core Types

### Article Types

- [`Article`](crate::Article) - Complete article from any API
- [`StructuredArticle`](crate::StructuredArticle) - Article with parsed content (BETA)

### Version & Edit Information

- [`Version`](crate::Version) - Revision metadata with credibility signals
- [`Editor`](crate::Editor) - Editor information including groups and rights
- [`Scores`](crate::Scores) - Quality scores (revert risk, reference risk, reference need)

### Event Types (Realtime API)

- [`EventMetadata`](crate::EventMetadata) - Event tracking with partition/offset
- [`EventType`](crate::EventType) - `update`, `delete`, or `visibility-change`
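
The three event type strings listed above map naturally onto an enum. The sketch below shows one way to parse them with the standard library. `EventKind` is a hypothetical stand-in for `EventType`, and the variant names are assumptions; the crate itself deserializes through serde rather than `FromStr`.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for `wme_models::EventType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EventKind {
    Update,
    Delete,
    VisibilityChange,
}

impl FromStr for EventKind {
    type Err = String;

    // Maps the wire strings from the list above to variants;
    // anything else is reported as an error rather than silently dropped.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "update" => Ok(EventKind::Update),
            "delete" => Ok(EventKind::Delete),
            "visibility-change" => Ok(EventKind::VisibilityChange),
            other => Err(format!("unknown event type: {other}")),
        }
    }
}
```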

### Request Parameters

- [`RequestParams`](crate::RequestParams) - Build API requests with fields, filters, and limits
- [`Filter`](crate::Filter) - Field-based filtering using dot notation

### Metadata Types

- [`Language`](crate::Language) - Language code, name, and direction
- [`Namespace`](crate::Namespace) - Namespace ID with description
- [`ProjectInfo`](crate::ProjectInfo) - Full project metadata
- [`SnapshotInfo`](crate::SnapshotInfo) - Snapshot metadata with chunks
- [`ChunkInfo`](crate::ChunkInfo) - Download chunk information

## API-Specific Notes

### On-demand API

Returns a single article or an array of articles. Use `RequestParams` to select fields, filter by language or project, and limit the number of results:

```rust
use wme_models::RequestParams;

let params = RequestParams::new()
    .field("name")
    .field("url")
    .filter("in_language.identifier", "en")
    .limit(3);
```

### Snapshot API

Returns NDJSON in tar.gz files. A snapshot may contain duplicate articles (< 1%); when it does, keep the copy with the highest `version.identifier`:

```rust
use wme_models::Article;

fn keep_latest(existing: &Article, incoming: &Article) -> bool {
    incoming.version.identifier > existing.version.identifier
}
```
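
Applied across a whole snapshot, the `keep_latest` rule above amounts to a map keyed by article that is overwritten only when a newer revision arrives. The sketch below is standard-library only. `MiniArticle` and `MiniVersion` are hypothetical, trimmed-down stand-ins for the crate's types, and treating the article `identifier` as the duplicate key (within one project's snapshot) is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `Version`; only the revision id is kept.
#[derive(Debug, Clone, PartialEq)]
struct MiniVersion {
    identifier: u64,
}

/// Hypothetical stand-in for `Article` with just the fields used here.
#[derive(Debug, Clone, PartialEq)]
struct MiniArticle {
    identifier: u64,
    name: String,
    version: MiniVersion,
}

/// Keeps one article per `identifier`, preferring the highest
/// `version.identifier` when duplicates appear.
fn dedup_latest(articles: impl IntoIterator<Item = MiniArticle>) -> HashMap<u64, MiniArticle> {
    let mut latest: HashMap<u64, MiniArticle> = HashMap::new();
    for incoming in articles {
        // True on first sighting, or when the incoming revision is strictly newer.
        let is_newer = latest
            .get(&incoming.identifier)
            .map_or(true, |existing| {
                incoming.version.identifier > existing.version.identifier
            });
        if is_newer {
            latest.insert(incoming.identifier, incoming);
        }
    }
    latest
}
```

Holding every article in memory may not be practical for large projects; in that case the same comparison can be applied to a map that stores only the highest `version.identifier` seen per article.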

### Realtime API

The streaming endpoint returns SSE or NDJSON. Each event carries a partition and offset, so a consumer can record its position and resume after a disconnect:

```rust
use wme_models::Article;

fn example(article: Article) {
    if let Some(event) = &article.event {
        println!(
            "Event {} at partition {}, offset {}",
            event.identifier,
            event.partition.unwrap_or(0),
            event.offset.unwrap_or(0)
        );
    }
}
```
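
To make use of partition and offset for resumability, a consumer needs to remember the highest offset it has processed in each partition. The following is a minimal sketch of that bookkeeping. `MiniEvent` and `Checkpoint` are hypothetical names, and the integer widths chosen for partition and offset are assumptions. How the stored positions are passed back to the API on reconnect is outside the scope of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the partition/offset part of `EventMetadata`.
#[derive(Debug, Clone, PartialEq)]
struct MiniEvent {
    identifier: String,
    partition: Option<u32>,
    offset: Option<u64>,
}

/// Tracks the highest offset seen in each partition.
#[derive(Debug, Default)]
struct Checkpoint {
    offsets: HashMap<u32, u64>,
}

impl Checkpoint {
    /// Records an event's position. Events missing a partition or an
    /// offset are ignored instead of being attributed to partition 0.
    fn record(&mut self, event: &MiniEvent) {
        if let (Some(partition), Some(offset)) = (event.partition, event.offset) {
            let entry = self.offsets.entry(partition).or_insert(offset);
            if offset > *entry {
                *entry = offset;
            }
        }
    }

    /// Highest offset recorded for `partition`, if any.
    fn last_offset(&self, partition: u32) -> Option<u64> {
        self.offsets.get(&partition).copied()
    }
}
```

Unlike `unwrap_or(0)`, which is convenient for logging, this sketch keeps "no position reported" distinct from "partition 0, offset 0", which matters once the values are used to resume a stream.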

## Data Dictionary Compliance

All types follow the [Wikimedia Enterprise Data Dictionary](https://enterprise.wikimedia.com/docs/data-dictionary):

- **Required** fields are non-optional in structs
- **Optional** fields use `Option<T>`
- **Omitempty** fields use `Option<T>` with `skip_serializing_if` where appropriate
- **Credibility signals** marked in documentation (scores, editor info, maintenance tags)

## Feature Flags

- `borrowed` - Enable borrowed deserialization with `Cow<str>` for zero-copy parsing (reduces allocations when processing large NDJSON files)
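
The reason `Cow<str>` can reduce allocations is that a string which needs no transformation can be borrowed straight from the input buffer, while one that does need rewriting falls back to an owned `String`. The sketch below illustrates that behaviour with the standard library alone. `unescape_newlines` is an illustrative helper and not part of wme-models; the crate's `borrowed` feature relies on serde's borrowed deserialization rather than code like this.

```rust
use std::borrow::Cow;

/// Illustrative helper (not part of wme-models): replaces the two-character
/// sequence `\n` with a real newline, borrowing the input when there is
/// nothing to replace.
fn unescape_newlines(raw: &str) -> Cow<'_, str> {
    if raw.contains("\\n") {
        // Rewriting is required, so a new String is allocated.
        Cow::Owned(raw.replace("\\n", "\n"))
    } else {
        // Nothing to change: hand back a view of the original bytes.
        Cow::Borrowed(raw)
    }
}
```

Most article fields, such as names, URLs, and language codes, typically contain no escape sequences, so the borrowed path is the common one when scanning large NDJSON files.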

## License

This project is licensed under the terms of the workspace license.