Crate parse_mediawiki_sql

parse_mediawiki_sql parses SQL dumps of a MediaWiki database. Each SQL dump is a script that creates a database table and inserts rows into it. The entry point is iterate_sql_insertions, which creates an iterable struct from a byte slice (&[u8]). The struct is generic over the item type that iteration yields, and this type must be one of the structs in the schemas module. Each of these structs, such as Page, represents a row in one of the database tables.

Usage

This crate is available from crates.io. To use it, add parse-mediawiki-sql to the dependencies in your project's Cargo.toml:

[dependencies]
parse-mediawiki-sql = "0.1"

If you're using Rust 2015, you'll also need to add it to your crate root:

extern crate parse_mediawiki_sql;

Example

To generate a Vec containing the titles of all redirect pages:

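use memmap::Mmap;
use parse_mediawiki_sql::{
    iterate_sql_insertions,
    schemas::Page,
    types::{PageNamespace, PageTitle},
};
use std::fs::File;

// Memory-map the dump so that it can be passed to the parser as a byte slice.
// Mmap::map is unsafe because the file could be modified while it is mapped.
let page_sql =
    unsafe { Mmap::map(&File::open("page.sql").unwrap()).unwrap() };

// Keep the namespace and title of every row that is marked as a redirect.
let redirects: Vec<(PageNamespace, PageTitle)> =
    iterate_sql_insertions(&page_sql)
        .filter_map(
            |Page { namespace, title, is_redirect, .. }| {
                if is_redirect {
                    Some((namespace, title))
                } else {
                    None
                }
            },
        )
        .collect();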

Only a mutable reference to the struct is iterable, so a for-loop must use &mut or .into_iter() to iterate over the struct:

for Page { namespace, title, is_redirect, .. } in &mut iterate_sql_insertions(&page_sql) {
    if is_redirect {
        dbg!((namespace, title));
    }
}
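
The "iterable only as a mutable reference" pattern can be sketched with the standard library alone. The following is a minimal illustration, not this crate's implementation: Rows, evens_with_for_loop and evens_with_adapter are hypothetical names, and the rows are plain u32 values instead of parsed SQL tuples. It assumes the pattern is achieved by implementing Iterator for a mutable reference to the struct, which would be consistent with the examples above but is not confirmed by them.

```rust
/// Hypothetical stand-in for the struct returned by `iterate_sql_insertions`.
/// It only holds the part of the input that has not been consumed yet.
struct Rows<'a> {
    remaining: &'a [u32],
}

/// `Iterator` is implemented for `&mut Rows`, not for `Rows` itself
/// (an assumption made for this sketch).
impl<'a, 'b> Iterator for &'b mut Rows<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Copy the shared slice out, split off the first row, and store the rest.
        let remaining: &'a [u32] = self.remaining;
        let (first, rest) = remaining.split_first()?;
        self.remaining = rest;
        Some(*first)
    }
}

/// A for-loop has to borrow the struct mutably, because `Rows` by value
/// is neither an `Iterator` nor an `IntoIterator`.
fn evens_with_for_loop(data: &[u32]) -> Vec<u32> {
    let mut rows = Rows { remaining: data };
    let mut out = Vec::new();
    for n in &mut rows {
        if n % 2 == 0 {
            out.push(n);
        }
    }
    out
}

/// Iterator adapters work without an explicit `&mut`, because method-call
/// syntax takes the mutable reference automatically.
fn evens_with_adapter(data: &[u32]) -> Vec<u32> {
    Rows { remaining: data }.filter(|n| n % 2 == 0).collect()
}
```

Under these assumptions, evens_with_for_loop(&[1, 2, 3, 4]) and evens_with_adapter(&[1, 2, 3, 4]) collect the same values. One consequence of this design is that the struct is not consumed by a loop, so iteration can stop early and resume later from the same position.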

Re-exports

pub use types::Error;
pub use types::IResult;

Modules

schemas

Defines types that represent rows in tables of the MediaWiki database and implements the FromSqlTuple trait for them, so that they can be parsed from SQL tuples.

types

Defines the types used in the schemas module and implements the FromSql trait for these and other types, so that they can be parsed from SQL syntax. Re-exports the Datelike and Timelike traits from the chrono crate, which are used by Timestamp.

Traits

FromSqlTuple

Trait for converting from a SQL tuple to a Rust type, which may or may not borrow from the input string. Used by iterate_sql_insertions.
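
To illustrate what "borrowing from the input or not" can look like, here is a rough sketch using only the standard library. ParseTuple, OwnedId, BorrowedTitle and strip_parens are hypothetical names invented for this example. The real FromSqlTuple signature is not shown in this document and may differ, and the toy parser ignores SQL escaping and multi-field tuples entirely.

```rust
/// Hypothetical trait, loosely modelled on the description of `FromSqlTuple`.
/// The `'input` lifetime lets an implementor keep references into the input.
trait ParseTuple<'input>: Sized {
    fn parse_tuple(input: &'input [u8]) -> Option<Self>;
}

/// Returns the bytes between the parentheses of a tuple such as `(42)`.
fn strip_parens(input: &[u8]) -> Option<&[u8]> {
    match input {
        [b'(', inner @ .., b')'] => Some(inner),
        _ => None,
    }
}

/// An owned row: the number is copied out, so nothing borrows from the input.
#[derive(Debug, PartialEq)]
struct OwnedId(u32);

impl<'input> ParseTuple<'input> for OwnedId {
    fn parse_tuple(input: &'input [u8]) -> Option<Self> {
        let inner = strip_parens(input)?;
        let text = std::str::from_utf8(inner).ok()?;
        text.parse::<u32>().ok().map(OwnedId)
    }
}

/// A borrowing row: the title is a slice of the input, not a new allocation.
#[derive(Debug, PartialEq)]
struct BorrowedTitle<'input>(&'input str);

impl<'input> ParseTuple<'input> for BorrowedTitle<'input> {
    fn parse_tuple(input: &'input [u8]) -> Option<Self> {
        let inner = strip_parens(input)?;
        let quoted = std::str::from_utf8(inner).ok()?;
        // Remove the single quotes around the string literal (no escape handling).
        let title = quoted.strip_prefix('\'')?.strip_suffix('\'')?;
        Some(BorrowedTitle(title))
    }
}
```

In this sketch, OwnedId::parse_tuple(b"(42)") produces a value that is independent of the input, whereas the result of BorrowedTitle::parse_tuple(b"('Main_Page')") cannot outlive the byte slice it was parsed from. Borrowing avoids an allocation per row, which can matter for dumps with many rows.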

Functions

iterate_sql_insertions

Takes a SQL dump of a MediaWiki database table as bytes and returns a struct that is iterable as a mutable reference, yielding structs that represent the database rows.