parse_mediawiki_sql parses SQL dumps of a MediaWiki database.
The SQL dumps are scripts that create a database table and insert rows into it.
The entry point is iterate_sql_insertions, which creates an iterable struct from a byte slice (&[u8]). The struct is generic over the type returned by the iterator, and this type must be one of the structs in the schemas module, which represent rows in the database, such as Page.
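To make the shape of this API concrete, here is a minimal, std-only sketch of the same idea: an iterator built from a byte slice and generic over the row type it produces. It is not the crate's implementation. FromTuple, ToyPage, ToyInsertions, toy_insertions, and toy_redirect_ids are hypothetical names invented for illustration, and the toy parser assumes plain numeric fields with no quoting or escaping.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the crate's row-conversion trait.
trait FromTuple: Sized {
    /// Builds a row from the comma-separated fields of one SQL tuple.
    fn from_tuple(fields: &[&str]) -> Option<Self>;
}

/// Toy row type with only two columns (the real `page` table has more).
#[derive(Debug, PartialEq)]
struct ToyPage {
    id: u32,
    is_redirect: bool,
}

impl FromTuple for ToyPage {
    fn from_tuple(fields: &[&str]) -> Option<Self> {
        match fields {
            [id, is_redirect] => Some(ToyPage {
                id: id.parse().ok()?,
                is_redirect: *is_redirect == "1",
            }),
            _ => None,
        }
    }
}

/// Iterator over the tuples that follow `VALUES`, generic over the row type.
struct ToyInsertions<'a, Row> {
    rest: &'a str,
    _row: PhantomData<Row>,
}

/// Toy entry point: takes the dump as bytes, like the function described above.
fn toy_insertions<Row: FromTuple>(sql: &[u8]) -> ToyInsertions<'_, Row> {
    // Toy simplification: invalid UTF-8 is treated as empty input.
    let text = std::str::from_utf8(sql).unwrap_or("");
    // Skip the table definition and everything else before the first `VALUES`.
    let rest = text.find("VALUES").map_or("", |i| &text[i + "VALUES".len()..]);
    ToyInsertions { rest, _row: PhantomData }
}

impl<'a, Row: FromTuple> Iterator for ToyInsertions<'a, Row> {
    type Item = Row;

    fn next(&mut self) -> Option<Row> {
        // Locate the next parenthesized tuple; iteration ends when none is left
        // or when a tuple cannot be converted into a row.
        let start = self.rest.find('(')?;
        let end = start + self.rest[start..].find(')')?;
        let fields: Vec<&str> = self.rest[start + 1..end]
            .split(',')
            .map(str::trim)
            .collect();
        self.rest = &self.rest[end + 1..];
        Row::from_tuple(&fields)
    }
}

/// The row type is chosen by the caller, here with a turbofish.
fn toy_redirect_ids(sql: &[u8]) -> Vec<u32> {
    toy_insertions::<ToyPage>(sql)
        .filter(|page| page.is_redirect)
        .map(|page| page.id)
        .collect()
}
```

Calling toy_redirect_ids on a dump whose INSERT statement contains the tuples (1,0),(2,1),(3,0),(4,1) returns the ids 2 and 4. The caller picks the row type and the iterator does the conversion, which is the pattern the paragraph above describes.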
§Usage
This crate is available from crates.io and can be used by adding parse-mediawiki-sql to your dependencies in your project’s Cargo.toml.
[dependencies]
parse-mediawiki-sql = "0.10"
If you’re using Rust 2015, then you’ll also need to add it to your crate root:
extern crate parse_mediawiki_sql;
§Example
To generate a Vec containing the titles of all redirect pages:
use parse_mediawiki_sql::{
    iterate_sql_insertions,
    schemas::Page,
    field_types::{PageNamespace, PageTitle},
    utils::memory_map,
};

// Memory-map the decompressed dump; `memory_map` is an unsafe function.
let page_sql = unsafe { memory_map("page.sql")? };
// Keep the namespace and title of every row flagged as a redirect.
let redirects: Vec<(PageNamespace, PageTitle)> =
    iterate_sql_insertions(&page_sql)
        .filter_map(
            |Page { namespace, title, is_redirect, .. }| {
                if is_redirect {
                    Some((namespace, title))
                } else {
                    None
                }
            },
        )
        .collect();
Only a mutable reference to the struct is iterable, so a for-loop must use &mut or .into_iter() to iterate over the struct:
for Page { namespace, title, is_redirect, .. } in &mut iterate_sql_insertions(&page_sql) {
    if is_redirect {
        dbg!((namespace, title));
    }
}
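The reason a for-loop needs &mut while adapter methods such as .filter_map() do not can be shown with a small std-only sketch. Countdown, with_for_loop, and with_adapter are hypothetical names. The sketch assumes the same arrangement the sentence above describes, with Iterator implemented for a mutable reference to the struct rather than for the struct itself.

```rust
/// Hypothetical stand-in for the iterable struct; it counts down to 1.
struct Countdown {
    remaining: u32,
}

// `Iterator` is implemented for `&mut Countdown`, not for `Countdown` itself.
impl<'a> Iterator for &'a mut Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

fn with_for_loop(start: u32) -> Vec<u32> {
    let mut countdown = Countdown { remaining: start };
    let mut seen = Vec::new();
    // `for n in countdown` would not compile: a for-loop does not borrow its
    // operand automatically, and `Countdown` itself is not `IntoIterator`.
    for n in &mut countdown {
        seen.push(n);
    }
    seen
}

fn with_adapter(start: u32) -> Vec<u32> {
    // Method-call syntax takes the `&mut` borrow automatically, so iterator
    // adapters can be chained directly on the struct.
    let scaled: Vec<u32> = Countdown { remaining: start }.map(|n| n * 10).collect();
    scaled
}
```

Here with_for_loop(3) collects 3, 2, 1 and with_adapter(3) collects 30, 20, 10. Method calls take the &mut borrow automatically and a for-loop does not, so only the for-loop needs the explicit &mut or .into_iter().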
Modules§
- error: The error types used by FromSqlTuple and FromSql.
- field_types: The types used in the schemas module.
- from_sql: Defines the FromSql trait and implements it for external types.
- schemas: Types that represent rows in tables of the MediaWiki database.
- utils: Defines memory_map to read decompressed MediaWiki SQL files, and NamespaceMap to display a page title prefixed by its namespace name.
Traits§
- FromSqlTuple: Trait for converting from a SQL tuple to a Rust type, which can borrow from the string or not. Used by iterate_sql_insertions.
Functions§
- iterate_sql_insertions: The entry point of the crate. Takes a SQL dump of a MediaWiki database table as bytes and yields an iterator over structs representing rows in the table.