Struct wikidump::Parser
A parser which can process uncompressed Mediawiki XML dumps (backups).
Methods
impl Parser
pub fn new<'c>() -> Parser
Construct a new parser with the default settings.
pub fn process_text(self, value: bool) -> Self
Sets whether the parser should process wiki text or leave it as-is. For best results, use a wiki config that matches the website you are parsing. Parsing may still work with a mismatched config, but the output may be unexpected.
Wiki text parsing is enabled by default.
See use_config and config.
Example
use wikidump::{Parser, config};

let parser = Parser::new()
    .use_config(config::wikipedia::english())
    .process_text(false); // Disable wiki text parsing
pub fn exclude_pages(self, value: bool) -> Self
Sets whether the parser should ignore pages in namespaces that are not articles, such as Talk, Special, or User. If enabled, then any page which is not an article will be skipped by the parser.
Excluding pages in these namespaces is enabled by default.
Example
use wikidump::{Parser, config};

let parser = Parser::new()
    .use_config(config::wikipedia::english())
    .exclude_pages(false); // Disable page exclusion
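To make the idea of skipping non-article namespaces concrete, here is a minimal sketch that does not use wikidump at all. It assumes the filter is based on the MediaWiki namespace number, where 0 is the main (article) namespace, 1 is Talk, and 2 is User. `SketchPage` and `filter_pages` are hypothetical names invented for this illustration, and the crate may decide which pages to skip in a different way.

```rust
/// Hypothetical page record; the field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchPage {
    pub title: String,
    /// MediaWiki namespace number: 0 is the main (article) namespace.
    pub namespace: i32,
}

/// Keeps only main-namespace pages when `exclude` is true,
/// and returns every page unchanged when it is false.
pub fn filter_pages(pages: Vec<SketchPage>, exclude: bool) -> Vec<SketchPage> {
    pages
        .into_iter()
        .filter(|page| !exclude || page.namespace == 0)
        .collect()
}
```

With exclusion on, a list holding an article, a Talk page, and a User page would be reduced to the article alone. With it off, all three pages pass through.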
pub fn remove_newlines(self, value: bool) -> Self
Sets whether the parser should remove newlines or turn them into normal newline characters. This will only have an effect if processing wiki text is enabled.
Removing newlines is turned off by default.
Example
use wikidump::{Parser, config};

let parser = Parser::new()
    .use_config(config::wikipedia::english())
    .remove_newlines(true) // Enable newline removal
    .process_text(true);
pub fn use_config(self, config_source: ConfigurationSource) -> Self
Sets the wiki text parser configuration options. For best results when processing wiki text, use the configuration that matches the website and language you are processing.
See config.
Example
use wikidump::{Parser, config};

let parser = Parser::new()
    .use_config(config::wikipedia::english());
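The setters documented above all take `self` by value and return `Self`, which is what allows them to be chained after `Parser::new()`. The following is a minimal sketch of that consuming-builder shape. `SketchParser` is a hypothetical stand-in written for this illustration, not the crate's actual implementation; its defaults simply mirror the ones stated in the method descriptions.

```rust
/// Hypothetical stand-in for illustration; this is not wikidump's `Parser`.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchParser {
    pub process_text: bool,
    pub exclude_pages: bool,
    pub remove_newlines: bool,
}

impl SketchParser {
    /// Defaults mirror the documented ones: text processing and page
    /// exclusion are on, newline removal is off.
    pub fn new() -> Self {
        SketchParser {
            process_text: true,
            exclude_pages: true,
            remove_newlines: false,
        }
    }

    /// Takes `self` by value and hands it back so calls can be chained.
    pub fn process_text(mut self, value: bool) -> Self {
        self.process_text = value;
        self
    }

    pub fn exclude_pages(mut self, value: bool) -> Self {
        self.exclude_pages = value;
        self
    }

    pub fn remove_newlines(mut self, value: bool) -> Self {
        self.remove_newlines = value;
        self
    }
}
```

Because each setter consumes the builder, the result of a chain has to be bound to a variable (or passed on) in one expression. Calling a setter on an already-bound parser without rebinding the return value would move the parser and discard the change.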
pub fn parse_file<P>(&self, dump: P) -> Result<Site, Box<dyn Error + 'static>> where
P: AsRef<Path>,
Returns all of the parsed data contained in a particular wiki dump file. This includes the name of the website, a list of pages, their respective contents, and other properties.
Example
use wikidump::Parser;

let parser = Parser::new();
let site = parser.parse_file("tests/enwiki-articles-partial.xml");
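Since `parse_file` returns a `Result` with a boxed error, the value bound to `site` above still has to be checked before use. The sketch below shows one way to handle both outcomes with the same error type. To stay self-contained it uses only the standard library: `count_pages` and `describe` are hypothetical helpers that read a file and count `<page>` openings, and they do no real dump parsing.

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

/// Hypothetical stand-in with the same error type as `parse_file`.
/// It only reads the file and counts `<page>` openings.
pub fn count_pages<P: AsRef<Path>>(dump: P) -> Result<usize, Box<dyn Error + 'static>> {
    // `?` converts the `io::Error` into `Box<dyn Error>` on failure.
    let contents = fs::read_to_string(dump)?;
    Ok(contents.matches("<page>").count())
}

/// Handles both outcomes explicitly instead of unwrapping.
pub fn describe<P: AsRef<Path>>(dump: P) -> String {
    match count_pages(dump) {
        Ok(n) => format!("found {} page(s)", n),
        Err(e) => format!("could not read dump: {}", e),
    }
}
```

The same `match` (or the `?` operator inside a function that itself returns `Result`) applies to the value returned by `parse_file`.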
pub fn parse_str(&self, text: &str) -> Result<Site, Box<dyn Error + 'static>>
Returns all of the parsed data contained in the text of a wiki dump, passed as a string rather than a file path. This includes the name of the website, a list of pages, their respective contents, and other properties.
Example
use wikidump::Parser;
use std::fs;

let parser = Parser::new();
let contents = fs::read_to_string("tests/enwiki-articles-partial.xml").unwrap();
let site = parser.parse_str(contents.as_str());
Auto Trait Implementations
impl Sync for Parser
impl Send for Parser
impl Unpin for Parser
impl UnwindSafe for Parser
impl RefUnwindSafe for Parser
Blanket Implementations
impl<T, U> Into<U> for T where
U: From<T>,
impl<T> From<T> for T
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> Any for T where
T: 'static + ?Sized,