Crate parse_wiktionary_en
Parse dictionary pages from the English language edition of Wiktionary into structured data.
For general information about Parse Wiktionary, see the readme file.
Examples
This example prints all definitions found in an article, together with the language and part of speech of the entry.
let wiki_text = concat!(
    "==English==\n",
    "===Noun===\n",
    "#The assignment of a [[commercial]] [[value]] to something previously valueless."
);
let configuration = parse_wiktionary_en::create_configuration();
let parsed_wiki_text = configuration.parse(wiki_text);
let parsed_article = parse_wiktionary_en::parse(wiki_text, &parsed_wiki_text.nodes);
for language_entry in parsed_article.language_entries {
    for pos_entry in language_entry.etymology_entry.pos_entries {
        for definition in pos_entry.definitions {
            println!(
                "The word 'commodification' of language {language:?} and part of speech {pos:?} has the definition: {definition}",
                language = language_entry.language,
                pos = pos_entry.pos,
                definition = &definition.definition.iter().map(|node| match node {
                    parse_wiktionary_en::Flowing::Link { target, text } => text,
                    parse_wiktionary_en::Flowing::Text { value } => value,
                    _ => ""
                }).collect::<String>()
            );
        }
    }
}
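The flattening step in the example above, which turns a definition's sequence of elements into a plain string, can be tried in isolation without the crate. The following is a minimal sketch under stated assumptions: the `Flowing` enum here is a hypothetical local stand-in that models only the `Link` and `Text` variants used above plus a catch-all `Other`, and `definition_text` is an illustrative helper name, not part of the crate's API. The field types (`Cow<str>`) are also an assumption.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for the crate's `Flowing` enum. Only the two
// variants used in the example are modelled; `Other` represents every
// remaining kind of element. Field types are assumed, not confirmed.
#[allow(dead_code)]
enum Flowing<'a> {
    Link { target: Cow<'a, str>, text: Cow<'a, str> },
    Text { value: Cow<'a, str> },
    Other,
}

// Illustrative helper: concatenate the visible text of a definition.
// Links contribute their display text, plain text is copied as-is,
// and any other element contributes nothing.
fn definition_text(nodes: &[Flowing<'_>]) -> String {
    nodes
        .iter()
        .map(|node| match node {
            Flowing::Link { text, .. } => &**text,
            Flowing::Text { value } => &**value,
            _ => "",
        })
        .collect()
}
```

Calling `definition_text` on a slice holding a `Text` element followed by a `Link` element yields the two pieces of text joined in order, which is the same shape of result the `println!` call above receives as `definition`.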
Structs
A single definition from a list of definitions of an entry.
Details related to a specific etymology, given either under a numbered etymology heading or in the same format directly in the language entry.
Dictionary entry for a single language.
Output of parsing a page.
The entry for a part of speech within the entry for a language.
Details about a template.
Warning from the parser indicating that something is not well-formed.
Enums
An element in a sequence that allows different kinds of elements.
Identifier for a language.
Part of speech.
Identifier for a kind of warning from the parser.
Functions
Allocates and returns a configuration for Parse Wiki Text suitable for parsing the English language edition of Wiktionary.
Parses an article from the English language edition of Wiktionary into structured data.