Crate parse_wiktionary_cs

Parse dictionary pages from the Czech language edition of Wiktionary into structured data.

For general information about Parse Wiktionary, see the readme file.

§Examples

This example prints all definitions found in an article, together with the language and part of speech of the entry.

let wiki_text = concat!(
    "==čeština==\n",
    "===sloveso===\n",
    "====význam====\n",
    "#Protestovat proti absurdnímu příkazu nebo povinnosti prostřednictvím ",
    "ještě absurdněji puntičkářského a nadšeného chování"
);
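// Create a Parse Wiki Text configuration suited to the Czech edition of Wiktionary.
let configuration = parse_wiktionary_cs::create_configuration();
// Parse the raw wiki text into nodes.
let parsed_wiki_text = configuration.parse(wiki_text);
// Turn the parsed nodes into structured dictionary data.
let parsed_article = parse_wiktionary_cs::parse(wiki_text, &parsed_wiki_text.nodes);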
for language_entry in parsed_article.language_entries {
    for pos_entry in language_entry.pos_entries {
        for definition in pos_entry.definitions {
            println!(
                "The word 'švejkovat' of language {language:?} and part of speech {pos:?} has the definition: {definition}",
                language = language_entry.language,
                pos = pos_entry.pos,
                definition = &definition.definition.iter().map(|node| match node {
                    parse_wiktionary_cs::Flowing::Text { value } => value,
                    _ => ""
                }).collect::<String>()
            );
        }
    }
}
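
The closure in the example above keeps only the text nodes of a definition and maps every other kind of node to an empty string before collecting the pieces into one `String`. The following is a minimal, self-contained sketch of that flattening pattern. It does not use the crate: `Node` is a hypothetical stand-in for `Flowing`, its `Link` variant and the helper name `plain_text` are assumptions made for illustration, and the real enum may carry different variants and field types.

```rust
// Hypothetical stand-in for the crate's `Flowing` enum. Only a
// `Text { value }` variant appears in the example above; `Link` is an
// assumed placeholder for "any other kind of node".
#[allow(dead_code)]
enum Node {
    Text { value: String },
    Link { target: String },
}

/// Concatenates the text nodes of a sequence and skips everything else.
fn plain_text(nodes: &[Node]) -> String {
    nodes
        .iter()
        .map(|node| match node {
            Node::Text { value } => value.as_str(),
            // Non-text nodes contribute nothing to the flattened string.
            _ => "",
        })
        .collect()
}

fn main() {
    let nodes = [
        Node::Text { value: "Protestovat ".to_string() },
        Node::Link { target: "příkaz".to_string() },
        Node::Text { value: "proti příkazu".to_string() },
    ];
    println!("{}", plain_text(&nodes));
}
```

Dropping non-text nodes keeps the sketch short, but it also discards any text those nodes might display, so a real consumer would probably want a match arm for each variant it cares about.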

Modules§

inflection
Types specifying various patterns of inflection.

Structs§

Audio
Audio sample.
Definition
A single definition from a list of definitions of an entry.
ExternalLink
External link.
InflectionEntry
Information about one pattern of inflection for an entry.
LanguageEntry
Dictionary entry for a single language.
Output
Output of parsing a page.
PosEntry
The entry for a part of speech within the entry for a language.
Translations
The translations for a single definition.
Warning
Warning from the parser indicating that something is not well-formed.

Enums§

ExternalLinkType
Identifier for a site as a target of an external link.
Flowing
An element in a sequence that allows different kinds of elements.
Inflection
Pattern of inflection.
Language
Identifier for a language.
Pos
Part of speech.
WarningMessage
Identifier for a kind of warning from the parser.

Functions§

create_configuration
Allocates and returns a configuration for Parse Wiki Text suitable for parsing the Czech language edition of Wiktionary.
parse
Parses an article from the Czech language edition of Wiktionary into structured data.