Crate rs_conllu

A library for parsing the CoNLL-U format.
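
For readers new to the format, the sketch below shows roughly what a CoNLL-U token line contains: ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), with `_` marking an unspecified value. It uses only the standard library; `split_fields` and `optional` are hypothetical helpers written for this illustration and are not part of this crate's API.

```rust
/// Hypothetical helper (not part of rs_conllu): split one CoNLL-U token
/// line into its tab-separated fields, requiring exactly ten of them.
fn split_fields(line: &str) -> Option<Vec<&str>> {
    let fields: Vec<&str> = line.split('\t').collect();
    if fields.len() == 10 {
        Some(fields)
    } else {
        None
    }
}

/// Hypothetical helper: CoNLL-U writes `_` for an unspecified value,
/// so map it to `None` and anything else to `Some`.
fn optional(field: &str) -> Option<&str> {
    if field == "_" {
        None
    } else {
        Some(field)
    }
}

fn main() {
    // Field order: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
    let line = "1\tcrabs\tcrab\tNOUN\tNNS\tNumber=Plur\t0\troot\t_\t_";
    let fields = split_fields(line).expect("a token line has exactly ten fields");
    println!("form = {}, lemma = {:?}", fields[1], optional(fields[2]));
    println!("deps = {:?}", optional(fields[8]));
}
```

The crate's own parsers handle more than this sketch does, such as comment lines and sentence boundaries, so prefer `parse_file` for real input.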

§Basic Usage

use rs_conllu::parse_file;
use std::fs::File;

let file = File::open("tests/example.conllu")?;

let parsed = parse_file(file)?;

// parse_file returns a `ParsedDoc`, which allows iteration
// over the contained sentences.
for sentence in parsed {
    // We can also iterate over the tokens in the sentence.
    for token in sentence {
        // Process token, e.g. access individual fields.
        println!("{}", token.form);
    }
}

§Modifying

If you need to modify the parsed data, sentences can be iterated mutably. The example below changes the form and lemma of a particular token.

use rs_conllu::{parse_file, TokenID};
use std::fs::File;

let file = File::open("tests/example.conllu")?;

let mut parsed = parse_file(file)?;

if let Some(s) = parsed.iter_mut().next() {
    if let Some(token) = s.get_token_mut(TokenID::Single(6)) {
        token.form = "crabs".to_string();
        token.lemma = Some("crab".to_string());
    }
}

Re-exports§

pub use sentence::Sentence;
pub use token::Dep;
pub use token::Token;
pub use token::TokenID;
pub use parsers::parse_file;
pub use parsers::parse_sentence;
pub use parsers::parse_token;

Modules§

parsers
Parsers for tokens, sentences and whole documents, and associated code.
sentence
Sentence and the related builder.
token
The basic token element, its building blocks and builder.

Structs§

ParseUposError
Error used when a Universal POS tag could not be parsed.

Enums§

UPOS
The set of Universal POS tags according to UD version 2.
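
As an illustration of how a tag enum and its parse error typically fit together, here is a minimal, self-contained sketch built on the standard `FromStr` trait. The names `Tag` and `ParseTagError` are hypothetical stand-ins chosen for this example; the exact trait implementations and variant names of `UPOS` and `ParseUposError` are documented on their own pages and may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a tag enum covering the 17 UD v2 universal POS tags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tag {
    Adj, Adp, Adv, Aux, Cconj, Det, Intj, Noun, Num,
    Part, Pron, Propn, Punct, Sconj, Sym, Verb, X,
}

/// Hypothetical error type: carries the string that failed to parse.
#[derive(Debug, PartialEq, Eq)]
struct ParseTagError(String);

impl fmt::Display for ParseTagError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown universal POS tag: {}", self.0)
    }
}

impl std::error::Error for ParseTagError {}

impl FromStr for Tag {
    type Err = ParseTagError;

    /// Tags are matched case-sensitively, as they are written in CoNLL-U files.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "ADJ" => Tag::Adj,
            "ADP" => Tag::Adp,
            "ADV" => Tag::Adv,
            "AUX" => Tag::Aux,
            "CCONJ" => Tag::Cconj,
            "DET" => Tag::Det,
            "INTJ" => Tag::Intj,
            "NOUN" => Tag::Noun,
            "NUM" => Tag::Num,
            "PART" => Tag::Part,
            "PRON" => Tag::Pron,
            "PROPN" => Tag::Propn,
            "PUNCT" => Tag::Punct,
            "SCONJ" => Tag::Sconj,
            "SYM" => Tag::Sym,
            "VERB" => Tag::Verb,
            "X" => Tag::X,
            other => return Err(ParseTagError(other.to_string())),
        })
    }
}

fn main() {
    // A known tag parses into its variant.
    let tag: Tag = "NOUN".parse().expect("NOUN is a valid tag");
    println!("{:?}", tag);

    // An unknown string yields the error type instead of panicking.
    if let Err(e) = "NOUNS".parse::<Tag>() {
        println!("{}", e);
    }
}
```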