A library for parsing the CoNLL-U format.
§Basic Usage
use rs_conllu::parse_file;
use std::fs::File;
let file = File::open("tests/example.conllu")?;
let parsed = parse_file(file)?;
// parse_file returns a `ParsedDoc`, which allows iteration
// over the contained sentences.
for sentence in parsed {
    // We can also iterate over the tokens in the sentence.
    for token in sentence {
        // Process token, e.g. access individual fields.
        println!("{}", token.form)
    }
}
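To make the fields accessed above more concrete: a CoNLL-U token line has ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and an underscore marks an unspecified value. The following is a minimal standard-library sketch of that layout, not the crate's parser. `RawToken` and `split_token_line` are hypothetical names introduced only for illustration, and the sketch ignores the corner case where an underscore is the literal lemma.

```rust
/// Hypothetical stand-in for a token; this is NOT the crate's `Token` type.
#[derive(Debug, PartialEq)]
struct RawToken {
    id: String,
    form: String,
    lemma: Option<String>,
    upos: Option<String>,
}

/// Split one CoNLL-U token line into its tab-separated columns.
/// Returns `None` if the line does not have exactly ten columns
/// (for example, a `# ...` comment line).
fn split_token_line(line: &str) -> Option<RawToken> {
    let cols: Vec<&str> = line.split('\t').collect();
    if cols.len() != 10 {
        return None;
    }
    // An underscore marks an unspecified value.
    let opt = |s: &str| if s == "_" { None } else { Some(s.to_string()) };
    Some(RawToken {
        id: cols[0].to_string(),
        form: cols[1].to_string(),
        lemma: opt(cols[2]),
        upos: opt(cols[3]),
    })
}
```

Only the first four columns are kept here to keep the sketch short; the remaining columns could be handled the same way.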
§Modifying
If manipulation is necessary, sentences can be iterated
mutably. The example below shows how we can change the
form and lemma of a particular token.
use rs_conllu::{parse_file, TokenID};
use std::fs::File;
let file = File::open("tests/example.conllu")?;
let mut parsed = parse_file(file)?;
if let Some(s) = parsed.iter_mut().next() {
    if let Some(token) = s.get_token_mut(TokenID::Single(6)) {
        token.form = "crabs".to_string();
        token.lemma = Some("crab".to_string());
    }
}
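The lookup above addresses an ordinary word through `TokenID::Single(6)`. In the CoNLL-U format, the ID column can hold three shapes: a plain integer for a word, a range such as `1-2` for a multiword token, and a decimal such as `8.1` for an empty node. The sketch below shows one way to tell these apart using only the standard library. `SketchId` and `parse_id` are hypothetical names, and the crate's own `TokenID` and parsing functions may differ from this.

```rust
/// Hypothetical ID type for illustration; NOT the crate's `TokenID`.
#[derive(Debug, PartialEq)]
enum SketchId {
    /// Ordinary word, e.g. `6`.
    Single(usize),
    /// Multiword token spanning several words, e.g. `1-2`.
    Range(usize, usize),
    /// Empty node, e.g. `8.1`.
    Empty(usize, usize),
}

/// Parse the ID column of a CoNLL-U token line.
/// Returns `None` if any numeric part fails to parse.
fn parse_id(field: &str) -> Option<SketchId> {
    if let Some((start, end)) = field.split_once('-') {
        Some(SketchId::Range(start.parse().ok()?, end.parse().ok()?))
    } else if let Some((major, minor)) = field.split_once('.') {
        Some(SketchId::Empty(major.parse().ok()?, minor.parse().ok()?))
    } else {
        Some(SketchId::Single(field.parse().ok()?))
    }
}
```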
Re-exports§
- pub use sentence::Sentence;
- pub use token::Dep;
- pub use token::Token;
- pub use token::TokenID;
- pub use parsers::parse_file;
- pub use parsers::parse_sentence;
- pub use parsers::parse_token;
Modules§
- parsers
- Parsers for tokens, sentences and whole documents, and associated code.
- sentence
- Sentence and the related builder.
- token
- The basic token element, its building blocks and builder.
Structs§
- ParseUposError
- Error used when a Universal POS tag could not be parsed.
Enums§
- UPOS
- The set of Universal POS tags according to UD version 2.
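As an illustration of how a tag enum and its parse error can fit together, here is a minimal sketch built on `std::str::FromStr`. It covers the 17 Universal POS tags of UD version 2, but `SketchUpos` and `SketchUposError` are hypothetical names, and nothing here is taken from the crate's actual `UPOS` or error definitions, which may be structured differently.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical tag enum for illustration; NOT the crate's `UPOS`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchUpos {
    Adj, Adp, Adv, Aux, Cconj, Det, Intj, Noun, Num,
    Part, Pron, Propn, Punct, Sconj, Sym, Verb, X,
}

/// Hypothetical error carrying the string that could not be parsed.
#[derive(Debug, PartialEq, Eq)]
struct SketchUposError(String);

impl fmt::Display for SketchUposError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown Universal POS tag: {}", self.0)
    }
}

impl std::error::Error for SketchUposError {}

impl FromStr for SketchUpos {
    type Err = SketchUposError;

    /// Tags are matched case-sensitively in their upper-case form.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        use SketchUpos::*;
        Ok(match s {
            "ADJ" => Adj, "ADP" => Adp, "ADV" => Adv, "AUX" => Aux,
            "CCONJ" => Cconj, "DET" => Det, "INTJ" => Intj,
            "NOUN" => Noun, "NUM" => Num, "PART" => Part,
            "PRON" => Pron, "PROPN" => Propn, "PUNCT" => Punct,
            "SCONJ" => Sconj, "SYM" => Sym, "VERB" => Verb, "X" => X,
            other => return Err(SketchUposError(other.to_string())),
        })
    }
}
```

With this shape, a caller can write `"NOUN".parse::<SketchUpos>()` and handle an unknown tag through the returned `Err` value.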