Crate rs_conllu

A library for parsing the CoNLL-U format.

§Basic Usage

Parse a sentence in CoNLL-U format and iterate over the Token elements it contains. The example is taken from the CoNLL-U format description.

use rs_conllu::{parse_sentence, TokenID};

let s = "# sent_id = 1
# text = They buy and sell books.
1	They	they	PRON	PRP	Case=Nom|Number=Plur	2	nsubj	2:nsubj|4:nsubj	_
2	buy	buy	VERB	VBP	Number=Plur|Person=3|Tense=Pres	0	root	0:root	_
3	and	and	CCONJ	CC	_	4	cc	4:cc	_
4	sell	sell	VERB	VBP	Number=Plur|Person=3|Tense=Pres	2	conj	0:root|2:conj	_
5	books	book	NOUN	NNS	Number=Plur	2	obj	2:obj|4:obj	SpaceAfter=No
6	.	.	PUNCT	.	_	2	punct	2:punct	_
";

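// Parse the sentence block; `unwrap` panics if parsing fails.
let sentence = parse_sentence(s).unwrap();
let mut token_iter = sentence.into_iter();

// Tokens come back in the order they appear in the input.
assert_eq!(token_iter.next().unwrap().id, TokenID::Single(1));
assert_eq!(token_iter.next().unwrap().form, "buy".to_owned());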
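To make the input layout in the example above more concrete, here is a minimal, standard-library-only sketch of how one CoNLL-U word line breaks down into its ten tab-separated columns. It is not how rs_conllu is implemented: `RawToken`, `split_token_line`, and `split_sentence` are hypothetical names introduced only for illustration, and the sketch keeps every field as a raw string instead of parsing IDs, features, or dependencies.

```rust
/// The ten tab-separated columns of a CoNLL-U word line, kept as raw strings.
/// Hypothetical type for illustration only; not part of rs_conllu.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct RawToken<'a> {
    id: &'a str,
    form: &'a str,
    lemma: &'a str,
    upos: &'a str,
    xpos: &'a str,
    feats: &'a str,
    head: &'a str,
    deprel: &'a str,
    deps: &'a str,
    misc: &'a str,
}

/// Split a single line into its ten columns.
/// Returns `None` for blank lines, `#` comment lines, and lines
/// that do not have exactly ten tab-separated fields.
fn split_token_line(line: &str) -> Option<RawToken<'_>> {
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let cols: Vec<&str> = line.split('\t').collect();
    if cols.len() != 10 {
        return None;
    }
    Some(RawToken {
        id: cols[0],
        form: cols[1],
        lemma: cols[2],
        upos: cols[3],
        xpos: cols[4],
        feats: cols[5],
        head: cols[6],
        deprel: cols[7],
        deps: cols[8],
        misc: cols[9],
    })
}

/// Collect the word lines of one sentence block, skipping comments.
/// Malformed lines are silently dropped in this sketch; a real parser
/// would more likely report them as errors.
fn split_sentence(block: &str) -> Vec<RawToken<'_>> {
    block.lines().filter_map(split_token_line).collect()
}
```

Calling `split_sentence` on the sentence string from the example would yield one `RawToken` per word line, with the `# sent_id` and `# text` comment lines skipped. A full parser additionally has to interpret the ID column (single IDs, multiword ranges, empty nodes) and the `|`-separated feature and dependency lists, which this sketch leaves untouched.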

Re-exports§

Modules§

Structs§

Enums§