ptb-reader
Work on syntax in natural language processing starts with reading standard corpora such as the Penn Treebank (PTB). This crate parses the merged files (i.e., those containing both syntactic structure and POS tags) of the free sample and of the full PTB Wall Street Journal section.
Data types
The output of the parser is a Vec<PTBTree>, PTBTree being defined by:
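A minimal sketch of such a tree type, assuming inner nodes carry a label plus children and terminals carry only the word. Apart from `InnerNode`, which the example usage mentions, the variant name `TerminalNode`, the field names, and the helpers `inner`, `terminal`, and `john_saw_mary` are assumptions made for illustration; the crate's documentation has the authoritative definition.

```rust
/// Hypothetical stand-in for the crate's tree type. The variant and field
/// names are assumptions, not the crate's confirmed definition.
#[derive(Debug, Clone, PartialEq)]
pub enum PTBTree {
    /// A labelled constituent (e.g. S, VP) or POS tag (e.g. NNP) with children.
    InnerNode { label: String, children: Vec<PTBTree> },
    /// A single word of the sentence.
    TerminalNode { label: String },
}

/// Hypothetical helper: builds an inner node from a label and its children.
pub fn inner(label: &str, children: Vec<PTBTree>) -> PTBTree {
    PTBTree::InnerNode { label: label.to_string(), children }
}

/// Hypothetical helper: builds a terminal node from a word.
pub fn terminal(word: &str) -> PTBTree {
    PTBTree::TerminalNode { label: word.to_string() }
}

/// Builds the tree for "(S (NNP John) (VP (VBD saw) (NNP Mary)))".
pub fn john_saw_mary() -> PTBTree {
    inner("S", vec![
        inner("NNP", vec![terminal("John")]),
        inner("VP", vec![
            inner("VBD", vec![terminal("saw")]),
            inner("NNP", vec![terminal("Mary")]),
        ]),
    ])
}
```

In this sketch a POS tag is modelled as an inner node with exactly one terminal child, so tags and phrase labels share one variant.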
PTBTree implements the Display trait, showing the PTB-bracketed notation. It also supports From/Into String, yielding the front of the tree (i.e., the concatenation of all terminals).
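The two conversions described above could be implemented roughly as follows. This is a sketch under stated assumptions, not the crate's code: the enum layout is hypothetical, the extra outer pair of parentheses is modelled on the example string in this README, and the front is assumed to join terminals with single spaces.

```rust
use std::fmt;

/// Hypothetical tree type, repeated so that this block compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub enum PTBTree {
    InnerNode { label: String, children: Vec<PTBTree> },
    TerminalNode { label: String },
}

impl PTBTree {
    /// Writes one node in bracketed form, e.g. "(NNP John)".
    fn write_bracketed(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PTBTree::TerminalNode { label } => write!(f, "{}", label),
            PTBTree::InnerNode { label, children } => {
                write!(f, "({}", label)?;
                for child in children {
                    write!(f, " ")?;
                    child.write_bracketed(f)?;
                }
                write!(f, ")")
            }
        }
    }

    /// Collects the terminals from left to right.
    fn terminals<'a>(&'a self, out: &mut Vec<&'a str>) {
        match self {
            PTBTree::TerminalNode { label } => out.push(label.as_str()),
            PTBTree::InnerNode { children, .. } => {
                for child in children {
                    child.terminals(out);
                }
            }
        }
    }
}

impl fmt::Display for PTBTree {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumption: the tree is wrapped in one extra, unlabeled pair of
        // parentheses, as in the example string "((S ...))".
        write!(f, "(")?;
        self.write_bracketed(f)?;
        write!(f, ")")
    }
}

impl From<PTBTree> for String {
    /// Yields the front; joining with single spaces is an assumption.
    fn from(tree: PTBTree) -> String {
        let mut words = Vec::new();
        tree.terminals(&mut words);
        words.join(" ")
    }
}
```

Implementing `From<PTBTree> for String` also provides `Into<String>` for `PTBTree` through the standard library's blanket implementation, which is why only one of the two needs to be written.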
Example usage
let all_trees: Vec<PTBTree> = parse_ptb_sample_dir(/* path to the directory of merged sample files */);
let s: &str = "((S (NNP John) (VP (VBD saw) (NNP Mary))))";
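// t is the hand-built counterpart of s: an InnerNode labelled S over (NNP John) and (VP (VBD saw) (NNP Mary)).
let t: PTBTree =
    PTBTree::InnerNode { /* label and children of the S node */ }
;
// Display prints the PTB-bracketed notation.
assert_eq!(s, format!("{}", t));
// Conversion into a String yields the front, assumed here to be space-separated.
assert_eq!("John saw Mary", String::from(t));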
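To illustrate how a bracketed string such as `s` maps onto a tree, here is a small recursive-descent reader. It is a sketch only: `tokenize`, `parse_node`, and `parse_bracketed` are hypothetical names that are not part of this crate's API, the enum layout is assumed, and real treebank files contain details (traces, multiple trees per file) that this does not handle.

```rust
/// Hypothetical tree type, repeated so that this block compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub enum PTBTree {
    InnerNode { label: String, children: Vec<PTBTree> },
    TerminalNode { label: String },
}

/// Splits the input into "(", ")" and whitespace-delimited atoms.
fn tokenize(input: &str) -> Vec<String> {
    input
        .replace('(', " ( ")
        .replace(')', " ) ")
        .split_whitespace()
        .map(str::to_string)
        .collect()
}

/// Parses one node starting at `*pos` and advances `*pos` past it.
/// Returns `None` on unbalanced or truncated input.
fn parse_node(tokens: &[String], pos: &mut usize) -> Option<PTBTree> {
    let token = tokens.get(*pos)?;
    *pos += 1;
    match token.as_str() {
        ")" => None, // a closing bracket cannot start a node
        "(" => {
            // The label is the atom right after "("; it may be missing,
            // as in the unlabeled outer pair of "((S ...))".
            let label = match tokens.get(*pos)?.as_str() {
                "(" | ")" => String::new(),
                atom => {
                    *pos += 1;
                    atom.to_string()
                }
            };
            let mut children = Vec::new();
            while tokens.get(*pos)?.as_str() != ")" {
                children.push(parse_node(tokens, pos)?);
            }
            *pos += 1; // consume the closing ")"
            Some(PTBTree::InnerNode { label, children })
        }
        word => Some(PTBTree::TerminalNode { label: word.to_string() }),
    }
}

/// Hypothetical entry point: parses a single bracketed tree.
pub fn parse_bracketed(input: &str) -> Option<PTBTree> {
    let tokens = tokenize(input);
    let mut pos = 0;
    let tree = parse_node(&tokens, &mut pos)?;
    if pos != tokens.len() {
        return None; // trailing tokens after the tree
    }
    // Unwrap the unlabeled outer pair if it holds exactly one tree.
    match tree {
        PTBTree::InnerNode { label, mut children }
            if label.is_empty() && children.len() == 1 => children.pop(),
        other => Some(other),
    }
}
```

Returning `Option` keeps the sketch short; a production reader would more likely report where the input went wrong through a dedicated error type.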