oscar-tools 0.4.0

Tools for processing OSCAR Corpora
# OSCAR-tools

This is a new set of tools for performing common tasks on the [OSCAR corpus](https://oscar-corpus.com).

The program has a different set of tools for each corpus version:
- `v1`: OSCAR 2019-like, text only (.txt files)
- `v2`: OSCAR 22.01-like, JSONLines, document-oriented with annotations and line-level identifications
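
To make the difference between the two layouts concrete, here is a minimal Python sketch of how each kind of file might be read. It does not use or describe the oscar-tools API. The sample data and the field names (`content`, `metadata`, `annotation`, `sentence_identifications`) are assumptions chosen for illustration, so check them against the corpus files you actually have.

```python
import json

# Hypothetical v1-style input: plain text, nothing but lines of text.
v1_sample = "First line of text.\nSecond line of text.\n"

# Hypothetical v2-style input: one JSON document per line (JSON Lines).
# The field names below are assumptions made for this sketch.
v2_sample = json.dumps({
    "content": "First line of text.\nSecond line of text.",
    "metadata": {
        "annotation": ["short_sentences"],
        "sentence_identifications": [{"label": "en", "prob": 0.98}, None],
    },
}) + "\n"


def read_v1(text):
    """Return the non-empty lines of a text-only corpus."""
    return [line for line in text.splitlines() if line]


def read_v2(text):
    """Parse each non-empty line as one JSON document."""
    return [json.loads(line) for line in text.splitlines() if line]


def lines_with_identification(doc):
    """Pair each content line with its line-level identification (or None)."""
    lines = doc["content"].split("\n")
    idents = doc["metadata"]["sentence_identifications"]
    return list(zip(lines, idents))


if __name__ == "__main__":
    print(read_v1(v1_sample))
    for doc in read_v2(v2_sample):
        print(doc["metadata"]["annotation"])
        for line, ident in lines_with_identification(doc):
            print(repr(line), ident)
```

In this sketch a `v1` file carries only text, while each `v2` record keeps a whole document together with its annotations and one identification entry per line, which is why the two versions need different tools.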