Training data format generators.
Converts curriculum-ordered triples into JSONL and Alpaca instruction format for language model fine-tuning.
Structs§
- TrainingExample - A single training example in JSONL format.
Functions§
- section_counts - Count examples per section.
- to_jsonl - Generate JSONL training data from a curriculum.
- to_jsonl_random - Generate randomly-ordered JSONL from the same triples (baseline).
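
To illustrate the kind of conversion this module describes, here is a minimal sketch of turning ordered triples into Alpaca-style JSONL and counting examples per section. It is not the crate's implementation: the `Triple` struct, its field names, the question template, and the `to_jsonl_sketch` and `section_counts_sketch` functions are all hypothetical, and the real signatures of `to_jsonl` and `section_counts` may differ. It uses only the standard library, so JSON escaping is done by hand.

```rust
// Hypothetical sketch: the types, field names, and signatures below are
// assumptions for illustration and are not taken from the crate.
use std::collections::BTreeMap;

/// Assumed shape of a curriculum triple, tagged with its section.
#[derive(Debug, Clone)]
struct Triple {
    section: String,
    subject: String,
    relation: String,
    object: String,
}

/// Escape a string so it can be placed inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render triples, in the order given, as one Alpaca-style JSON object per line.
fn to_jsonl_sketch(triples: &[Triple]) -> String {
    triples
        .iter()
        .map(|t| {
            format!(
                "{{\"instruction\":\"{}\",\"input\":\"\",\"output\":\"{}\"}}",
                json_escape(&format!("What is the {} of {}?", t.relation, t.subject)),
                json_escape(&t.object)
            )
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Count examples per section, keyed and sorted by section name.
fn section_counts_sketch(triples: &[Triple]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for t in triples {
        *counts.entry(t.section.clone()).or_insert(0) += 1;
    }
    counts
}
```

Because the sketch preserves input order, a randomly-ordered baseline like the one `to_jsonl_random` describes could be produced by shuffling the slice before calling the same function.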