Python doctest extraction and corpus management.
This module provides tools for extracting Python doctests from source files and converting them to Arrow/Parquet format for ML training data.
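As a rough illustration of what doctest extraction involves, the sketch below scans Python source text for `>>>` prompts and collects each example with its expected output. It is a minimal sketch using only the standard library, not this module's implementation: `ExtractedExample` and `extract_examples` are hypothetical names, and the rules for where an example ends (a blank line or a closing docstring delimiter) are assumptions. Arrow/Parquet conversion is not shown.

```rust
/// Hypothetical record type for this sketch; it is not the crate's `DocTest`.
#[derive(Debug, Clone, PartialEq)]
struct ExtractedExample {
    /// Example source with the `>>> ` and `... ` prompts removed.
    input: String,
    /// Expected output lines that follow the example, joined by newlines.
    expected: String,
}

/// Scan Python source text and collect interactive examples.
/// Assumption: an example ends at a blank line, a docstring delimiter,
/// the next `>>>` prompt, or the end of the input.
fn extract_examples(source: &str) -> Vec<ExtractedExample> {
    let mut examples = Vec::new();
    let mut current: Option<ExtractedExample> = None;

    for raw in source.lines() {
        let line = raw.trim_start();
        if let Some(code) = line.strip_prefix(">>>") {
            // A new prompt closes any example still in progress.
            if let Some(done) = current.take() {
                examples.push(done);
            }
            current = Some(ExtractedExample {
                input: code.trim_start().to_string(),
                expected: String::new(),
            });
        } else if let Some(mut ex) = current.take() {
            // `...` continues the input only before any output has been seen.
            let continues_input =
                ex.expected.is_empty() && (line == "..." || line.starts_with("... "));
            if continues_input {
                let rest = &line[3..];
                ex.input.push('\n');
                ex.input.push_str(rest.strip_prefix(' ').unwrap_or(rest));
                current = Some(ex);
            } else if line.is_empty() || line.starts_with("\"\"\"") || line.starts_with("'''") {
                // Blank line or docstring delimiter: the example is complete.
                examples.push(ex);
            } else {
                // Anything else is treated as expected output.
                if !ex.expected.is_empty() {
                    ex.expected.push('\n');
                }
                ex.expected.push_str(line);
                current = Some(ex);
            }
        }
    }
    // Flush an example that runs to the end of the input.
    if let Some(done) = current.take() {
        examples.push(done);
    }
    examples
}
```

A real parser would also need to handle cases this sketch ignores, such as prose that follows expected output without a blank line, or docstrings that close on the same line as the last output.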
Structs§
- DocTest - A single extracted Python doctest.
- DocTestCorpus - A corpus of extracted doctests from a Python source.
- DocTestParser - Parser for extracting Python doctests from source files.
Functions§
- is_prose_continuation - Returns true if `line` looks like a continuation of a prose paragraph.
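A check of this kind is presumably useful when prose follows a doctest's expected output without a blank line in between, so the parser has to decide where the output stops. The actual rules used by is_prose_continuation are not shown here. The sketch below is one plausible heuristic under a different, hypothetical name (`looks_like_prose_continuation`): it assumes that a non-empty line which is not a prompt and which starts with a lowercase letter is continuing a sentence.

```rust
/// Hypothetical heuristic, not the crate's `is_prose_continuation`.
/// Assumption: prose continuations start with a lowercase letter and are
/// neither blank nor doctest prompt lines.
fn looks_like_prose_continuation(line: &str) -> bool {
    let trimmed = line.trim();
    // Blank lines and `>>>` / `...` prompt lines are never prose.
    if trimmed.is_empty() || trimmed.starts_with(">>>") || trimmed.starts_with("...") {
        return false;
    }
    // A leading lowercase letter suggests the middle of a sentence.
    trimmed.chars().next().map_or(false, |c| c.is_lowercase())
}
```

This heuristic misclassifies expected output that itself begins with a lowercase word, so a production version would likely need more context than a single line.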