three-dcf-core 0.2.0

Document-to-dataset encoding library for LLM training data preparation. Converts PDFs, Markdown, HTML into structured formats optimized for machine learning.

The crate currently publishes very little structured metadata about its features. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 4 feature flags, 1 of them enabled by default.

default

text (default)

The text flag does not enable any additional features.

full

ocr

pdfium
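
The flags above would normally be selected from a downstream crate's Cargo.toml. The fragment below is a minimal sketch, assuming only the flag names and the 0.2.0 version listed on this page. What full, ocr, and pdfium actually pull in is not documented here, so the combinations shown are illustrative and should be checked against the crate's own Cargo.toml.

```toml
[dependencies]
# Default build: only the `text` flag is enabled.
three-dcf-core = "0.2.0"

# Alternative (use instead of the line above): keep the defaults and
# opt into additional flags. The choice of flags here is an assumption.
# three-dcf-core = { version = "0.2.0", features = ["ocr", "pdfium"] }

# Alternative: turn off the default set and list every flag explicitly.
# three-dcf-core = { version = "0.2.0", default-features = false, features = ["text", "ocr"] }
```

Only one of the three dependency lines can be active at a time, since Cargo rejects duplicate keys in the same table.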