TFORM.IO
A Rust crate that cleans and converts large, poorly formatted text into well-structured Markdown or HTML.
Designed for streaming (line-by-line) processing of up to hundreds of megabytes of text, TFORM.IO removes extra spaces, merges paragraphs, detects headings/lists/code blocks, and lets you override defaults with a simple configuration file.
Features
Streaming Support
Process very large text files or streams without loading them entirely into memory.
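A minimal sketch of the streaming idea, using only the standard library. It does not show TFORM.IO's actual API: `collapse_spaces` and `process_stream` are hypothetical names introduced for illustration. The loop reads one line at a time from any `BufRead` source and writes the cleaned line immediately, so memory use is bounded by the longest line rather than by the size of the input.

```rust
use std::io::{self, BufRead, BufReader, Write};

/// Hypothetical helper (not TFORM.IO's API): collapse runs of whitespace
/// into single spaces and drop leading/trailing whitespace.
fn collapse_spaces(line: &str) -> String {
    line.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hypothetical streaming loop: read `input` line by line, write each
/// cleaned line to `output`, and return the number of lines processed.
fn process_stream<R: BufRead, W: Write>(input: R, mut output: W) -> io::Result<usize> {
    let mut count = 0;
    for line in input.lines() {
        let line = line?; // propagate I/O or UTF-8 errors
        writeln!(output, "{}", collapse_spaces(&line))?;
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // An in-memory reader stands in for a large file or stdin.
    let input = BufReader::new("  Hello    world  \nsecond\t\tline\n".as_bytes());
    let mut out = Vec::new();
    let n = process_stream(input, &mut out)?;
    print!("{}", String::from_utf8_lossy(&out));
    eprintln!("processed {} lines", n);
    Ok(())
}
```

Swapping the in-memory reader for `BufReader::new(File::open(path)?)` or `io::stdin().lock()` keeps the same loop. A real formatter would also need to skip whitespace collapsing inside code blocks, which this sketch does not do.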
Automatic Markdown/HTML Conversion
- Headings (lines starting with #, ##, etc.)
- Bullet lists (lines starting with -, +, or *)
- Code blocks (triple backticks)
- Paragraph separation on blank lines
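The detection rules listed above can be sketched as a per-line classifier. This is an illustration under stated assumptions, not the crate's implementation: the `LineKind` enum and `classify` function are hypothetical, and requiring a space after the `#` or bullet marker is an assumption borrowed from common Markdown practice.

```rust
/// Hypothetical line categories; TFORM.IO's internal types may differ.
#[derive(Debug, PartialEq)]
enum LineKind<'a> {
    Heading { level: usize, text: &'a str },
    Bullet(&'a str),
    CodeFence,
    Blank,
    Text(&'a str),
}

/// Classify a single line according to the rules listed above.
fn classify<'a>(line: &'a str) -> LineKind<'a> {
    let trimmed = line.trim();
    if trimmed.is_empty() {
        return LineKind::Blank; // blank lines separate paragraphs
    }
    if trimmed.starts_with("```") {
        return LineKind::CodeFence; // opens or closes a code block
    }
    // Assumption: 1 to 6 leading '#' characters followed by a space.
    let level = trimmed.chars().take_while(|&c| c == '#').count();
    if (1..=6).contains(&level) {
        if let Some(text) = trimmed[level..].strip_prefix(' ') {
            return LineKind::Heading { level, text: text.trim() };
        }
    }
    // Assumption: a bullet marker must be followed by a space.
    for marker in ["- ", "+ ", "* "].iter() {
        if let Some(text) = trimmed.strip_prefix(*marker) {
            return LineKind::Bullet(text.trim());
        }
    }
    LineKind::Text(trimmed)
}

fn main() {
    for line in ["# Title", "- item", "```", "", "plain text"].iter() {
        println!("{:?} -> {:?}", line, classify(line));
    }
}
```

A full formatter would track whether it is currently inside a code fence and pass those lines through untouched; the classifier here only looks at one line in isolation.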
Configurable
- Enable/disable headings, list detection, or space-trimming.
- Define custom regex patterns (future expansion).
- Load config from TOML or JSON files.
High Performance
Written in Rust to handle up to 512 MB of input efficiently.
Installation
Add TFORM.IO to your Cargo.toml:
[]
= "0.1.0"
Then run:
Usage
Here’s a minimal example of using TFORM.IO:
use ;
use Error;
Compile and run:
Configuration
By default, TFORM.IO uses:
= true
= true
= true
= []
You can override these defaults by creating a tform_config.toml file or an equivalent JSON file. For example:
# tform_config.toml
= true
= false
= true
= ["(?i)todo"]
Then load it:
let config = from_file.unwrap_or_default;
let formatter = new;
If detect_headings = false, # Some Text is treated as normal paragraph text instead of a heading.
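To make the effect of that flag concrete, here is a small self-contained sketch. Only `detect_headings` is named in this README; the `Config` struct layout, the `detect_lists` and `trim_spaces` field names, and the `render_line` function are assumptions made for illustration and may not match the crate's real types.

```rust
/// Hypothetical configuration. Only `detect_headings` appears in the text
/// above; the other field names are assumed for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    detect_headings: bool,
    detect_lists: bool,
    trim_spaces: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config { detect_headings: true, detect_lists: true, trim_spaces: true }
    }
}

/// Hypothetical renderer: turn one line into an HTML fragment, honouring
/// the flags in `config`. No HTML escaping is performed in this sketch.
fn render_line(line: &str, config: &Config) -> String {
    let text = if config.trim_spaces { line.trim() } else { line };
    if config.detect_headings {
        let level = text.chars().take_while(|&c| c == '#').count();
        if (1..=6).contains(&level) {
            if let Some(title) = text[level..].strip_prefix(' ') {
                return format!("<h{0}>{1}</h{0}>", level, title.trim());
            }
        }
    }
    if config.detect_lists {
        if let Some(item) = text.strip_prefix("- ") {
            return format!("<li>{}</li>", item);
        }
    }
    // Anything not recognised falls back to paragraph text.
    format!("<p>{}</p>", text)
}

fn main() {
    let defaults = Config::default();
    let no_headings = Config { detect_headings: false, ..Config::default() };
    println!("{}", render_line("# Some Text", &defaults));
    println!("{}", render_line("# Some Text", &no_headings));
}
```

With the defaults the line becomes a heading element; with `detect_headings` switched off the same input falls through to the paragraph branch, `#` included.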
Examples
TFORM.IO includes example programs under the examples/ folder. You can run them with:
Basic Formatting
Custom Rules
Streaming
Each example demonstrates different aspects of TFORM.IO, like loading a config, processing large files line-by-line, or basic text transformations.
Testing
We have both unit tests (within modules) and integration tests (in tests/integration_tests.rs). Run them all:
This validates:
- Heading detection (HTML & Markdown)
- List detection
- Code block handling via triple backticks
- Custom config usage (e.g., disabling headings)
Contributing
- Fork the repository and clone it locally.
- Create a branch for your feature or bug fix.
- Write tests that cover your changes.
- Submit a Pull Request on GitHub with a clear description of your work.
We welcome all suggestions and improvements!
License
This project is licensed under the MIT License. You’re free to use, modify, and distribute this software under its terms.
Enjoy TFORM.IO! It converts jumbled text into neat Markdown or HTML, which makes it a good fit for documentation, PDF generation, or any structured text workflow. If you have questions or feedback, feel free to open an issue or submit a pull request.