boilerpipe 0.3.0

Library for text extraction from HTML documents
# Boilerpipe

This is a Rust port of the [Golang port](https://github.com/jlubawy/go-boilerpipe) of the excellent [Java library](https://github.com/kohlschutter/boilerpipe) `boilerpipe`, which strips boilerplate from HTML documents and extracts their text content.

This library implements only the Article Extractor and extracts only text content (no images, links, etc.).