boilerpipe 0.1.0

Library for text extraction from HTML documents

Boilerpipe

This is a Rust port of the Go port of the excellent Java library boilerpipe, which strips boilerplate from HTML documents and extracts their text content.

This library implements only the Article Extractor and extracts only text content (no images, links, etc.).