Crate trek_rs

Trek - A modern web content extraction library

Trek removes clutter from web pages and extracts clean, readable content. It’s designed as a modern alternative to Mozilla Readability with enhanced features like mobile-aware extraction and consistent HTML standardization.
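
To illustrate what "removing clutter" can mean in practice, here is a minimal, self-contained sketch of a Readability-style heuristic: score each candidate block by how much text it holds, discounted by how much of that text is link text. This is not Trek's implementation, and it does not use the trek_rs API. The `Block` type and the `link_density`, `score`, and `best_block` functions are hypothetical names introduced only for this example, and the character counts are made-up inputs.

```rust
/// Hypothetical candidate block (NOT a trek_rs type); counts are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    name: &'static str,
    text_chars: usize, // visible text characters in the block
    link_chars: usize, // of those, characters that sit inside links
}

/// Fraction of the block's text that is link text (0.0 for an empty block).
fn link_density(b: &Block) -> f64 {
    if b.text_chars == 0 {
        0.0
    } else {
        b.link_chars as f64 / b.text_chars as f64
    }
}

/// Assumed heuristic: long text scores high, link-heavy text is discounted.
fn score(b: &Block) -> f64 {
    b.text_chars as f64 * (1.0 - link_density(b))
}

/// Return the highest-scoring block, or None if there are no candidates.
fn best_block(blocks: &[Block]) -> Option<&Block> {
    blocks.iter().max_by(|a, b| score(a).total_cmp(&score(b)))
}

fn main() {
    let blocks = [
        Block { name: "nav", text_chars: 200, link_chars: 190 },
        Block { name: "article", text_chars: 3000, link_chars: 150 },
        Block { name: "footer", text_chars: 400, link_chars: 300 },
    ];
    // The long, link-poor block wins over navigation and footer clutter.
    if let Some(b) = best_block(&blocks) {
        println!("main content candidate: {}", b.name);
    }
}
```

With these inputs the navigation and footer blocks are penalized for being mostly links, so the article block is selected. A real extractor would combine several such signals; the `scoring` module listed below is where Trek documents its own algorithm.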

Re-exports

pub use crate::types::MetaTagItem;
pub use crate::types::TrekOptions;
pub use crate::types::TrekResponse;

Modules

constants
Constants used throughout Trek
elements
Element processors for different content types
error
Error types for Trek
extractor
Site-specific content extractors
extractors
Site-specific extractors module
html_to_text
Convert HTML to readable plain text while preserving structure
metadata
Metadata extraction functionality
scoring
Content scoring algorithm for Trek
standardize
HTML standardization functionality
types
Type definitions for Trek
utils
Utility functions for Trek

Structs

CollectedData
Trek
Main Trek struct for content extraction