Trek - A modern web content extraction library
Trek removes clutter from web pages and extracts clean, readable content. It’s designed as a modern alternative to Mozilla Readability with enhanced features like mobile-aware extraction and consistent HTML standardization.
Re-exports§
pub use crate::types::MetaTagItem;
pub use crate::types::TrekOptions;
pub use crate::types::TrekResponse;
Modules§
- constants
- Constants used throughout Trek
- elements
- Element processors for different content types
- error
- Error types for Trek
- extractor
- Site-specific content extractors
- extractors
- Site-specific extractors module
- html_to_text
- Convert HTML to readable plain text while preserving structure
- metadata
- Metadata extraction functionality
- scoring
- Content scoring algorithm for Trek
- standardize
- HTML standardization functionality
- types
- Type definitions for Trek
- utils
- Utility functions for Trek
Structs§
- CollectedData
- Trek
- Main Trek struct for content extraction