§Webpage Quality Analyzer
Version: 1.0.2
A high-performance Rust crate for analyzing webpage quality, with three usage levels: simple functions (Level 1), a builder (Level 2), and a config file (Level 3).
What’s New in 1.0.2:
- Fixed MetricEquals penalty trigger implementation
- Product profile reverted to e-commerce focus (breaking change - use “general” for software)
- Login page validation now stricter (Forms category weight: 40%)
- Homepage and General profiles more balanced with new bonuses
- Enhanced minimum baseline scoring with better documentation
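To illustrate what a category weight such as the 40% Forms weight means for a final score, here is a minimal sketch of a weighted average over category scores. It does not use this crate: `CategoryScore`, `weighted_score`, the category names other than Forms, and every number except the 40% weight are hypothetical, and the crate's real scoring (penalties, bonuses, baselines) is likely more involved.

```rust
/// Hypothetical per-category result; not a type from this crate.
struct CategoryScore {
    name: &'static str,
    /// Category score on an assumed 0..=100 scale.
    score: f64,
    /// Relative weight of the category (0.40 means 40%).
    weight: f64,
}

/// Weighted average of the category scores, normalized by the total weight.
/// Returns 0.0 when there are no categories or all weights are zero.
fn weighted_score(categories: &[CategoryScore]) -> f64 {
    let total_weight: f64 = categories.iter().map(|c| c.weight).sum();
    if total_weight == 0.0 {
        return 0.0;
    }
    let weighted_sum: f64 = categories.iter().map(|c| c.score * c.weight).sum();
    weighted_sum / total_weight
}

fn main() {
    // Assumed category mix for a login page; only the 40% Forms weight
    // comes from the release notes above.
    let login_page = [
        CategoryScore { name: "forms", score: 50.0, weight: 0.40 },
        CategoryScore { name: "content", score: 90.0, weight: 0.35 },
        CategoryScore { name: "technical", score: 80.0, weight: 0.25 },
    ];
    for c in &login_page {
        println!("{:<10} score={:>5.1} weight={:.2}", c.name, c.score, c.weight);
    }
    println!("overall: {:.1}", weighted_score(&login_page));
}
```

Under this model, a heavier Forms weight means a weak form pulls the overall score down more than an equally weak category with a smaller weight would.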
§Quick Start (Level 1 - Simple)
use webpage_quality_analyzer::{analyze, analyze_with_profile};
// Analyze by fetching URL
let report = analyze("https://example.com", None).await?;
// Analyze provided HTML
let html = "<html><head><title>Test</title></head><body><p>Content</p></body></html>";
let report = analyze("https://example.com", Some(html)).await?;
// Use specific profile
let report = analyze_with_profile("https://example.com", None, "news").await?;
§Custom Configuration (Level 2 - Builder)
use webpage_quality_analyzer::{Analyzer, async_runtime::DefaultRuntime};
let analyzer = Analyzer::<DefaultRuntime>::builder()
.with_profile_name("content_article")?
.enable_linkcheck(true)
.enable_nlp(true)
.build()?;
let report = analyzer.run("https://example.com", None).await?;
§Advanced Setup (Level 3 - Config File)
use webpage_quality_analyzer::from_config_file;
let analyzer = from_config_file("my-config.yaml")?;
// Use analyzer.run() for analysis
Re-exports§
pub use models::models::AnalysisMode;
pub use models::models::AnalyzeError;
pub use models::models::AppliedBonusInfo;
pub use models::models::AppliedPenaltyInfo;
pub use models::models::ContentChunk;
pub use models::models::ContentChunkType;
pub use models::models::ContentComplianceInfo;
pub use models::models::ExtractedMetadata;
pub use models::models::FormInfo;
pub use models::models::Heading;
pub use models::models::ImageInfo;
pub use models::models::LinkInfo;
pub use models::models::MediaInfo;
pub use models::models::PageMetrics;
pub use models::models::PageMetricsFromHTML;
pub use models::models::PageMetricsFullFetch;
pub use models::models::PageQualityReport;
pub use models::models::Phase3ScoringDetails;
pub use models::models::ProcessedDocument;
pub use models::models::QualityBand;
pub use models::models::Result;
pub use models::models::StructuredData;
pub use content_extraction::create_content_extractor;
pub use content_extraction::create_heuristic_extractor;
pub use content_extraction::create_readability_extractor;
pub use content_extraction::ContentExtractor;
pub use content_extraction::ContentScore;
pub use content_extraction::ExtractionStrategy;
pub use content_extraction::HeuristicExtractor;
pub use content_extraction::MultiStrategyExtractor;
pub use content_extraction::ReadabilityExtractor;
pub use extractor::create_extractor;
pub use extractor::DefaultExtractor;
pub use extractor::Extractor;
pub use parser::parse_html;
pub use parser::HtmlParser;
pub use parser::Parser;
pub use config::config_manager::ConfigFormat;
pub use config::config_manager::ConfigManager;
pub use config::config_models::PluginConfig;
pub use config::config_models::ProfileConfig;
pub use config::config_models::Verbosity;
pub use config::enhanced_models::EnhancedScoringProfile;
pub use config::profile_modifier::ProfileModifier;
pub use config::templates::ProfileTemplates;
pub use content::create_basic_content_processor;
pub use content::create_content_processor;
pub use content::ContentProcessor;
pub use content::DefaultContentProcessor;
pub use fetcher::FetchOptions;
pub use fetcher::RetryConfig;
pub use fetcher::WebFetcher;
pub use scoring::ContentValidator;
pub use scoring::Phase3ScoringSystem;
pub use scoring::ProfileAwareScorer;
pub use scoring::ProfileCompiler;
pub use utils::json_optimizer::FieldSelector;
pub use utils::json_optimizer::FieldSelectorBuilder;
pub use utils::json_optimizer::OptimizedSerializer;
pub use utils::json_optimizer::SerializationOptions;
Modules§
- analysis
- Analysis module for webpage quality assessment
- async_runtime
- Async runtime abstraction layer
- config
- constants
- String constants used throughout the application. Centralizing these avoids repeated allocations.
- content
- content_extraction
- Positive Content Selection System
- extractor
- fetcher
- Web fetching module for URL-based analysis
- metrics
- models
- parser
- scoring
- test_fixtures
- Centralized test fixtures and HTML constants
- utils
Structs§
- Analyzer
- AnalyzerBuilder
- Builder for configuring the analyzer
Constants§
- VERSION
- Library version
Functions§
- analyze
- Level 1 - Simple Usage: Primary entry point for webpage quality analysis
- analyze_batch_high_performance
- High-Performance Batch Processing: Analyze multiple URLs concurrently
- analyze_with_profile
- Level 1 - Simple Usage: Profile-specific analysis for different content types
- from_config_file
- Level 3 - Advanced Setup: Create analyzer from configuration file
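The signature of the batch function is not shown on this page, so the following is only a sketch of the general idea of analyzing several URLs concurrently while keeping results in input order. It uses standard-library scoped threads in place of the crate's async runtime. `fake_analyze` and `analyze_batch` are hypothetical stand-ins, not functions from this crate, and the "report" here is just a number.

```rust
use std::thread;

/// Stand-in for a per-URL analysis; NOT this crate's `analyze`.
/// It returns the URL length as a dummy "report" and rejects non-HTTPS URLs.
fn fake_analyze(url: &str) -> Result<usize, String> {
    if url.starts_with("https://") {
        Ok(url.len())
    } else {
        Err(format!("unsupported URL: {url}"))
    }
}

/// Splits `urls` into at most `max_workers` contiguous chunks, processes each
/// chunk on its own scoped thread, and returns the results in input order.
fn analyze_batch(urls: &[&str], max_workers: usize) -> Vec<Result<usize, String>> {
    let workers = max_workers.max(1);
    // Ceiling division; at least 1 so that `chunks` never receives 0.
    let chunk_size = ((urls.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn one worker per chunk; each borrows its slice of `urls`.
        let handles: Vec<_> = urls
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|url| fake_analyze(url)).collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order preserves the original URL order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let urls = ["https://example.com", "ftp://example.com", "https://example.org"];
    for (url, result) in urls.iter().zip(analyze_batch(&urls, 2)) {
        println!("{url}: {result:?}");
    }
}
```

Returning one `Result` per URL, rather than failing the whole batch on the first error, lets a caller keep the successful reports and retry only the failures. Whether the crate's batch function behaves this way is not stated here.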