§Webpage Quality Analyzer
Version: 1.0.2
A high-performance Rust crate for analyzing webpage quality, with three usage levels: simple functions (Level 1), a builder (Level 2), and a config file (Level 3).
What’s New in 1.0.2:
- Fixed MetricEquals penalty trigger implementation
- Product profile reverted to e-commerce focus (breaking change - use “general” for software)
- Login page validation now stricter (Forms category weight: 40%)
- Homepage and General profiles more balanced with new bonuses
- Enhanced minimum baseline scoring with better documentation
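To make the 40% Forms weight concrete, here is a minimal sketch of a weighted category score. It assumes the overall score is a weight-normalized average of per-category scores, which this page does not confirm. The function `weighted_score`, the categories other than Forms, and all numbers are hypothetical and are not part of the crate.

```rust
/// Hypothetical helper: weight-normalized average of (name, score, weight) entries.
/// Scores are on a 0-100 scale; weights need not sum to 1.
fn weighted_score(categories: &[(&str, f64, f64)]) -> f64 {
    let total_weight: f64 = categories.iter().map(|(_, _, w)| w).sum();
    if total_weight == 0.0 {
        return 0.0;
    }
    let weighted_sum: f64 = categories.iter().map(|(_, s, w)| s * w).sum();
    weighted_sum / total_weight
}

fn main() {
    // Illustrative numbers only: Forms carries 40%, as in the login note above;
    // the other categories and their weights are made up for this example.
    let login_page = [
        ("forms", 50.0, 0.40),
        ("content", 90.0, 0.30),
        ("technical", 90.0, 0.30),
    ];
    println!("{:.1}", weighted_score(&login_page));
}
```

Under this assumption, a weak Forms score pulls the total down more than an equally weak score in any lighter category would.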
§Quick Start (Level 1 - Simple)
use webpage_quality_analyzer::{analyze, analyze_with_profile};
// Analyze by fetching URL
let report = analyze("https://example.com", None).await?;
// Analyze provided HTML
let html = "<html><head><title>Test</title></head><body><p>Content</p></body></html>";
let report = analyze("https://example.com", Some(html)).await?;
// Use specific profile
let report = analyze_with_profile("https://example.com", None, "news").await?;
§Custom Configuration (Level 2 - Builder)
use webpage_quality_analyzer::{Analyzer, async_runtime::DefaultRuntime};
let analyzer = Analyzer::<DefaultRuntime>::builder()
    .with_profile_name("content_article")?
    .enable_linkcheck(true)
    .enable_nlp(true)
    .build()?;
let report = analyzer.run("https://example.com", None).await?;
§Advanced Setup (Level 3 - Config File)
use webpage_quality_analyzer::from_config_file;
let analyzer = from_config_file("my-config.yaml")?;
// Use analyzer.run() for analysis
Re-exports§
pub use models::models::AnalysisMode;
pub use models::models::AnalyzeError;
pub use models::models::AppliedBonusInfo;
pub use models::models::AppliedPenaltyInfo;
pub use models::models::ContentChunk;
pub use models::models::ContentChunkType;
pub use models::models::ContentComplianceInfo;
pub use models::models::ExtractedMetadata;
pub use models::models::FormInfo;
pub use models::models::Heading;
pub use models::models::ImageInfo;
pub use models::models::LinkInfo;
pub use models::models::MediaInfo;
pub use models::models::PageMetrics;
pub use models::models::PageMetricsFromHTML;
pub use models::models::PageMetricsFullFetch;
pub use models::models::PageQualityReport;
pub use models::models::Phase3ScoringDetails;
pub use models::models::ProcessedDocument;
pub use models::models::QualityBand;
pub use models::models::Result;
pub use models::models::StructuredData;
pub use content_extraction::create_content_extractor;
pub use content_extraction::create_heuristic_extractor;
pub use content_extraction::create_readability_extractor;
pub use content_extraction::ContentExtractor;
pub use content_extraction::ContentScore;
pub use content_extraction::ExtractionStrategy;
pub use content_extraction::HeuristicExtractor;
pub use content_extraction::MultiStrategyExtractor;
pub use content_extraction::ReadabilityExtractor;
pub use extractor::create_extractor;
pub use extractor::DefaultExtractor;
pub use extractor::Extractor;
pub use parser::parse_html;
pub use parser::HtmlParser;
pub use parser::Parser;
pub use config::config_manager::ConfigFormat;
pub use config::config_manager::ConfigManager;
pub use config::config_models::PluginConfig;
pub use config::config_models::ProfileConfig;
pub use config::config_models::Verbosity;
pub use config::enhanced_models::EnhancedScoringProfile;
pub use config::profile_modifier::ProfileModifier;
pub use config::templates::ProfileTemplates;
pub use content::create_basic_content_processor;
pub use content::create_content_processor;
pub use content::ContentProcessor;
pub use content::DefaultContentProcessor;
pub use fetcher::FetchOptions;
pub use fetcher::RetryConfig;
pub use fetcher::WebFetcher;
pub use scoring::ContentValidator;
pub use scoring::Phase3ScoringSystem;
pub use scoring::ProfileAwareScorer;
pub use scoring::ProfileCompiler;
pub use utils::json_optimizer::FieldSelector;
pub use utils::json_optimizer::FieldSelectorBuilder;
pub use utils::json_optimizer::OptimizedSerializer;
pub use utils::json_optimizer::SerializationOptions;
Modules§
- analysis
- Analysis module for webpage quality assessment
- async_runtime
- Async runtime abstraction layer
- config
- constants
- String constants used throughout the application. Centralizing these avoids repeated allocations.
- content
- content_extraction
- Positive Content Selection System
- extractor
- fetcher
- Web fetching module for URL-based analysis
- metrics
- models
- parser
- scoring
- test_fixtures
- Centralized test fixtures and HTML constants
- utils
Structs§
- Analyzer
- AnalyzerBuilder
- Builder for configuring the analyzer
Constants§
- VERSION
- Library version
Functions§
- analyze
- Level 1 - Simple Usage: Primary entry point for webpage quality analysis
- analyze_batch_high_performance
- High-Performance Batch Processing: Analyze multiple URLs concurrently
- analyze_with_profile
- Level 1 - Simple Usage: Profile-specific analysis for different content types
- from_config_file
- Level 3 - Advanced Setup: Create analyzer from configuration file
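
The signature of the batch entry point is not shown on this page. As a rough illustration of what analyzing multiple URLs concurrently involves, the sketch below fans work out over scoped threads using only the standard library. `score_stub` and `analyze_all` are hypothetical stand-ins, not part of webpage_quality_analyzer, and since the crate's own examples are async, its implementation presumably differs.

```rust
use std::thread;

/// Hypothetical stand-in for a real analysis call; returns a fake score
/// derived from the URL length so the example is deterministic.
fn score_stub(url: &str) -> usize {
    url.len()
}

/// Run `score_stub` on every URL concurrently, preserving input order.
fn analyze_all(urls: &[&str]) -> Vec<(String, usize)> {
    thread::scope(|scope| {
        // Spawn one scoped thread per URL; scoped threads may borrow `urls`.
        let handles: Vec<_> = urls
            .iter()
            .map(|&url| scope.spawn(move || (url.to_string(), score_stub(url))))
            .collect();
        // Join in spawn order so results line up with the input.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let urls = ["https://example.com", "https://example.org/docs"];
    for (url, score) in analyze_all(&urls) {
        println!("{url}: {score}");
    }
}
```

One thread per URL keeps the sketch short; a real batch run over many URLs would normally cap concurrency.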