Webpage Quality Analyzer
High-performance webpage quality analyzer with 115 comprehensive metrics. Analyze web pages for SEO, content quality, technical standards, accessibility, and more - all in milliseconds.
Features
- 115 Comprehensive Metrics (92 HTML-based + 23 network-based) across 7 major categories (Content, SEO, Technical, Accessibility, and more)
- 8 Built-in Profiles optimized for different page types (news, blog, product, portfolio, etc.)
- Multi-Platform Support: Native Rust, WebAssembly (browser/Node.js), C++ FFI
- High Performance: 180+ pages/second batch processing for HTML-only analysis (50+ pages/second with network), using parallel analysis
- Advanced Customization: Metric weights, thresholds, penalties, bonuses, and field selectors
- Profile-Aware Scoring: Phase 3-6 implementation with category-based weighted scoring
- Output Optimization: Field selection with up to 98.8% size reduction
- Production Ready: Battle-tested, 40+ test files, extensive documentation
Installation
Add this to your Cargo.toml:
[dependencies]
= "1.0"
Or use cargo:
Quick Start
Level 1: Simple Usage
use ;
async
Level 2: Builder Pattern
use Analyzer;
async
Level 3: Advanced Configuration
use ;
async
Analyzing HTML Directly
let html = r#"
<!DOCTYPE html>
<html lang="en">
<head>
<title>Sample Page</title>
<meta name="description" content="A sample page">
</head>
<body>
<h1>Welcome</h1>
<p>This is a test page with some content.</p>
</body>
</html>
"#;
let report = analyze.await?;
println!;
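HTML-only metrics such as word count need nothing beyond the markup itself. As a rough illustration of that idea (not the analyzer's actual implementation), the sketch below strips tags naively and counts the remaining words. `strip_tags` and `word_count` are hypothetical helper names, and a real parser would also handle comments, scripts, and entities.

```rust
/// Replace every `<...>` tag with a space and return the remaining text.
/// Naive on purpose: no handling of comments, <script> bodies, or entities.
fn strip_tags(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => {
                in_tag = true;
                text.push(' '); // keep words on either side of a tag apart
            }
            '>' if in_tag => in_tag = false,
            _ if !in_tag => text.push(ch),
            _ => {} // characters inside a tag are dropped
        }
    }
    text
}

/// Count whitespace-separated words in the text left after tag stripping.
fn word_count(html: &str) -> usize {
    strip_tags(html).split_whitespace().count()
}
```

Note that this counts every text node, so the `<title>` text of the sample page above would be included as well.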
Metrics Categories
All 115 metrics (92 HTML-based + 23 network-based):
Major Categories (7 total)
- Content (11 metrics) - Word count, readability (Flesch-Kincaid), text quality, content density
- SEO (9 metrics) - Meta tags, Open Graph, structured data, canonical URLs
- Technical (6 metrics) - HTML size, scripts, styles, validation
- Semantic (4 metrics) - Heading hierarchy, heading length, heading distribution
- Accessibility (7 metrics) - WCAG compliance, ARIA labels, contrast, alt text
- Network (23 metrics) - Performance (LCP, FCP), Security (HTTPS, CSP), Analytics
- Miscellaneous (55 metrics) - Links (8), Media (8), Forms (6), Structure (5), UX (5), Mobile (4), Branding (4), Structured Data (4), Business (3), Authority (3), Error (3), Internationalization (2)
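To make the readability metric concrete, here is a minimal sketch of the standard Flesch-Kincaid grade-level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The syllable counter is a simple vowel-group heuristic chosen for this example, and both function names are hypothetical. How the analyzer itself tokenizes text is not specified here.

```rust
/// Rough syllable estimate: count groups of consecutive vowels,
/// with a minimum of one per word. A heuristic, not a dictionary lookup.
fn estimate_syllables(word: &str) -> usize {
    let mut count = 0;
    let mut prev_vowel = false;
    for ch in word.chars().flat_map(|c| c.to_lowercase()) {
        let is_vowel = matches!(ch, 'a' | 'e' | 'i' | 'o' | 'u' | 'y');
        if is_vowel && !prev_vowel {
            count += 1;
        }
        prev_vowel = is_vowel;
    }
    count.max(1)
}

/// Flesch-Kincaid grade level:
/// 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59.
/// Returns None when the text has no sentences or no words.
fn flesch_kincaid_grade(text: &str) -> Option<f64> {
    let sentences = text
        .split(|c: char| matches!(c, '.' | '!' | '?'))
        .filter(|s| !s.trim().is_empty())
        .count();
    let words: Vec<&str> = text
        .split_whitespace()
        .filter(|w| w.chars().any(char::is_alphabetic))
        .collect();
    if sentences == 0 || words.is_empty() {
        return None;
    }
    let syllables: usize = words.iter().map(|w| estimate_syllables(w)).sum();
    let word_total = words.len() as f64;
    Some(0.39 * (word_total / sentences as f64) + 11.8 * (syllables as f64 / word_total) - 15.59)
}
```

Very simple text can produce a negative grade. Two three-word sentences of one-syllable words give 0.39 × 3 + 11.8 × 1 − 15.59 = −2.62.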
Metric Distribution
- 92 metrics (80%) - HTML-only, no network required (WASM-compatible)
- 23 metrics (20%) - Network-required (when fetching URLs, server-side only)
See: Complete metrics breakdown
Available Profiles
Choose the right profile for your page type (8 built-in profiles):
| Profile | Best For | Content Weight | Key Focus |
|---|---|---|---|
| `content_article` | Long-form articles | 80% | Word count, structure, comprehensiveness |
| `blog` | Blog posts | 75% | Content quality, engagement, readability |
| `news` | News articles | 40% | Content freshness, readability, SEO (30%) |
| `general` | Any webpage | 35% | Balanced scoring across all categories |
| `homepage` | Landing pages | 25% | Navigation, structure, balanced (25% each) |
| `product` | Product pages | 20% | Media (35%), SEO (25%), product details |
| `portfolio` | Creative showcases | 15% | Media (50%), visual content |
| `login_page` | Authentication | 10% | Technical (50%), accessibility (20%), security |
Profile Customization: Each profile includes:
- Category weights (Content, SEO, Technical, Semantic, Accessibility)
- Content expectations (word count, headings, images)
- Metric overrides (custom weights and thresholds)
- Penalties (severe, moderate, light)
- Bonuses (excellence, achievement, synergy)
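One way to picture category-based weighted scoring is a weighted mean of per-category scores. This is a minimal sketch under that assumption, not the crate's scoring code. The function name and the tuple layout `(name, score, weight)` are hypothetical, and only the 75% content weight in the usage note comes from the table above.

```rust
/// Weighted mean of category scores on a 0-100 scale.
/// Each entry is (category name, score, weight). Weights are normalized,
/// so they do not have to sum to 1. Returns None if total weight is not positive.
fn overall_score(categories: &[(&str, f64, f64)]) -> Option<f64> {
    let total_weight: f64 = categories.iter().map(|(_, _, w)| w).sum();
    if total_weight <= 0.0 {
        return None;
    }
    let weighted_sum: f64 = categories.iter().map(|(_, s, w)| s * w).sum();
    Some(weighted_sum / total_weight)
}
```

For a blog-style weighting with content at 0.75 and everything else lumped into a single 0.25 bucket, a content score of 90 and a remainder score of 60 give 90 × 0.75 + 60 × 0.25 = 82.5.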
Feature Flags
Control optional features via Cargo features:
[dependencies]
= { version = "1.0", features = ["async", "linkcheck", "nlp"] }
Available features:
- `async` (default) - Async runtime with tokio + reqwest
- `readability` (default) - Mozilla Readability content extraction
- `linkcheck` - External link validation
- `nlp` - Language detection and Unicode segmentation
- `grammar` - Grammar checking (via nlprule)
- `wasm` - WebAssembly bindings (mutually exclusive with `async`)
- `ffi` - C FFI for C++ integration
- `cli` - Command-line tool binary
Multi-Platform Support
WebAssembly (Browser/Node.js)
# Build for npm
# Use in JavaScript/TypeScript
import from '@webpage-quality-analyzer/core';
const analyzer = ;
const report = await analyzer.;
console.log;
C++ Integration
CAnalyzer* analyzer = ;
CReport* report = ;
double score = ;
Command-Line Tool
# Download binary from releases
Customization
Custom Metric Weights
let analyzer = builder
.with_profile_name?
.with_metric_weight? // Increase importance
.with_metric_weight? // Double weight
.build?;
Custom Thresholds
let analyzer = builder
.with_profile_name?
.set_metric_threshold?
.build?;
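The exact semantics of a metric threshold are not spelled out here, so the following is only one plausible model. A metric earns full marks inside an optimal range and is scored linearly down to zero toward hard minimum and maximum limits. The `Threshold` struct and `score_against` function are hypothetical names for this sketch.

```rust
/// Hypothetical threshold shape; assumes min <= optimal_min <= optimal_max <= max.
struct Threshold {
    min: f64,
    optimal_min: f64,
    optimal_max: f64,
    max: f64,
}

/// Map a raw metric value to a 0-100 score:
/// 0 at or outside [min, max], 100 inside the optimal range,
/// and a linear ramp in between.
fn score_against(value: f64, t: &Threshold) -> f64 {
    if value <= t.min || value >= t.max {
        0.0
    } else if value < t.optimal_min {
        100.0 * (value - t.min) / (t.optimal_min - t.min)
    } else if value <= t.optimal_max {
        100.0
    } else {
        100.0 * (t.max - value) / (t.max - t.optimal_max)
    }
}
```

With an illustrative word-count threshold of 0 / 300 / 2000 / 5000, a 150-word page scores 50, a 1000-word page scores 100, and a 3500-word page scores 50 again.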
Custom Penalties & Bonuses
use ;
let analyzer = builder
.with_profile_name?
.add_penalty?
.add_bonus_above?
.build?;
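Conceptually, penalties and bonuses are adjustments applied on top of a base score when a metric crosses a limit. The sketch below illustrates that idea with a hypothetical `Adjustment` enum and `apply_adjustments` function. It is an assumption about the general mechanism, not the crate's types, and the point values are made up.

```rust
/// A score adjustment triggered by a metric value (hypothetical shape).
enum Adjustment {
    /// Subtract `points` when the metric falls below `limit`.
    PenaltyBelow { limit: f64, points: f64 },
    /// Add `points` when the metric rises above `limit`.
    BonusAbove { limit: f64, points: f64 },
}

/// Apply every triggered adjustment to `base`, then clamp to 0-100.
fn apply_adjustments(base: f64, metric: f64, adjustments: &[Adjustment]) -> f64 {
    let mut score = base;
    for adjustment in adjustments {
        match adjustment {
            Adjustment::PenaltyBelow { limit, points } if metric < *limit => score -= *points,
            Adjustment::BonusAbove { limit, points } if metric > *limit => score += *points,
            _ => {} // not triggered
        }
    }
    score.clamp(0.0, 100.0)
}
```

Clamping after all adjustments keeps a page with several bonuses from exceeding 100 and a heavily penalized page from going negative.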
Disable Metrics
let analyzer = builder
.with_profile_name?
.disable_metric?
.disable_metric?
.build?;
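A disabled metric should stop influencing the result without dragging the score down. One simple way to get that behaviour, shown here as an assumption and not as the crate's implementation, is to drop disabled metrics from a weighted mean so the remaining weights renormalize. `Metric`, `disable_metric`, and `weighted_mean` are hypothetical names.

```rust
/// A scored metric with its weight (hypothetical shape).
struct Metric {
    name: &'static str,
    score: f64,
    weight: f64,
    enabled: bool,
}

/// Mark every metric with the given name as disabled.
fn disable_metric(metrics: &mut [Metric], name: &str) {
    for metric in metrics.iter_mut().filter(|m| m.name == name) {
        metric.enabled = false;
    }
}

/// Weighted mean over enabled metrics only, so disabling a metric
/// redistributes its share to the others. None if nothing is enabled.
fn weighted_mean(metrics: &[Metric]) -> Option<f64> {
    let active: Vec<&Metric> = metrics.iter().filter(|m| m.enabled && m.weight > 0.0).collect();
    let total_weight: f64 = active.iter().map(|m| m.weight).sum();
    if total_weight <= 0.0 {
        return None;
    }
    Some(active.iter().map(|m| m.score * m.weight).sum::<f64>() / total_weight)
}
```

With scores 80 (weight 2), 60 (weight 1), and 20 (weight 1), the mean is 60. Disabling the 20-point metric raises it to 220 / 3, roughly 73.3.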
Output Customization (Phase 6)
// Full report (default)
let report = analyzer.run.await?;
// Compact JSON (20-30% size reduction)
let compact_json = analyzer.run_compact.await?;
// Minimal output (98.8% size reduction)
let minimal = analyzer.run_with_fields.await?;
// Advanced field selection
use FieldSelector;
let selector = builder
.include_sections
.exclude_section
.build;
let custom = analyzer.run_with_selector.await?;
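Field selection boils down to filtering a report's top-level sections before serialization. The sketch below models a report as a list of `(section name, serialized body)` pairs and applies include and exclude lists. The function name, the data model, and the "empty include list means everything" rule are all assumptions made for illustration.

```rust
/// Keep only the requested sections of a report.
/// An empty `include` list means "include everything"; `exclude` always wins.
fn select_sections(
    report: &[(String, String)],
    include: &[&str],
    exclude: &[&str],
) -> Vec<(String, String)> {
    report
        .iter()
        .filter(|(name, _)| include.is_empty() || include.contains(&name.as_str()))
        .filter(|(name, _)| !exclude.contains(&name.as_str()))
        .cloned()
        .collect()
}

/// Total size in bytes of the section bodies that remain.
fn body_bytes(report: &[(String, String)]) -> usize {
    report.iter().map(|(_, body)| body.len()).sum()
}
```

Comparing `body_bytes` before and after selection gives a quick estimate of how much a given selector shrinks the output.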
Performance
Analysis Speed:
- Single page (HTML-only): <100ms (typical), ~200ms (large docs)
- Single page (with network): ~300-500ms
- Batch processing: 180+ pages/second (HTML-only), 50+ pages/second (with network)
- Memory: Linear scaling, stable across repeated analyses
- Thread-safe: Fully concurrent with `Arc<Semaphore>` control
Output Optimization:
- Full report: 30-50 KB (pretty), 12-18 KB (compact)
- Minimal output: 500 bytes (98.8% reduction)
- Custom fields: 300 bytes (3 fields)
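The size figures above can be cross-checked with a little arithmetic. A 500-byte output that represents a 98.8% reduction implies a full report of about 500 / 0.012 ≈ 41,667 bytes, which sits inside the stated 30-50 KB range for pretty-printed reports. The helper below (a hypothetical name, for illustration) computes the reduction for any pair of sizes.

```rust
/// Percentage by which `reduced_bytes` is smaller than `full_bytes`.
/// Returns None when the full size is zero.
fn reduction_percent(full_bytes: u64, reduced_bytes: u64) -> Option<f64> {
    if full_bytes == 0 {
        return None;
    }
    Some(100.0 * (1.0 - reduced_bytes as f64 / full_bytes as f64))
}
```

Against the two ends of the 30-50 KB range, a 500-byte output works out to roughly 98.3% and 99.0% respectively, so the 98.8% figure corresponds to a mid-sized report.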
// High-performance batch processing
use analyze_batch_high_performance;
let urls = vec!;
let json_results = analyze_batch_high_performance.await?;
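For readers curious how bounded parallel batch analysis can work in general, here is a standard-library sketch. It deliberately swaps the async `Arc<Semaphore>` approach mentioned above for scoped threads that pull work from a shared atomic index, so it is not the crate's implementation. `analyze_batch` and its parameters are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Run `analyze` over `inputs` using at most `workers` threads.
/// Results are returned in input order.
fn analyze_batch<T, R, F>(inputs: &[T], workers: usize, analyze: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let next = AtomicUsize::new(0); // index of the next unclaimed input
    let results: Mutex<Vec<Option<R>>> =
        Mutex::new((0..inputs.len()).map(|_| None).collect());
    let worker_count = workers.max(1).min(inputs.len().max(1));

    thread::scope(|scope| {
        for _ in 0..worker_count {
            scope.spawn(|| loop {
                // Claim one input at a time so fast workers take more items.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= inputs.len() {
                    break;
                }
                let result = analyze(&inputs[i]);
                results.lock().unwrap()[i] = Some(result);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every input index is claimed exactly once"))
        .collect()
}
```

The `workers` cap plays the same role a semaphore permit count would in an async version: it limits how many analyses run at once regardless of batch size.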
Documentation
Testing
License
Dual licensed under MIT OR Apache-2.0. You can choose either license.
Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
Related Packages
- NPM: `@webpage-quality-analyzer/core` - JavaScript/TypeScript (WASM)
- CLI: Download binaries for Linux/Windows/macOS
- C++: Pre-compiled libraries with headers
- Python: Coming soon (PyO3 bindings)
Why Choose This Analyzer?
- Comprehensive: 115 metrics across 7 major categories, from content and SEO to accessibility and network performance
- Fast: Rust-powered performance, 180+ pages/sec batch processing
- Flexible: 8 profiles + full customization of weights, thresholds, penalties, bonuses
- Multi-Platform: Works everywhere - Native Rust, WASM (browser/Node.js), C++ FFI
- Production-Ready: 40+ test files, 279-line test README, extensive documentation
- Modern: profile-aware scoring, output optimization, field selectors
- Optimized: DOM caching, streaming serialization, 98.8% output size reduction
Example Report
Made with ❤️ in Rust | Version 1.0.0 | October 2025