Webpage Quality Analyzer


High-performance webpage quality analyzer with 115 comprehensive metrics. Analyze web pages for SEO, content quality, technical standards, accessibility, and more - all in milliseconds.

🚀 Features

  • 115 Comprehensive Metrics across 7 categories (Content, SEO, Technical, Semantic, Accessibility, Network, Engagement)
  • 9 Pre-configured Profiles optimized for different page types (news, blog, ecommerce, etc.)
  • Multi-Platform Support: Native Rust, WebAssembly (browser/Node.js), C++ FFI, Python bindings (coming soon)
  • High Performance: 180+ pages/second batch processing with parallel analysis
  • Flexible Configuration: Custom profiles, metric weights, penalties, and bonuses
  • Production Ready: Battle-tested, comprehensive test suite, extensive documentation

📦 Installation

Add this to your Cargo.toml:

[dependencies]
webpage_quality_analyzer = "1.0"

Or use cargo:

cargo add webpage_quality_analyzer

🎯 Quick Start

Level 1: Simple Usage

use webpage_quality_analyzer::{analyze, analyze_with_profile};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Analyze with default settings
    let report = analyze("https://example.com", None).await?;
    
    println!("Score: {}/100", report.score);
    println!("Quality: {}", report.verdict);
    println!("Word Count: {}", report.metrics.content_metrics.word_count);
    
    // Analyze with specific profile
    let news_report = analyze_with_profile(
        "https://example.com",
        None,
        "news"
    ).await?;
    
    Ok(())
}

Level 2: Builder Pattern

use webpage_quality_analyzer::Analyzer;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build custom analyzer
    let analyzer = Analyzer::builder()
        .with_profile_name("blog")?
        .with_metric_weight("word_count", 1.5)?
        .disable_metric("grammar_score")?
        .with_timeout_secs(30)?
        .build()?;
    
    let report = analyzer.run("https://example.com", None).await?;
    println!("Custom analysis score: {}", report.score);
    
    Ok(())
}

Level 3: Advanced Configuration

use webpage_quality_analyzer::Analyzer;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load from YAML config file
    let analyzer = Analyzer::from_config_file("config.yaml").await?;
    
    // Batch analysis
    let urls = vec![
        "https://site1.com",
        "https://site2.com",
        "https://site3.com",
    ];
    
    let reports = analyzer.analyze_batch_urls(urls, 5).await?;
    
    for report in reports {
        println!("{}: {}/100", report.url, report.score);
    }
    
    Ok(())
}

Analyzing HTML Directly

let html = r#"
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <title>Sample Page</title>
        <meta name="description" content="A sample page">
    </head>
    <body>
        <h1>Welcome</h1>
        <p>This is a test page with some content.</p>
    </body>
    </html>
"#;

let report = analyze("https://example.com", Some(html.to_string())).await?;
println!("HTML analysis score: {}", report.score);

📊 Metrics Categories

1. Content Metrics (20 metrics)

Word count, paragraph count, sentence complexity, content density, text-to-HTML ratio, content extraction quality, and more.
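To make one of these metrics concrete, here is a minimal, self-contained sketch of a naive text-to-HTML ratio. This is purely illustrative and not the library's implementation, which would also need to skip `<script>`/`<style>` bodies, comments, and HTML entities:

```rust
// Naive text-to-HTML ratio: the share of characters that sit outside tags.
// Illustrative only; a real extractor handles scripts, styles, and entities.
fn text_to_html_ratio(html: &str) -> f64 {
    let mut in_tag = false;
    let text_chars = html
        .chars()
        .filter(|&c| match c {
            '<' => { in_tag = true; false }
            '>' => { in_tag = false; false }
            _ => !in_tag,
        })
        .count();
    text_chars as f64 / html.chars().count().max(1) as f64
}

fn main() {
    let html = "<p>This is a test page with some content.</p>";
    println!("ratio = {:.2}", text_to_html_ratio(html));
}
```

A ratio close to 1.0 suggests content-dense markup; pages dominated by boilerplate markup score much lower.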

2. Technical Metrics (25 metrics)

Title length, meta description, heading structure, HTML validity, semantic elements, images (count, alt text, sizes), links, forms.

3. SEO Metrics (18 metrics)

Meta tags, Open Graph, Twitter Cards, canonical URLs, robots meta, schema.org structured data, sitemap links.

4. Semantic Metrics (15 metrics)

Heading hierarchy, ARIA labels, microdata, RDFa, JSON-LD, semantic HTML5 elements.

5. Accessibility Metrics (12 metrics)

ARIA attributes, roles, image alt text, form labels, color contrast, keyboard navigation.

6. Network Metrics (23 metrics)

Load time, TTFB, resource sizes (HTML, CSS, JS, images), HTTP status, redirects, compression, caching headers.

7. Engagement Metrics (2 metrics)

Interactive elements, CTAs, social sharing buttons.

🎨 Available Profiles

Choose the right profile for your page type:

| Profile | Best For | Key Focus |
|---|---|---|
| general | Any webpage | Balanced scoring across all metrics |
| news | News articles | Content freshness, readability, structure |
| blog | Blog posts | Content quality, engagement, readability |
| ecommerce | Product pages | Conversion elements, images, CTAs |
| content_article | Long-form content | Word count, structure, comprehensiveness |
| product | Product landing pages | Product details, images, specifications |
| portfolio | Portfolio sites | Visual content, project showcases |
| login_page | Login/auth pages | Forms, security, minimal content |
| homepage | Homepage | Navigation, structure, key messages |

โš™๏ธ Feature Flags

Control optional features via Cargo features:

[dependencies]
webpage_quality_analyzer = { version = "1.0", features = ["async", "linkcheck", "nlp"] }

Available features:

  • async (default) - Async runtime with tokio + reqwest
  • readability (default) - Mozilla Readability content extraction
  • linkcheck - External link validation
  • nlp - Language detection and Unicode segmentation
  • grammar - Grammar checking (via nlprule)
  • wasm - WebAssembly bindings (mutually exclusive with async)
  • ffi - C FFI for C++ integration
  • cli - Command-line tool binary

๐ŸŒ Multi-Platform Support

WebAssembly (Browser/Node.js)

# Build for npm
wasm-pack build --target bundler --no-default-features --features wasm

# Install the published package
npm install @webpage-quality-analyzer/core

// Use in JavaScript/TypeScript
import { WasmAnalyzer } from '@webpage-quality-analyzer/core';

const analyzer = new WasmAnalyzer();
const report = await analyzer.analyze('<html>...</html>');
console.log(`Score: ${report.score}/100`);

C++ Integration

#include "webpage_quality_analyzer.hpp"

CAnalyzer* analyzer = wqa_analyzer_new();
CReport* report = wqa_analyze(analyzer, "https://example.com", nullptr);
double score = wqa_report_get_score(report);

Command-Line Tool

# Download binary from releases
wqa analyze https://example.com
wqa batch urls.txt --parallel 10
wqa profiles  # List available profiles

🔧 Customization

Custom Metric Weights

let analyzer = Analyzer::builder()
    .with_profile_name("blog")?
    .with_metric_weight("word_count", 1.5)?       // Increase importance
    .with_metric_weight("readability_score", 2.0)? // Double weight
    .build()?;
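To make the effect of weights concrete, here is a self-contained sketch of how per-metric weights typically fold into an overall score via a weighted average. This is an illustration, not the crate's internal aggregation, which may differ:

```rust
// Weighted average of per-metric scores (each 0-100). A weight of 1.5 on
// word_count makes that metric count 1.5x as much as a weight-1.0 metric.
// Illustrative only; the crate's actual aggregation may differ.
fn weighted_score(metrics: &[(&str, f64, f64)]) -> f64 {
    let total_weight: f64 = metrics.iter().map(|(_, _, w)| w).sum();
    let weighted_sum: f64 = metrics.iter().map(|(_, s, w)| s * w).sum();
    weighted_sum / total_weight
}

fn main() {
    let metrics = [
        ("word_count", 80.0, 1.5),        // increased importance
        ("readability_score", 60.0, 2.0), // double weight
        ("title_length", 90.0, 1.0),      // default weight
    ];
    println!("overall = {:.1}", weighted_score(&metrics));
}
```

Raising a metric's weight pulls the overall score toward that metric, which is why boosting `readability_score` to 2.0 drags this example down despite two strong metrics.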

Custom Penalties & Bonuses

let analyzer = Analyzer::builder()
    .with_profile_name("news")?
    // Penalty: -5 points if word count < 500
    .add_penalty_below("word_count", 500.0, 5.0)?
    // Bonus: +3 points if word count > 2000
    .add_bonus_above("word_count", 2000.0, 3.0)?
    .build()?;
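A hedged sketch of how threshold rules of this shape could adjust a base score; the crate's exact semantics may differ:

```rust
// Apply the two rules configured above: -5 points when word_count < 500,
// +3 points when word_count > 2000, clamped to the 0-100 range.
// Illustrative only, not the library's implementation.
fn apply_rules(base_score: f64, word_count: f64) -> f64 {
    let mut score = base_score;
    if word_count < 500.0 {
        score -= 5.0; // add_penalty_below("word_count", 500.0, 5.0)
    }
    if word_count > 2000.0 {
        score += 3.0; // add_bonus_above("word_count", 2000.0, 3.0)
    }
    score.clamp(0.0, 100.0)
}

fn main() {
    println!("{}", apply_rules(80.0, 450.0));  // short article is penalized
    println!("{}", apply_rules(80.0, 2500.0)); // long article earns the bonus
}
```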

Disable Metrics

let analyzer = Analyzer::builder()
    .with_profile_name("general")?
    .disable_metric("grammar_score")?     // Skip grammar analysis
    .disable_metric("language_detection")? // Skip language detection
    .build()?;

Output Customization

// Compact JSON (98.8% size reduction)
let compact = analyzer.run_compact(url, html).await?;

// Select specific fields
let minimal = analyzer.run_with_fields(
    url, 
    html,
    vec!["score", "verdict", "word_count"]
).await?;

📈 Performance

  • Single page: < 1 second (typical)
  • Batch processing: 180+ pages/second with parallel analysis
  • Memory: ~50-100 MB per analyzer instance
  • Thread-safe: analyzer instances can be shared safely across threads

// High-performance batch processing
use webpage_quality_analyzer::analyze_batch_high_performance;

let urls = vec![/* ... 100 URLs ... */];
let reports = analyze_batch_high_performance(urls, 10).await?; // 10 concurrent

🧪 Testing

cargo test                              # Run all tests
cargo test --features linkcheck         # With network features
cargo bench                             # Run benchmarks

📄 License

Dual licensed under MIT OR Apache-2.0. You can choose either license.

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.

📦 Related Packages

  • NPM: @webpage-quality-analyzer/core - JavaScript/TypeScript (WASM)
  • CLI: Download binaries for Linux/Windows/macOS
  • C++: Pre-compiled libraries with headers
  • Python: Coming soon (PyO3 bindings)

🌟 Why Choose This Analyzer?

  1. Comprehensive: 115 metrics covering all aspects of webpage quality
  2. Fast: Rust-powered performance, 180+ pages/sec batch processing
  3. Flexible: 9 profiles + full customization of weights, penalties, bonuses
  4. Multi-Platform: Works everywhere - Rust, WASM, C++, CLI
  5. Production-Ready: Extensive tests, documentation, real-world usage
  6. Modern: Async/await, latest Rust features, clean API design

📊 Example Report

{
  "score": 87.5,
  "verdict": "Excellent",
  "url": "https://example.com",
  "metrics": {
    "content_metrics": {
      "word_count": 1250,
      "paragraph_count": 15,
      "avg_sentence_length": 18.5
    },
    "technical_metrics": {
      "title_length": 55,
      "has_meta_description": true,
      "image_count": 8
    },
    "seo_metrics": {
      "has_og_tags": true,
      "has_schema_org": true
    }
  }
}

Made with ❤️ in Rust | Version 1.0.0 | October 2025