webpage_quality_analyzer 1.0.1

High-performance webpage quality analyzer with 115 comprehensive metrics - Rust library with WASM, C++, and Python bindings
# 🔍 Webpage Quality Analyzer

> **High-performance webpage quality analysis with 115 comprehensive metrics across 8 built-in profiles**

[![Crates.io](https://img.shields.io/crates/v/webpage_quality_analyzer)](https://crates.io/crates/webpage_quality_analyzer)
[![docs.rs](https://docs.rs/webpage_quality_analyzer/badge.svg)](https://docs.rs/webpage_quality_analyzer)
[![npm](https://img.shields.io/npm/v/@webpage-quality-analyzer/core)](https://www.npmjs.com/package/@webpage-quality-analyzer/core)
[![License](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue)](LICENSE)
[![Build Status](https://img.shields.io/github/actions/workflow/status/NotGyashu/webpage-quality-analyser/ci.yml?branch=main)](https://github.com/NotGyashu/webpage-quality-analyser/actions)

A blazing-fast, multi-platform library for analyzing webpage quality with **115 metrics** (92 HTML-based + 23 network-based) organized across **7 categories**, with **8 professionally-tuned profiles** for different page types.

---

## ✨ Features

- **🚀 High Performance**: Analyze typical webpages in <100ms, large pages in <1s
- **📊 115 Comprehensive Metrics**: Content, SEO, Performance, Accessibility, and more
- **🎯 8 Built-in Profiles**: Optimized for news, blogs, products, portfolios, etc.
- **🌐 Multi-Platform**: Rust, WebAssembly (Browser/Node.js), C++, CLI tool
- **⚡ Parallel Batch Processing**: 180+ pages/second with concurrent analysis
- **🎨 Customizable Scoring**: Adjust weights, thresholds, penalties, bonuses
- **📱 Mobile-Friendly**: Responsive design and mobile usability metrics
- **♿ Accessibility**: WCAG 2.1 AA/AAA compliance checking
- **🔒 Security**: HTTPS, CSP, HSTS, XSS protection validation
- **📈 Real-time Analysis**: No external API calls, runs entirely locally

---

## 🚀 Quick Start

### Rust

```toml
[dependencies]
webpage_quality_analyzer = "1.0.0"
```

```rust
use webpage_quality_analyzer::analyze;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let report = analyze("https://example.com", None).await?;
    println!("Score: {}/100, Quality: {}", report.score, report.verdict);
    Ok(())
}
```

### JavaScript/TypeScript (WASM)

```bash
npm install @webpage-quality-analyzer/core
```

```javascript
import init, { WasmAnalyzer } from '@webpage-quality-analyzer/core';

await init();
const analyzer = new WasmAnalyzer();
const report = await analyzer.analyze('<html>...</html>');
console.log(`Score: ${report.score}/100`);
```

### C++

```bash
wget https://github.com/NotGyashu/webpage-quality-analyser/releases/download/v1.0.0/cpp-package-v1.0.0-linux-x64.tar.gz
tar -xzf cpp-package-v1.0.0-linux-x64.tar.gz
```

```cpp
#include <iostream>  // std::cout, std::endl

#include "webpage_quality_analyzer/webpage_quality_analyzer.hpp"

int main() {
    auto report = wqa::Analyzer::analyze("https://example.com");
    std::cout << "Score: " << report.score() << "/100" << std::endl;
    return 0;
}
```

### CLI Tool

```bash
# Linux/macOS
wget https://github.com/NotGyashu/webpage-quality-analyser/releases/download/v1.0.0/wqa-cli-v1.0.0-linux-x64.tar.gz
tar -xzf wqa-cli-v1.0.0-linux-x64.tar.gz
sudo mv wqa /usr/local/bin/

# Analyze a webpage
wqa analyze --url https://example.com --profile news --output report.json
```

---

## 📚 Documentation

| Resource | Description |
|----------|-------------|
| **[📖 Documentation Index](DOCUMENTATION_INDEX.md)** | Complete documentation hub |
| **[🚀 Installation Guide](docs/getting-started/INSTALLATION.md)** | Platform-specific setup |
| **[🎯 First Analysis Tutorial](docs/getting-started/FIRST_ANALYSIS.md)** | 5-minute quick start |
| **[📊 Understanding Metrics](docs/getting-started/UNDERSTANDING_METRICS.md)** | All 115 metrics explained |
| **[🏗️ Build & Release Guide](docs/guides/BUILD_AND_RELEASE_GUIDE.md)** | Complete build workflows |
| **[🔧 API Reference](docs/api-reference/)** | Complete API documentation |
| **[💡 Examples](examples/)** | 40+ working code examples |

---

## 🎯 Platform Support Matrix

| Platform | Metrics | Async | Batch | Status |
|----------|---------|-------|-------|--------|
| **Rust Library** | All 115 | ✅ Tokio | ✅ Yes | 🟢 Production |
| **WASM/Browser** | 92 HTML | ✅ Promise | ✅ Yes | 🟢 Production |
| **C++ FFI** | All 115 | ❌ Blocking | ✅ Yes | 🟢 Production |
| **CLI Tool** | All 115 | ✅ Tokio | ✅ Yes | 🟢 Production |
| **Python** | All 115 | 🚧 Planned | 🚧 Planned | 🟡 Roadmap |

**Note**: WASM provides 92 HTML-based metrics (network metrics require server-side fetching).

---

## 📊 Metrics Overview

### 115 Total Metrics

```
HTML-Based Metrics (92)          Network-Based Metrics (23)
├── Content (11)                 ├── Performance (11)
│   ├── Word count               │   ├── Largest Contentful Paint
│   ├── Readability              │   ├── First Contentful Paint
│   ├── Sentence count           │   ├── Time to Interactive
│   └── ...                      │   └── ...
├── Structure (5)                ├── Security (6)
├── Media (8)                    │   ├── HTTPS enabled
├── SEO (9)                      │   ├── CSP header
├── Links (8)                    │   └── ...
├── Technical (6)                ├── Analytics (3)
├── Accessibility (7)            └── Error Handling (3)
├── Mobile (4)
├── Authority (3)
├── Forms (6)
├── Structured Data (4)
├── Branding (4)
├── User Experience (5)
├── Business (3)
└── Internationalization (2)
```

**See detailed breakdown**: [Metrics Reference →](docs/getting-started/UNDERSTANDING_METRICS.md)

---

## 🎨 Built-in Profiles

Choose the right profile for your page type to get optimized scoring:

| Profile | Best For | Content Weight | SEO Weight | Key Metrics |
|---------|----------|----------------|------------|-------------|
| **news** | News articles | 35% | 25% | Freshness, metadata, social sharing |
| **blog** | Blog posts | 30% | 25% | Readability, structure, engagement |
| **product** | E-commerce | 20% | 30% | Images, structured data, conversion |
| **portfolio** | Personal sites | 25% | 20% | Visual design, branding, projects |
| **content_article** | Long-form content | 40% | 20% | Word count, depth, citations |
| **login_page** | Authentication | 10% | 5% | Security, forms, usability |
| **homepage** | Site homepages | 20% | 30% | Navigation, branding, performance |
| **general** | Default | 30% | 25% | Balanced across all categories |

**Learn more**: [Choosing Profiles →](docs/getting-started/CHOOSING_PROFILES.md)
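
One way to read the weight columns is as the share each category contributes to the final score. The sketch below is a minimal illustration under that assumption, not the crate's scoring code: `CategoryScore` and `weighted_score` are hypothetical names, the category scores are made up, and the weight not shown in the table is lumped into a single "other" category.

```rust
/// Hypothetical category result: a 0-100 score and its weight (a fraction of 1.0).
struct CategoryScore {
    name: &'static str,
    score: f64,
    weight: f64,
}

/// Weighted average of category scores, normalized by the total weight so the
/// weights do not have to sum to exactly 1.0. Returns 0.0 when nothing is weighted.
fn weighted_score(categories: &[CategoryScore]) -> f64 {
    let total_weight: f64 = categories.iter().map(|c| c.weight).sum();
    if total_weight == 0.0 {
        return 0.0;
    }
    let weighted_sum: f64 = categories.iter().map(|c| c.score * c.weight).sum();
    weighted_sum / total_weight
}

fn main() {
    // "news" row of the table: content 35%, SEO 25%.
    // The remaining 40% is grouped as "other" (an assumption for this sketch).
    let news = [
        CategoryScore { name: "content", score: 80.0, weight: 0.35 },
        CategoryScore { name: "seo", score: 60.0, weight: 0.25 },
        CategoryScore { name: "other", score: 70.0, weight: 0.40 },
    ];
    for c in &news {
        println!("{:<8} score {:>5.1} weight {:.2}", c.name, c.score, c.weight);
    }
    println!("overall: {:.1}", weighted_score(&news));
}
```

Under this reading, the same page scores differently per profile simply because each profile shifts weight between categories.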

---

## 🌟 Key Features

### 3-Tier API Design

**Level 1: Simple** - One function call
```rust
let report = analyze("https://example.com", None).await?;
```

**Level 2: Builder** - Custom configuration
```rust
let analyzer = Analyzer::<DefaultRuntime>::builder()
    .with_profile_name("news")?
    .enable_linkcheck(true)
    .build()?;
```

**Level 3: Config Files** - YAML/JSON/TOML
```rust
let analyzer = from_config_file("config.yaml")?;
```

### Advanced Customization

**Adjust Metric Weights**:
```rust
analyzer.with_metric_weight("word_count", 1.5)?
        .with_metric_weight("readability_score", 2.0)?
```

**Custom Penalties & Bonuses**:
```rust
analyzer.add_penalty_below("word_count", 300.0, 10.0)?
        .add_bonus_above("readability_score", 80.0, 5.0)?
```
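
To make the idea of threshold rules concrete, here is a small standalone sketch. It assumes the three arguments above mean metric name, threshold, and points, and that points are simply subtracted from or added to a 0-100 score; neither assumption is confirmed by the crate. `Rule` and `adjust_score` are hypothetical names.

```rust
/// Hypothetical threshold rule; not the crate's internal type.
enum Rule {
    /// Subtract `points` when the metric value is below `threshold`.
    PenaltyBelow { threshold: f64, points: f64 },
    /// Add `points` when the metric value is above `threshold`.
    BonusAbove { threshold: f64, points: f64 },
}

/// Apply each (metric value, rule) pair to the base score,
/// then clamp the result to the 0-100 range.
fn adjust_score(base: f64, rules: &[(f64, Rule)]) -> f64 {
    let mut score = base;
    for (value, rule) in rules {
        match rule {
            Rule::PenaltyBelow { threshold, points } if value < threshold => score -= points,
            Rule::BonusAbove { threshold, points } if value > threshold => score += points,
            _ => {} // Rule did not trigger.
        }
    }
    score.clamp(0.0, 100.0)
}

fn main() {
    let rules = [
        // word_count = 250 is below 300, so 10 points are removed.
        (250.0, Rule::PenaltyBelow { threshold: 300.0, points: 10.0 }),
        // readability_score = 85 is above 80, so 5 points are added.
        (85.0, Rule::BonusAbove { threshold: 80.0, points: 5.0 }),
    ];
    println!("adjusted score: {}", adjust_score(70.0, &rules));
}
```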

**Output Filtering** (98.8% size reduction):
```rust
let compact_report = analyzer.run_compact(url, html).await?;
```

### Batch Processing

**Parallel Analysis** (up to 180+ pages/second):
```rust
let reports = analyze_batch_urls_parallel(urls, None).await?;
```

**High-Performance Mode**:
```rust
let reports = analyze_batch_high_performance(urls).await?;
```
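
Parallel batch analysis boils down to running many page analyses at once while capping how many are in flight. The sketch below shows that idea with the standard library only, using a fixed pool of worker threads instead of an async semaphore. `run_bounded` is a hypothetical helper, and the word-count closure is a stand-in for a real analysis call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Run `job` over every input using at most `max_concurrent` worker threads.
/// Results come back in input order.
fn run_bounded<T, R, F>(inputs: &[T], max_concurrent: usize, job: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Index of the next unclaimed input; each worker claims one at a time.
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<R>>> =
        Mutex::new((0..inputs.len()).map(|_| None).collect());
    let workers = max_concurrent.max(1).min(inputs.len().max(1));

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= inputs.len() {
                    break;
                }
                let out = job(&inputs[i]);
                results.lock().unwrap()[i] = Some(out);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every slot is filled before the scope ends"))
        .collect()
}

fn main() {
    let pages = vec!["<p>one two</p>", "<p>three</p>", "<p>four five six</p>"];
    // Stand-in "analysis": count whitespace-separated tokens, 10 at a time at most.
    let counts = run_bounded(&pages, 10, |html| html.split_whitespace().count());
    println!("{counts:?}");
}
```

A worker pool and a semaphore both bound concurrency; the pool is used here only because it needs no async runtime.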

---

## 🔧 Use Cases

- **🔍 SEO Auditing**: Analyze title, meta tags, headings, structured data
- **📝 Content Quality**: Measure readability, word count, structure
- **♿ Accessibility**: Check WCAG compliance, ARIA labels, contrast
- **⚡ Performance**: Track Core Web Vitals, page size, load times
- **🤖 CI/CD Integration**: Automated quality checks in build pipelines
- **📊 Competitive Analysis**: Compare your pages against competitors
- **🚨 Monitoring**: Track quality metrics over time
- **🎯 A/B Testing**: Measure quality impact of design changes

---

## 📈 Performance

Benchmarked on typical webpages:

| Page Size | Metrics | Time | Memory | Throughput |
|-----------|---------|------|--------|------------|
| Small (<10KB) | 92 HTML | <100ms | <10MB | 200+ pages/s |
| Medium (50KB) | 115 Full | <500ms | <15MB | 150+ pages/s |
| Large (100KB+) | 115 Full | <1000ms | <20MB | 100+ pages/s |

**Optimizations**:
- ✅ DOM caching (elements cached, reused 115 times)
- ✅ Connection pooling (persistent HTTP connections)
- ✅ Parallel batch processing (`Arc<Semaphore>`, max 10 concurrent)
- ✅ Zero-copy metric scorers (`Arc<dyn MetricScorer>`)
- ✅ Optimized JSON serialization (field selectors)
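
The first and fourth items can be pictured together: parse the page once, then hand the same parsed data to many scorers that are shared behind `Arc`. The following is a rough standalone sketch of that pattern only. The `MetricScorer` trait shown here is hypothetical and the crate's real trait may look different; `ParsedPage`, `parse`, and `score_all` are likewise illustrative names, and the "parser" is a crude string count.

```rust
use std::sync::Arc;

/// Stand-in for a parsed page: computed once, then read by every scorer.
struct ParsedPage {
    word_count: usize,
    image_count: usize,
}

/// Hypothetical scorer trait; the crate's real `MetricScorer` may differ.
trait MetricScorer: Send + Sync {
    fn name(&self) -> &'static str;
    fn score(&self, page: &ParsedPage) -> f64;
}

struct WordCountScorer;
struct ImageScorer;

impl MetricScorer for WordCountScorer {
    fn name(&self) -> &'static str {
        "word_count"
    }
    fn score(&self, page: &ParsedPage) -> f64 {
        // Full marks at 300 words or more, scaled linearly below that.
        (page.word_count as f64 / 300.0).min(1.0) * 100.0
    }
}

impl MetricScorer for ImageScorer {
    fn name(&self) -> &'static str {
        "image_count"
    }
    fn score(&self, page: &ParsedPage) -> f64 {
        if page.image_count > 0 { 100.0 } else { 0.0 }
    }
}

/// "Parse" once; a real implementation would build a DOM here.
fn parse(html: &str) -> ParsedPage {
    ParsedPage {
        word_count: html.split_whitespace().count(),
        image_count: html.matches("<img").count(),
    }
}

/// Every scorer reads the same parsed page; nothing is re-parsed.
fn score_all(page: &ParsedPage, scorers: &[Arc<dyn MetricScorer>]) -> Vec<(&'static str, f64)> {
    scorers.iter().map(|s| (s.name(), s.score(page))).collect()
}

fn main() {
    let scorers: Vec<Arc<dyn MetricScorer>> =
        vec![Arc::new(WordCountScorer), Arc::new(ImageScorer)];
    // Cloning the Vec clones the Arc pointers, not the scorers themselves.
    let shared = scorers.clone();
    let page = parse("<p>hello world</p> <img src=\"a.png\">");
    for (name, value) in score_all(&page, &shared) {
        println!("{name}: {value:.1}");
    }
}
```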

---

## 🛠️ Development

### Building from Source

```bash
# Clone repository
git clone https://github.com/NotGyashu/webpage-quality-analyser.git
cd webpage-quality-analyser

# Build Rust library
cargo build --release

# Build WASM
wasm-pack build --target bundler --no-default-features --features wasm

# Build C++ bindings
cargo build --release --features ffi
./scripts/build_ffi.sh

# Build CLI tool
cargo build --release --bin wqa --features cli

# Run tests
cargo test --all-features

# Run benchmarks
cargo bench
```

### Testing

```bash
# All tests (40+ test files)
cargo test

# Specific test suites
cargo test comprehensive_metrics   # 115-metric validation
cargo test phase3                  # Profile-aware scoring
cargo test output_customization    # Field selectors
cargo test weight_customization    # Metric weight adjustment

# WASM tests
wasm-pack test --headless --firefox

# C++ examples
cd build && ./bindings/cpp/examples/level1_simple
```

**See**: [Development Guide →](docs/development/setup.md)

---

## 🤝 Contributing

We welcome contributions! See [CONTRIBUTING.md](docs/contributing.md) for guidelines.

### Areas for Contribution

- 🐛 Bug fixes and issue reporting
- 📚 Documentation improvements
- 🌐 New language bindings (Python, Go, etc.)
- 📊 Additional metrics and profiles
- ⚡ Performance optimizations
- 🧪 Test coverage expansion

---

## 📝 License

Licensed under either of:

- **Apache License, Version 2.0** ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- **MIT License** ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

---

## 🔗 Links

- **Documentation**: [docs.rs/webpage_quality_analyzer](https://docs.rs/webpage_quality_analyzer)
- **Crates.io**: [crates.io/crates/webpage_quality_analyzer](https://crates.io/crates/webpage_quality_analyzer)
- **npm**: [npmjs.com/package/@webpage-quality-analyzer/core](https://www.npmjs.com/package/@webpage-quality-analyzer/core)
- **GitHub**: [github.com/NotGyashu/webpage-quality-analyser](https://github.com/NotGyashu/webpage-quality-analyser)
- **Issues**: [github.com/NotGyashu/webpage-quality-analyser/issues](https://github.com/NotGyashu/webpage-quality-analyser/issues)
- **Discussions**: [github.com/NotGyashu/webpage-quality-analyser/discussions](https://github.com/NotGyashu/webpage-quality-analyser/discussions)

---

## 🙏 Acknowledgments

Built with:
- [Rust](https://www.rust-lang.org/) - Systems programming language
- [Tokio](https://tokio.rs/) - Async runtime
- [wasm-bindgen](https://rustwasm.github.io/wasm-bindgen/) - WASM bindings
- [tl](https://crates.io/crates/tl) - HTML parsing
- [readability](https://crates.io/crates/readability) - Content extraction

Special thanks to the Rust community and all contributors!

---

## 📊 Project Stats

- **Lines of Code**: ~25,000+ (Rust)
- **Test Coverage**: 40+ test files, 279+ tests
- **Benchmarks**: 15+ performance benchmarks
- **Documentation**: 50+ markdown files
- **Examples**: 40+ working examples
- **Supported Platforms**: 4 (Rust, WASM, C++, CLI)

---

## 🎯 Roadmap

### v1.1.0 (Q1 2026)
- [ ] Python bindings (PyO3)
- [ ] Enhanced NLP features
- [ ] Real-time browser extension
- [ ] Cloud API service

### v1.2.0 (Q2 2026)
- [ ] Machine learning-based scoring
- [ ] Historical trend analysis
- [ ] Competitive benchmarking database
- [ ] Advanced visualization tools

**See complete roadmap**: [docs/ROADMAP.md](docs/ROADMAP.md)

---

## ⭐ Star History

If you find this project useful, please consider giving it a star! ⭐

---

**Made with ❤️ by [@NotGyashu](https://github.com/NotGyashu) and [contributors](https://github.com/NotGyashu/webpage-quality-analyser/graphs/contributors)**

**Last Updated**: October 9, 2025 | **Version**: 1.0.0 | **Status**: Production Ready

---

**Navigation**: [Documentation Index →](DOCUMENTATION_INDEX.md) | [Installation Guide →](docs/getting-started/INSTALLATION.md) | [Quick Start Tutorial →](docs/getting-started/FIRST_ANALYSIS.md)