rswappalyzer
A high-performance Rust implementation of Wappalyzer, designed for fast and accurate detection of website technology stacks.
It combines streaming HTML parsing with optimized pattern matching to provide reliable detection for Rust-based web crawlers, security scanners, and monitoring systems.
Features
| Feature | Description |
|---|---|
| Streaming HTML Parsing with html5ever | Replaces regex-based tag extraction with the html5ever HTML parser. Extracts script and meta tags directly from the streaming response body without loading the entire HTML document into memory, which suits large pages and memory-constrained environments. |
| Concurrent-Safe Detection | The core detector is thread-safe (Send + Sync) and can be shared across multiple async tasks or threads. This enables efficient batch detection in distributed crawler clusters and multi-threaded scanning tools. |
| Comprehensive Detection | Identifies technologies from HTTP headers, HTML meta tags, script sources, and response bodies. |
| Seamless Integration for Rust Projects | A pure Rust library with no hidden external dependencies or system requirements. Exposes a clear, idiomatic API that fits naturally into asynchronous Rust workflows (e.g., tokio-based crawlers). Supports custom rule paths and proxy configuration for enterprise deployments. |
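To make the "Concurrent-Safe Detection" and "Comprehensive Detection" rows more concrete, here is a minimal standard-library sketch of the general idea: an immutable detector shared across threads through an Arc, matching simple signals from headers, script URLs, and the generator meta tag. Everything in it (Rule, Page, Detector, detect_batch, the page helper, and the three toy rules) is hypothetical and written for illustration only; it is not rswappalyzer's API or rule format, and it uses plain substring matching rather than the crate's actual matching logic.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical rule: one technology and the signals that reveal it.
struct Rule {
    name: &'static str,
    /// (lowercase header name, lowercase substring expected in its value)
    header: Option<(&'static str, &'static str)>,
    /// Lowercase substring expected in a script URL.
    script_src: Option<&'static str>,
    /// Lowercase substring expected in the generator meta tag.
    meta_generator: Option<&'static str>,
}

/// Signals collected from one HTTP response (illustrative shape only).
struct Page {
    headers: HashMap<String, String>, // keys are lowercased
    script_srcs: Vec<String>,
    meta_generator: Option<String>,
}

impl Rule {
    /// A rule matches if any one of its configured signals is present.
    fn matches(&self, page: &Page) -> bool {
        let by_header = self.header.map_or(false, |(name, needle)| {
            page.headers
                .get(name)
                .map_or(false, |value| value.to_lowercase().contains(needle))
        });
        let by_script = self.script_src.map_or(false, |needle| {
            page.script_srcs
                .iter()
                .any(|src| src.to_lowercase().contains(needle))
        });
        let by_meta = self.meta_generator.map_or(false, |needle| {
            page.meta_generator
                .as_deref()
                .map_or(false, |g| g.to_lowercase().contains(needle))
        });
        by_header || by_script || by_meta
    }
}

/// Holds only immutable data, so the compiler derives Send + Sync for it.
struct Detector {
    rules: Vec<Rule>,
}

impl Detector {
    /// Builds a detector with three toy rules (not real Wappalyzer rules).
    fn new() -> Self {
        Detector {
            rules: vec![
                Rule { name: "Nginx", header: Some(("server", "nginx")), script_src: None, meta_generator: None },
                Rule { name: "jQuery", header: None, script_src: Some("jquery"), meta_generator: None },
                Rule { name: "WordPress", header: None, script_src: None, meta_generator: Some("wordpress") },
            ],
        }
    }

    /// Returns the names of all matching rules, in rule order.
    fn detect(&self, page: &Page) -> Vec<&'static str> {
        self.rules.iter().filter(|r| r.matches(page)).map(|r| r.name).collect()
    }
}

/// Compile-time check that a type can be shared across threads.
fn assert_send_sync<T: Send + Sync>() {}

/// Convenience constructor for a Page, used in the usage note below.
fn page(headers: &[(&str, &str)], script_srcs: &[&str], meta_generator: Option<&str>) -> Page {
    Page {
        headers: headers.iter().map(|(k, v)| (k.to_lowercase(), v.to_string())).collect(),
        script_srcs: script_srcs.iter().map(|s| s.to_string()).collect(),
        meta_generator: meta_generator.map(str::to_string),
    }
}

/// Runs detection for every page on its own thread, sharing one detector.
fn detect_batch(detector: Arc<Detector>, pages: Vec<Page>) -> Vec<Vec<&'static str>> {
    // Spawn all workers first, then join them in input order.
    let handles: Vec<_> = pages
        .into_iter()
        .map(|page| {
            let detector = Arc::clone(&detector);
            thread::spawn(move || detector.detect(&page))
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Calling detect_batch(Arc::new(Detector::new()), pages) returns one result list per input page, in input order. Because the detector is never mutated after construction, no lock is needed; the same sharing pattern applies to async tasks, where the Arc would be cloned into each spawned task instead of each thread.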
Installation
Add the crate to your project with Cargo:
cargo add rswappalyzer
Quick Start
Example 1
use ;
use HeaderMap;
async
Example 2
use ;
use HeaderMap;
use ;
use serde_json;
// Mock scan-result struct for a business workflow (mirrors a real-world scenario)
async
API Overview
| Function | Returns | Description |
|---|---|---|
| detect_technologies_wappalyzer | Vec<Technology> | Full detection; returns complete results with detailed version info, confidence, and categories. |
| detect_technologies_wappalyzer_lite | Vec<TechnologyLite> | Lighter, faster detection; returns condensed results with only name and confidence. |
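As a rough illustration of how the full and lite result shapes described above relate, the sketch below models them with two stand-in structs. FullResult and LiteResult are hypothetical names chosen so they are not mistaken for the crate's own Technology and TechnologyLite types; the field names, the field types, and the assumption that confidence is a 0 to 100 percentage are all guesses based on the table, not the crate's real definitions.

```rust
/// Hypothetical stand-in for a full result: name, version, confidence, categories.
#[derive(Debug, Clone, PartialEq)]
struct FullResult {
    name: String,
    version: Option<String>, // None when no version could be determined
    confidence: u8,          // assumed to be a 0..=100 percentage
    categories: Vec<String>,
}

/// Hypothetical stand-in for a lite result: name and confidence only.
#[derive(Debug, Clone, PartialEq)]
struct LiteResult {
    name: String,
    confidence: u8,
}

impl From<&FullResult> for LiteResult {
    /// Drops version and category details, keeping name and confidence.
    fn from(full: &FullResult) -> Self {
        LiteResult {
            name: full.name.clone(),
            confidence: full.confidence,
        }
    }
}

/// Keeps results at or above `min_confidence` and reduces them to the lite shape.
fn summarize(results: &[FullResult], min_confidence: u8) -> Vec<LiteResult> {
    results
        .iter()
        .filter(|r| r.confidence >= min_confidence)
        .map(|r| LiteResult::from(r))
        .collect()
}

/// Formats one full result as a single line for logs or reports.
fn describe(result: &FullResult) -> String {
    let version = result.version.as_deref().unwrap_or("unknown version");
    format!(
        "{} {} [{}] ({}%)",
        result.name,
        version,
        result.categories.join(", "),
        result.confidence
    )
}
```

The trade-off sketched here is the usual one: the full shape is convenient for reports that need versions and categories, while the lite shape is cheaper to build and pass around when a scanner only needs to know which technologies are present.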
Data Sources
The following projects are used as rule sources:
- wappalyzergo: https://github.com/projectdiscovery/wappalyzergo
- WebAppAnalyzer: https://github.com/enthec/webappanalyzer
- Wappalyzer (HTTPArchive): https://github.com/HTTPArchive/wappalyzer
References
- RustedWappalyzer: https://github.com/shart123456/RustedWappalyzer
- wappalyzergo: https://github.com/projectdiscovery/wappalyzergo
Author
FlyfishSec
License
This project is licensed under the MIT License.