A3S Search
Overview
A3S Search is an embeddable meta search engine library inspired by SearXNG. It aggregates search results from multiple search engines, deduplicates them, and ranks them using a consensus-based scoring algorithm.
Basic Usage
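use a3s_search::{Search, SearchQuery}; // crate path assumed from the package name a3s-search
// Inside an async fn (A3S Search is built on Tokio):
let mut search = Search::new(); // register engines with search.add_engine(...)
let results = search.search(SearchQuery::new("rust programming")).await?;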
Features
- Multi-Engine Search: Aggregate results from multiple search engines in parallel
- Result Deduplication: Merge duplicate results based on normalized URLs
- Consensus Ranking: Results found by multiple engines rank higher
- Configurable Weights: Adjust engine influence on final rankings
- Async-First: Built on Tokio for high-performance concurrent searches
- Timeout Handling: Per-engine timeout with graceful degradation
- Extensible: Easy to add custom search engines via the Engine trait
- Proxy Pool: Dynamic proxy IP rotation to avoid anti-crawler blocking
- Headless Browser: Optional Chrome/Chromium integration for JS-rendered engines (feature-gated)
- Auto-Install Chrome: Automatically detects or downloads Chrome for Testing when no browser is found
- PageFetcher Abstraction: Pluggable page fetching (plain HTTP or headless browser)
- CLI Tool: Command-line interface for quick searches
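The parallel fan-out with per-engine timeouts listed above can be illustrated with a small standard-library sketch. This is not the library's implementation: A3S Search is built on Tokio, while this version swaps in OS threads and channels so that it runs without extra dependencies. `EngineFn` and `search_all` are hypothetical names introduced only for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for an engine: a closure that returns result URLs.
pub type EngineFn = Box<dyn FnOnce() -> Vec<String> + Send>;

/// Runs every engine on its own thread and waits up to `timeout` for each.
/// An engine that misses the wait yields `None` instead of failing the whole
/// search, which is the graceful-degradation part.
pub fn search_all(
    engines: Vec<(&'static str, EngineFn)>,
    timeout: Duration,
) -> Vec<(&'static str, Option<Vec<String>>)> {
    // Start all engines first so they run in parallel.
    let pending: Vec<_> = engines
        .into_iter()
        .map(|(name, run)| {
            let (tx, rx) = mpsc::channel();
            thread::spawn(move || {
                // The receiver may already be gone after a timeout; ignore that.
                let _ = tx.send(run());
            });
            (name, rx)
        })
        .collect();

    // Then collect in order. Each wait is bounded by `timeout`, measured from
    // the moment this loop reaches that engine.
    pending
        .into_iter()
        .map(|(name, rx)| (name, rx.recv_timeout(timeout).ok()))
        .collect()
}
```

Because the waits happen one after another, the worst-case total wait in this sketch is the number of engines times `timeout`; an async runtime can apply all deadlines against a single clock instead.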
CLI Usage
Installation
Homebrew (macOS):
Cargo:
Commands
# Basic search (uses DuckDuckGo and Wikipedia by default)
# Search with specific engines
# Search with Google (Chrome auto-installed if needed)
# Search with Chinese headless engines
# Limit results
# JSON output
# Compact output (tab-separated)
# Use proxy
# SOCKS5 proxy
# Verbose mode
# List available engines
Available Engines
| Shortcut | Engine | Description |
|---|---|---|
| ddg | DuckDuckGo | Privacy-focused search |
| brave | Brave | Brave Search |
| wiki | Wikipedia | Wikipedia API |
| sogou | Sogou | Sogou Search (搜狗搜索) |
| 360 | 360 Search | 360 Search (360搜索) |
| g | Google | Google Search (Chrome auto-installed) |
| baidu | Baidu | Baidu Search (百度搜索, Chrome auto-installed) |
| bing_cn | Bing China | Bing China (必应中国, Chrome auto-installed) |
Supported Search Engines
International Engines
| Engine | Shortcut | Description |
|---|---|---|
| DuckDuckGo | ddg | Privacy-focused search |
| Brave | brave | Brave Search |
| Wikipedia | wiki | Wikipedia API |
| Google | g | Google Search (headless browser) |
Chinese Engines (中国搜索引擎)
| Engine | Shortcut | Description |
|---|---|---|
| Sogou | sogou | Sogou Search (搜狗搜索) |
| So360 | 360 | 360 Search (360搜索) |
| Baidu | baidu | Baidu Search (百度搜索, headless browser) |
| Bing China | bing_cn | Bing China (必应中国, headless browser) |
Automatic Chrome Setup
When using headless engines (g, baidu, bing_cn), Chrome/Chromium is required. A3S Search handles this automatically:
- Detect: Checks the CHROME env var, PATH commands, and well-known install paths
- Cache: Looks for a previously downloaded Chrome in ~/.a3s/chromium/
- Download: If not found, downloads Chrome for Testing from Google's official CDN
Supported platforms: macOS (arm64, x64) and Linux (x64).
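The detect, cache, download order described above can be sketched as a pure decision function. Everything here is illustrative: `ChromeSource`, `resolve_chrome`, and the injected `exists` predicate are hypothetical names, and the real lookup in browser_setup.rs may differ. Passing the filesystem check in as a closure keeps the sketch testable without touching the disk.

```rust
use std::path::{Path, PathBuf};

/// Where a usable browser binary came from (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ChromeSource {
    /// Found via the CHROME env var, a PATH command, or a well-known path.
    Detected(PathBuf),
    /// Found in the download cache (the README names ~/.a3s/chromium/).
    Cached(PathBuf),
    /// Nothing usable was found; downloading would be the next step.
    NeedsDownload,
}

/// Applies the three steps in order and stops at the first hit.
/// `candidates` stands for PATH commands and well-known install paths.
pub fn resolve_chrome(
    chrome_env: Option<&str>,
    candidates: &[PathBuf],
    cached_binary: &Path,
    exists: impl Fn(&Path) -> bool,
) -> ChromeSource {
    // 1. Detect: the explicit override wins, then known locations in order.
    if let Some(value) = chrome_env {
        let path = PathBuf::from(value);
        if exists(path.as_path()) {
            return ChromeSource::Detected(path);
        }
    }
    if let Some(path) = candidates.iter().find(|p| exists(p.as_path())) {
        return ChromeSource::Detected(path.clone());
    }
    // 2. Cache: reuse a previously downloaded binary.
    if exists(cached_binary) {
        return ChromeSource::Cached(cached_binary.to_path_buf());
    }
    // 3. Download: left to the caller in this sketch.
    ChromeSource::NeedsDownload
}
```

In real use the predicate would typically be `|p| p.is_file()`, and `chrome_env` would come from `std::env::var("CHROME").ok()`.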
# First run: Chrome is auto-downloaded if not installed
# Fetching Chrome for Testing version info...
# Downloading Chrome for Testing v145.0.7632.46 (mac-arm64)...
# Downloaded 150.2 MB, extracting...
# Chrome for Testing v145.0.7632.46 installed successfully!
# Subsequent runs: uses cached Chrome instantly
# Or set CHROME env var to use a specific binary
CHROME=/usr/bin/chromium
Quality Metrics
Test Coverage
251 comprehensive unit tests with 91.15% line coverage:
| Module | Lines | Coverage | Functions | Coverage |
|---|---|---|---|---|
| engine.rs | 116 | 100.00% | 17 | 100.00% |
| error.rs | 52 | 100.00% | 10 | 100.00% |
| query.rs | 114 | 100.00% | 20 | 100.00% |
| result.rs | 194 | 100.00% | 35 | 100.00% |
| aggregator.rs | 292 | 100.00% | 30 | 100.00% |
| search.rs | 337 | 99.41% | 58 | 100.00% |
| proxy.rs | 410 | 99.02% | 91 | 96.70% |
| engines/duckduckgo.rs | 236 | 97.46% | 27 | 81.48% |
| engines/bing_china.rs | 164 | 96.95% | 18 | 77.78% |
| engines/baidu.rs | 146 | 96.58% | 17 | 76.47% |
| engines/google.rs | 180 | 96.11% | 19 | 73.68% |
| engines/brave.rs | 140 | 95.71% | 20 | 75.00% |
| engines/so360.rs | 132 | 95.45% | 18 | 77.78% |
| engines/sogou.rs | 131 | 95.42% | 17 | 76.47% |
| fetcher_http.rs | 29 | 93.10% | 7 | 85.71% |
| fetcher.rs | 73 | 93.15% | 10 | 100.00% |
| engines/wikipedia.rs | 153 | 90.85% | 26 | 88.46% |
| browser.rs | 244 | 68.85% | 42 | 61.90% |
| browser_setup.rs | 406 | 58.13% | 65 | 49.23% |
| TOTAL | 3549 | 91.15% | 547 | 84.10% |
Note: browser.rs and browser_setup.rs have lower coverage because BrowserPool::acquire_browser(), BrowserFetcher::fetch(), and download_chrome() require a running Chrome process or network access. Integration tests verify real browser functionality but are #[ignore] by default.
Run coverage report:
# Default (19 modules, 251 tests, 91.15% coverage)
# Without headless (14 modules)
# Detailed file-by-file table
# HTML report (opens in browser)
Running Tests
# Default build (8 engines, 251 tests)
# Without headless (5 engines)
# Integration tests (requires network + Chrome for Google)
# With progress display (via justfile)
Architecture
Ranking Algorithm
The scoring algorithm is based on SearXNG's approach:
score = Σ (weight / position) for each engine
weight = engine_weight × num_engines_found
Key factors:
- Engine Weight: Configurable per-engine multiplier (default: 1.0)
- Consensus: Results found by multiple engines score higher
- Position: Earlier positions in individual engines score higher
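As a worked illustration of the formula above, the following sketch computes the score for a single merged result. `EngineHit` and `consensus_score` are hypothetical names for this example and are not taken from aggregator.rs; the arithmetic follows the two lines of the formula literally.

```rust
/// One engine's report of a result: the engine's configured weight and the
/// 1-based position at which that engine returned the result.
#[derive(Debug, Clone, Copy)]
pub struct EngineHit {
    pub engine_weight: f64,
    pub position: u32,
}

/// score = sum over engines of (weight / position),
/// where weight = engine_weight * num_engines_found.
pub fn consensus_score(hits: &[EngineHit]) -> f64 {
    let num_engines_found = hits.len() as f64;
    hits.iter()
        .map(|hit| {
            let weight = hit.engine_weight * num_engines_found;
            // Positions are 1-based; clamp to 1 to avoid dividing by zero.
            weight / hit.position.max(1) as f64
        })
        .sum()
}
```

For example, a result found at position 1 by one engine and position 2 by another (both with weight 1.0) scores 2/1 + 2/2 = 3.0, while the same result found only once at position 1 scores 1.0. That gap is the consensus effect.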
Components
┌─────────────────────────────────────────────────────┐
│ Search │
│ ┌───────────────────────────────────────────────┐ │
│ │ Engine Registry │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │DuckDuck │ │ Brave │ │Wikipedia│ ... │ │
│ │ │ Go │ │ │ │ │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────────────────────────────┐ │ │
│ │ │ Google (headless browser) │ │ │
│ │ │ └─ PageFetcher → BrowserPool │ │ │
│ │ └─────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────┘ │
│ ↓ parallel search │
│ ┌───────────────────────────────────────────────┐ │
│ │ Aggregator │ │
│ │ • Deduplicate by normalized URL │ │
│ │ • Merge results from multiple engines │ │
│ │ • Calculate consensus-based scores │ │
│ │ • Sort by score (descending) │ │
│ └───────────────────────────────────────────────┘ │
│ ↓ │
│ SearchResults │
└─────────────────────────────────────────────────────┘
PageFetcher (trait)
├── HttpFetcher (reqwest, plain HTTP)
└── BrowserFetcher (chromiumoxide, headless Chrome)
└── BrowserPool (shared process, tab semaphore)
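The aggregator's "deduplicate by normalized URL" and "merge results" steps can be sketched with the standard library alone. The normalization rules below (lowercase scheme and host, drop the fragment, strip trailing slashes) are assumptions chosen for illustration; the rules in aggregator.rs may differ. `normalize_url`, `MergedResult`, and `merge_by_url` are hypothetical names.

```rust
use std::collections::HashMap;

/// Reduces a URL to a comparison key. Assumed rules: lowercase the scheme and
/// host, drop the fragment, and strip trailing slashes.
pub fn normalize_url(url: &str) -> String {
    let without_fragment = url.trim().split('#').next().unwrap_or("");
    let (scheme, rest) = match without_fragment.split_once("://") {
        Some((scheme, rest)) => (scheme.to_ascii_lowercase(), rest),
        None => (String::from("http"), without_fragment),
    };
    // The host ends at the first path or query delimiter.
    let host_end = rest
        .find(|c: char| c == '/' || c == '?')
        .unwrap_or(rest.len());
    let (host, tail) = rest.split_at(host_end);
    format!(
        "{}://{}{}",
        scheme,
        host.to_ascii_lowercase(),
        tail.trim_end_matches('/')
    )
}

/// A result after merging: which engines found it and at which positions.
#[derive(Debug, Clone, PartialEq)]
pub struct MergedResult {
    pub url: String,
    pub engines: Vec<String>,
    pub positions: Vec<u32>,
}

/// Merges raw `(engine, url, position)` hits that share a normalized URL,
/// keeping first-seen order.
pub fn merge_by_url(raw: &[(&str, &str, u32)]) -> Vec<MergedResult> {
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut merged: Vec<MergedResult> = Vec::new();
    for &(engine, url, position) in raw {
        let key = normalize_url(url);
        let slot = match index.get(&key).copied() {
            Some(existing) => existing,
            None => {
                merged.push(MergedResult {
                    url: key.clone(),
                    engines: Vec::new(),
                    positions: Vec::new(),
                });
                index.insert(key, merged.len() - 1);
                merged.len() - 1
            }
        };
        merged[slot].engines.push(engine.to_string());
        merged[slot].positions.push(position);
    }
    merged
}
```

The `engines` and `positions` vectors collected here are the inputs a consensus score needs: the number of engines that agree and where each one ranked the result.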
Quick Start
Installation
Add to your Cargo.toml:
[dependencies]
a3s-search = "0.5"
tokio = { version = "1", features = ["full"] }
# To disable headless browser support:
# a3s-search = { version = "0.5", default-features = false }
Basic Search
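use a3s_search::{Search, SearchQuery};
use a3s_search::engines::DuckDuckGo; // module path and type name assumed
// Runs inside an async context (for example under #[tokio::main]).
let mut search = Search::new();
search.add_engine(DuckDuckGo::new());
let query = SearchQuery::new("rust programming"); // example query string
let results = search.search(query).await?;
println!("Found {} results", results.count);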
Chinese Search (中文搜索)
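use a3s_search::{Search, SearchQuery};
use a3s_search::engines::{Sogou, So360}; // type names assumed from sogou.rs and so360.rs
let mut search = Search::new();
search.add_engine(Sogou::new()); // Sogou
search.add_engine(So360::new()); // 360 Search
let query = SearchQuery::new("Rust 编程语言"); // example query in Chinese
let results = search.search(query).await?;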
Query Options
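use a3s_search::{EngineCategory, SafeSearch, SearchQuery, TimeRange}; // SafeSearch and TimeRange are assumed type names
let query = SearchQuery::new("rust tutorial") // example query string
    .with_categories(vec![EngineCategory::General])
    .with_language("en-US")
    .with_safesearch(SafeSearch::Moderate) // assumed variant
    .with_page(1)
    .with_time_range(TimeRange::Month); // assumed variant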
Custom Engine Weights
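use a3s_search::{EngineConfig, Search};
use a3s_search::engines::Wikipedia; // module path and type name assumed
// Wikipedia results will have 1.5x weight
let wiki = Wikipedia::new().with_config(EngineConfig {
    weight: 1.5,
    ..Default::default() // assumes EngineConfig implements Default
});
let mut search = Search::new();
search.add_engine(wiki);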
Using Proxy Pool (Anti-Crawler Protection)
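use a3s_search::{Search, SearchQuery, engines::DuckDuckGo}; // engine path assumed
use a3s_search::proxy::{ProxyConfig, ProxyPool, ProxyStrategy}; // module path assumed from proxy.rs
// Create a proxy pool with multiple proxies
let proxy_pool = ProxyPool::with_proxies(vec![
    ProxyConfig::new("proxy1.example.com", 8080), // placeholder hosts
    ProxyConfig::new("proxy2.example.com", 8080),
]).with_strategy(ProxyStrategy::RoundRobin);
let mut search = Search::new();
search.set_proxy_pool(proxy_pool);
search.add_engine(DuckDuckGo::new());
let query = SearchQuery::new("rust programming");
let results = search.search(query).await?;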
Dynamic Proxy Provider
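use a3s_search::proxy::{ProxyConfig, ProxyPool, ProxyProvider}; // module path and ProxyProvider name assumed
use async_trait::async_trait;
use std::time::Duration;
// Implement custom proxy provider (e.g., from API)
// Use with proxy pool
let provider = MyProxyProvider { /* fields */ };
let proxy_pool = ProxyPool::with_provider(provider);
proxy_pool.refresh().await?; // Initial fetch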
Implementing Custom Engines
use a3s_search::{Engine, EngineConfig}; // import list assumed
use async_trait::async_trait;
API Reference
Search
| Method | Description |
|---|---|
| new() | Create a new search instance |
| add_engine(engine) | Add a search engine |
| set_timeout(duration) | Set default search timeout |
| engine_count() | Get number of configured engines |
| search(query) | Perform a search |
| set_proxy_pool(pool) | Set proxy pool for anti-crawler |
| proxy_pool() | Get reference to proxy pool |
SearchQuery
| Method | Description |
|---|---|
| new(query) | Create a new query |
| with_categories(cats) | Set target categories |
| with_language(lang) | Set language/locale |
| with_safesearch(level) | Set safe search level |
| with_page(page) | Set page number |
| with_time_range(range) | Set time range filter |
| with_engines(engines) | Limit to specific engines |
SearchResult
| Field | Type | Description |
|---|---|---|
| url | String | Result URL |
| title | String | Result title |
| content | String | Result snippet |
| result_type | ResultType | Type of result |
| engines | HashSet<String> | Engines that found this |
| positions | Vec<u32> | Positions in each engine |
| score | f64 | Calculated ranking score |
| thumbnail | Option<String> | Thumbnail URL |
| published_date | Option<String> | Publication date |
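To make the field table concrete, here is a sketch of a struct with the same field names and types, plus the descending sort by score that the aggregator is described as performing. It mirrors the table but is not copied from result.rs; the `ResultType` variants, the `new` constructor, and `sort_by_score` are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical variants; the README does not list the real ones.
#[derive(Debug, Clone, PartialEq)]
pub enum ResultType {
    Web,
    Image,
}

/// Field names and types follow the table above.
#[derive(Debug, Clone)]
pub struct SearchResult {
    pub url: String,
    pub title: String,
    pub content: String,
    pub result_type: ResultType,
    pub engines: HashSet<String>,
    pub positions: Vec<u32>,
    pub score: f64,
    pub thumbnail: Option<String>,
    pub published_date: Option<String>,
}

impl SearchResult {
    /// Hypothetical convenience constructor with empty optional data.
    pub fn new(url: &str, title: &str, score: f64) -> Self {
        SearchResult {
            url: url.to_string(),
            title: title.to_string(),
            content: String::new(),
            result_type: ResultType::Web,
            engines: HashSet::new(),
            positions: Vec::new(),
            score,
            thumbnail: None,
            published_date: None,
        }
    }
}

/// Sorts results so the highest score comes first.
/// `total_cmp` gives a total order over f64, so NaN cannot cause a panic.
pub fn sort_by_score(results: &mut [SearchResult]) {
    results.sort_by(|a, b| b.score.total_cmp(&a.score));
}
```

`sort_by` is stable, so results with equal scores keep their original relative order.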
SearchResults
| Method | Description |
|---|---|
| items() | Get result slice |
| suggestions() | Get query suggestions |
| answers() | Get direct answers |
| count | Number of results |
| duration_ms | Search duration in ms |
Engine Trait
EngineConfig
| Field | Type | Default | Description |
|---|---|---|---|
| name | String | - | Display name |
| shortcut | String | - | Short identifier |
| categories | Vec<EngineCategory> | [General] | Categories |
| weight | f64 | 1.0 | Ranking weight |
| timeout | u64 | 5 | Timeout in seconds |
| enabled | bool | true | Is enabled |
| paging | bool | false | Supports pagination |
| safesearch | bool | false | Supports safe search |
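The defaults in the table can be expressed as a small struct sketch. The field names, types, and default values come from the table; the `new(name, shortcut)` constructor and the `timeout_duration` helper are hypothetical additions for illustration, and only the `General` category is shown because it is the only variant the README names.

```rust
use std::time::Duration;

/// Only the variant named in the table is included here.
#[derive(Debug, Clone, PartialEq)]
pub enum EngineCategory {
    General,
}

#[derive(Debug, Clone, PartialEq)]
pub struct EngineConfig {
    pub name: String,
    pub shortcut: String,
    pub categories: Vec<EngineCategory>,
    pub weight: f64,
    pub timeout: u64, // seconds
    pub enabled: bool,
    pub paging: bool,
    pub safesearch: bool,
}

impl EngineConfig {
    /// Hypothetical constructor: takes the two fields that have no default
    /// and fills the rest with the defaults listed in the table.
    pub fn new(name: &str, shortcut: &str) -> Self {
        EngineConfig {
            name: name.to_string(),
            shortcut: shortcut.to_string(),
            categories: vec![EngineCategory::General],
            weight: 1.0,
            timeout: 5,
            enabled: true,
            paging: false,
            safesearch: false,
        }
    }

    /// Converts the `timeout` field (seconds) into a `Duration`.
    pub fn timeout_duration(&self) -> Duration {
        Duration::from_secs(self.timeout)
    }
}
```

With struct update syntax, a single field can be overridden while keeping the other defaults, for example `EngineConfig { weight: 1.5, ..EngineConfig::new("Wikipedia", "wiki") }`.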
ProxyPool
| Method | Description |
|---|---|
| new() | Create empty proxy pool (disabled) |
| with_proxies(proxies) | Create with static proxy list |
| with_provider(provider) | Create with dynamic provider |
| with_strategy(strategy) | Set selection strategy |
| set_enabled(bool) | Enable/disable proxy pool |
| is_enabled() | Check if enabled |
| refresh() | Refresh proxies from provider |
| get_proxy() | Get next proxy (based on strategy) |
| add_proxy(proxy) | Add a proxy to pool |
| remove_proxy(host, port) | Remove a proxy |
| create_client(user_agent) | Create HTTP client with proxy |
ProxyConfig
| Method | Description |
|---|---|
| new(host, port) | Create HTTP proxy config |
| with_protocol(protocol) | Set protocol (Http/Https/Socks5) |
| with_auth(user, pass) | Set authentication |
| url() | Get proxy URL string |
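A minimal sketch of how such a builder might assemble a proxy URL follows. The method names match the table, but the struct layout, the `ProxyProtocol` enum name, and the exact URL format (`scheme://user:pass@host:port`) are assumptions for illustration rather than the code in proxy.rs.

```rust
/// Protocols named in the table above; the enum name is assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProxyProtocol {
    Http,
    Https,
    Socks5,
}

#[derive(Debug, Clone)]
pub struct ProxyConfig {
    host: String,
    port: u16,
    protocol: ProxyProtocol,
    auth: Option<(String, String)>,
}

impl ProxyConfig {
    /// Starts as a plain HTTP proxy without credentials.
    pub fn new(host: &str, port: u16) -> Self {
        ProxyConfig {
            host: host.to_string(),
            port,
            protocol: ProxyProtocol::Http,
            auth: None,
        }
    }

    pub fn with_protocol(mut self, protocol: ProxyProtocol) -> Self {
        self.protocol = protocol;
        self
    }

    pub fn with_auth(mut self, user: &str, pass: &str) -> Self {
        self.auth = Some((user.to_string(), pass.to_string()));
        self
    }

    /// Builds `scheme://[user:pass@]host:port`.
    /// Credentials are inserted as-is; this sketch does no percent-encoding.
    pub fn url(&self) -> String {
        let scheme = match self.protocol {
            ProxyProtocol::Http => "http",
            ProxyProtocol::Https => "https",
            ProxyProtocol::Socks5 => "socks5",
        };
        match &self.auth {
            Some((user, pass)) => {
                format!("{}://{}:{}@{}:{}", scheme, user, pass, self.host, self.port)
            }
            None => format!("{}://{}:{}", scheme, self.host, self.port),
        }
    }
}
```

Each `with_*` method consumes and returns the config, which is what allows the chained style used in the Quick Start snippets.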
ProxyStrategy
| Variant | Description |
|---|---|
| RoundRobin | Rotate through proxies sequentially |
| Random | Select random proxy each time |
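The RoundRobin strategy can be illustrated with an atomic counter, which lets several concurrent searches share one pool without a lock. This is a sketch of the general technique, not the selection code in proxy.rs; `RoundRobinPool` and `next_proxy` are hypothetical names, and proxies are represented as plain URL strings for brevity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hands out proxies in order and wraps around at the end of the list.
pub struct RoundRobinPool {
    proxies: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobinPool {
    pub fn new(proxies: Vec<String>) -> Self {
        RoundRobinPool {
            proxies,
            next: AtomicUsize::new(0),
        }
    }

    /// Returns the next proxy, or `None` when the pool is empty.
    pub fn next_proxy(&self) -> Option<&str> {
        if self.proxies.is_empty() {
            return None;
        }
        // fetch_add returns the previous value, so the first call yields index 0.
        let index = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        Some(self.proxies[index].as_str())
    }
}
```

The Random variant would replace the counter with a random index; it is omitted here because the standard library has no random number generator.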
Development
Dependencies
| Dependency | Install | Purpose |
|---|---|---|
| cargo-llvm-cov | cargo install cargo-llvm-cov | Code coverage (optional) |
| lcov | brew install lcov / apt install lcov | Coverage report formatting (optional) |
| Chrome/Chromium | Auto-installed | For headless browser engines (auto-downloaded if not found) |
Build Commands
# Build (default, 8 engines including headless)
# Build without headless browser support (5 engines)
# Build release
# Test (with colored progress display)
# Test subsets
# Coverage (requires cargo-llvm-cov)
# Format & Lint
# Utilities
Project Structure
search/
├── Cargo.toml
├── justfile
├── README.md
├── examples/
│ ├── basic_search.rs # Basic usage example
│ └── chinese_search.rs # Chinese engines example
├── tests/
│ └── integration.rs # Integration tests (network-dependent)
└── src/
├── main.rs # CLI entry point
├── lib.rs # Library entry point
├── engine.rs # Engine trait and config
├── error.rs # Error types
├── query.rs # SearchQuery
├── result.rs # SearchResult, SearchResults
├── aggregator.rs # Result aggregation and ranking
├── search.rs # Search orchestrator
├── proxy.rs # Proxy pool and configuration
├── fetcher.rs # PageFetcher trait, WaitStrategy
├── fetcher_http.rs # HttpFetcher (reqwest wrapper)
├── browser.rs # BrowserPool, BrowserFetcher (headless browser)
├── browser_setup.rs # Chrome auto-detection and download
└── engines/
├── mod.rs # Engine exports
├── duckduckgo.rs # DuckDuckGo
├── brave.rs # Brave Search
├── google.rs # Google (headless browser)
├── wikipedia.rs # Wikipedia
├── baidu.rs # Baidu (百度, headless browser)
├── bing_china.rs # Bing China (必应中国, headless browser)
├── sogou.rs # Sogou (搜狗)
└── so360.rs # 360 Search (360搜索)
A3S Ecosystem
A3S Search is a utility component of the A3S ecosystem.
┌──────────────────────────────────────────────────────┐
│ A3S Ecosystem │
│ │
│ Infrastructure: a3s-box (MicroVM sandbox) │
│ │ │
│ Application: a3s-code (AI coding agent) │
│ / \ │
│ Utilities: a3s-lane a3s-context a3s-search │
│ (queue) (memory) (search) │
│ ▲ │
│ │ │
│ You are here │
└──────────────────────────────────────────────────────┘
Standalone Usage: a3s-search works independently for any meta search needs:
- AI agents needing web search capabilities
- Privacy-focused search aggregation
- Research tools requiring multi-source results
- Any application needing unified search across engines
Roadmap
Phase 1: Core ✅ (Complete)
- Engine trait abstraction
- Result deduplication by URL
- Consensus-based ranking algorithm
- Parallel async search execution
- Per-engine timeout handling
- 8 built-in engines (4 international + 4 Chinese)
- Headless browser support for JS-rendered engines (Google, Baidu, Bing China — enabled by default)
- PageFetcher abstraction (HttpFetcher + BrowserFetcher)
- BrowserPool with tab concurrency control
- Proxy pool with dynamic provider support
- CLI tool with Homebrew distribution
- Automatic Chrome detection and download (Chrome for Testing)
- 251 comprehensive unit tests with 91.15% line coverage
License
MIT