A3S Search
Overview
A3S Search is an embeddable meta search engine library inspired by SearXNG. It aggregates search results from multiple search engines, deduplicates them, and ranks them using a consensus-based scoring algorithm.
Basic Usage
use ;
async
Features
- Multi-Engine Search: Aggregate results from multiple search engines in parallel
- Result Deduplication: Merge duplicate results based on normalized URLs
- Consensus Ranking: Results found by multiple engines rank higher
- Configurable Weights: Adjust engine influence on final rankings
- Async-First: Built on Tokio for high-performance concurrent searches
- Timeout Handling: Per-engine timeout with graceful degradation
- Extensible: Easy to add custom search engines via the Engine trait
- Dynamic Proxy Pool: IP rotation with pluggable ProxyProvider trait and auto-refresh
- Health Monitor: Automatic engine suspension after repeated failures with configurable recovery
- HCL Configuration: Load engine and health settings from HCL config files
- Headless Browser: Optional Chrome/Chromium integration for JS-rendered engines (feature-gated)
- Auto-Install Chrome: Automatically detects or downloads Chrome for Testing when no browser is found
- PageFetcher Abstraction: Pluggable page fetching via HttpFetcher, PooledHttpFetcher, or BrowserFetcher
- CLI Tool: Command-line interface for quick searches
- Native SDKs: TypeScript (NAPI) and Python (PyO3) bindings with async support and dynamic proxy pool management
CLI Usage
Installation
Homebrew (macOS):
Cargo:
Commands
# Basic search (uses DuckDuckGo and Wikipedia by default)
# Search with specific engines
# Search with Google (Chrome auto-installed if needed)
# Search with Chinese headless engines
# Limit results
# JSON output
# Compact output (tab-separated)
# Use proxy
# SOCKS5 proxy
# Verbose mode
# List available engines
Available Engines
| Shortcut | Engine | Description |
|---|---|---|
| ddg | DuckDuckGo | Privacy-focused search |
| brave | Brave | Brave Search |
| bing | Bing | Bing International |
| wiki | Wikipedia | Wikipedia API |
| sogou | Sogou | Sogou Search (搜狗搜索) |
| 360 | 360 Search | 360 Search (360搜索) |
| g | Google | Google Search (Chrome auto-installed) |
| baidu | Baidu | Baidu Search (百度搜索, Chrome auto-installed) |
| bing_cn | Bing China | Bing China (必应中国, Chrome auto-installed) |
Supported Search Engines
International Engines
| Engine | Shortcut | Description |
|---|---|---|
| DuckDuckGo | ddg | Privacy-focused search |
| Brave | brave | Brave Search |
| Bing | bing | Bing International |
| Wikipedia | wiki | Wikipedia API |
| Google | g | Google Search (headless browser) |
Chinese Engines (中国搜索引擎)
| Engine | Shortcut | Description |
|---|---|---|
| Sogou | sogou | Sogou Search (搜狗搜索) |
| So360 | 360 | 360 Search (360搜索) |
| Baidu | baidu | Baidu Search (百度搜索, headless browser) |
| Bing China | bing_cn | Bing China (必应中国, headless browser) |
Automatic Chrome Setup
When using headless engines (g, baidu, bing_cn), Chrome/Chromium is required. A3S Search handles this automatically:
- Detect: Checks the CHROME env var, PATH commands, and well-known install paths
- Cache: Looks for a previously downloaded Chrome in ~/.a3s/chromium/
- Download: If not found, downloads Chrome for Testing from Google's official CDN
Supported platforms: macOS (arm64, x64) and Linux (x64).
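The lookup order described above can be sketched as a small pure function. This is only an illustration of the Detect, Cache, Download sequence, not the code in browser_setup.rs: the names ChromeSource and resolve_chrome, the candidate list, and the cached binary name "chrome" are all assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Where a usable browser binary was found (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum ChromeSource {
    /// The path given in the CHROME environment variable.
    EnvVar(PathBuf),
    /// A PATH command or well-known install path.
    Detected(PathBuf),
    /// A previously downloaded copy in the cache directory.
    Cached(PathBuf),
    /// Nothing found: the caller would download Chrome for Testing.
    NeedsDownload,
}

/// Resolve a browser in the documented order: env var, detection, cache, download.
/// `exists` is injected so the order can be exercised without touching the disk.
pub fn resolve_chrome(
    chrome_env: Option<&str>,
    candidates: &[&str],
    cache_dir: &Path,
    exists: impl Fn(&Path) -> bool,
) -> ChromeSource {
    // 1. Detect: an explicit CHROME override wins when it points at a real file.
    if let Some(path) = chrome_env {
        let path = PathBuf::from(path);
        if exists(&path) {
            return ChromeSource::EnvVar(path);
        }
    }
    // 1b. Detect: PATH commands and well-known install paths, in order.
    for &candidate in candidates {
        let path = PathBuf::from(candidate);
        if exists(&path) {
            return ChromeSource::Detected(path);
        }
    }
    // 2. Cache: a copy downloaded on an earlier run (binary name is assumed).
    let cached = cache_dir.join("chrome");
    if exists(&cached) {
        return ChromeSource::Cached(cached);
    }
    // 3. Download: nothing usable was found locally.
    ChromeSource::NeedsDownload
}
```

In real use the arguments would come from the environment, for example `std::env::var("CHROME").ok().as_deref()` for the first parameter and `|p| p.exists()` for the last.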
# First run: Chrome is auto-downloaded if not installed
# Fetching Chrome for Testing version info...
# Downloading Chrome for Testing v145.0.7632.46 (mac-arm64)...
# Downloaded 150.2 MB, extracting...
# Chrome for Testing v145.0.7632.46 installed successfully!
# Subsequent runs: uses cached Chrome instantly
# Or set CHROME env var to use a specific binary
CHROME=/usr/bin/chromium
SDKs
Native bindings for TypeScript and Python, powered by NAPI-RS and PyO3. No subprocess spawning — direct FFI calls to the Rust library.
TypeScript (Node.js)
&&
import { A3SSearch } from '@a3s-lab/search';
const search = new A3SSearch();
// Simple search (uses DuckDuckGo + Wikipedia by default)
let response = await search.search('rust programming');
// With options
response = await search.search('rust programming', {
  engines: ['ddg', 'wiki', 'brave', 'bing'],
  limit: 5,
  timeout: 15,
  proxy: 'http://127.0.0.1:8080',
});
// Dynamic proxy pool (IP rotation)
await search.setProxyPool([
  'http://10.0.0.1:8080',
  'http://10.0.0.2:8080',
  'socks5://10.0.0.3:1080',
]);
response = await search.search('rust programming');
// Toggle proxy pool at runtime
search.setProxyPoolEnabled(false); // direct connection
search.setProxyPoolEnabled(true); // re-enable rotation
// Print the results of the most recent search
for (const r of response.results) {
  console.log(`${r.title}: ${r.url} (score: ${r.score})`);
}
console.log(`${response.count} results in ${response.durationMs}ms`);
Python
=
# Simple search (uses DuckDuckGo + Wikipedia by default)
= await
# With options
= await
# Dynamic proxy pool (IP rotation)
await
= await
# Toggle proxy pool at runtime
# direct connection
# re-enable rotation
SDK Available Engines
Both SDKs support HTTP-based engines (no headless browser required):
| Shortcut | Aliases | Engine |
|---|---|---|
| ddg | duckduckgo | DuckDuckGo |
| brave | - | Brave Search |
| bing | - | Bing International |
| wiki | wikipedia | Wikipedia API |
| sogou | - | Sogou (搜狗) |
| 360 | so360 | 360 Search (360搜索) |
SDK Tests
# Node.js (49 tests)
&&
# Python (54 tests)
&&
Quality Metrics
Test Coverage
267 library + 31 CLI + 103 SDK = 401 total tests with 91.15% Rust line coverage:
| Module | Lines | Coverage | Functions | Coverage |
|---|---|---|---|---|
| engine.rs | 116 | 100.00% | 17 | 100.00% |
| error.rs | 52 | 100.00% | 10 | 100.00% |
| query.rs | 114 | 100.00% | 20 | 100.00% |
| result.rs | 194 | 100.00% | 35 | 100.00% |
| aggregator.rs | 292 | 100.00% | 30 | 100.00% |
| search.rs | 337 | 99.41% | 58 | 100.00% |
| proxy.rs | 410 | 99.02% | 91 | 96.70% |
| engines/duckduckgo.rs | 236 | 97.46% | 27 | 81.48% |
| engines/bing_china.rs | 164 | 96.95% | 18 | 77.78% |
| engines/baidu.rs | 146 | 96.58% | 17 | 76.47% |
| engines/google.rs | 180 | 96.11% | 19 | 73.68% |
| engines/brave.rs | 140 | 95.71% | 20 | 75.00% |
| engines/so360.rs | 132 | 95.45% | 18 | 77.78% |
| engines/sogou.rs | 131 | 95.42% | 17 | 76.47% |
| fetcher_http.rs | 29 | 93.10% | 7 | 85.71% |
| fetcher.rs | 73 | 93.15% | 10 | 100.00% |
| engines/wikipedia.rs | 153 | 90.85% | 26 | 88.46% |
| browser.rs | 244 | 68.85% | 42 | 61.90% |
| browser_setup.rs | 406 | 58.13% | 65 | 49.23% |
| TOTAL | 3549 | 91.15% | 547 | 84.10% |
Note: browser.rs and browser_setup.rs have lower coverage because BrowserPool::acquire_browser(), BrowserFetcher::fetch(), and download_chrome() require a running Chrome process or network access. Integration tests verify real browser functionality but are #[ignore] by default.
SDK tests (49 Node.js + 54 Python = 103 tests) cover error classes, type contracts, input validation, engine validation, and integration with all 6 HTTP engines.
Run coverage report:
# Default (19 modules, 267 tests, 91.15% coverage)
# Without headless (14 modules)
# Detailed file-by-file table
# HTML report (opens in browser)
Running Tests
# Default build (9 engines, 244+ lib tests)
# Without headless (6 engines)
# Integration tests (requires network + Chrome for Google)
# With progress display (via justfile)
# SDK tests (requires native build first)
&& &&
Architecture
Ranking Algorithm
The scoring algorithm is based on SearXNG's approach:
score = Σ (weight / position) for each engine
weight = engine_weight × num_engines_found
Key factors:
- Engine Weight: Configurable per-engine multiplier (default: 1.0)
- Consensus: Results found by multiple engines score higher
- Position: Earlier positions in individual engines score higher
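The formula above can be written out as a short function. This is a minimal sketch of the stated formula, not the code in aggregator.rs; the Hit struct and consensus_score name are made up for the example, and positions are assumed to be 1-based.

```rust
/// One engine's sighting of a result (hypothetical type for this sketch).
pub struct Hit {
    /// Per-engine multiplier; the documented default is 1.0.
    pub engine_weight: f64,
    /// Rank of the result within that engine's own list (assumed 1-based).
    pub position: u32,
}

/// score = Σ (weight / position), with weight = engine_weight × num_engines_found.
pub fn consensus_score(hits: &[Hit]) -> f64 {
    // Every engine that returned the result counts toward the consensus factor.
    let num_engines_found = hits.len() as f64;
    hits.iter()
        .map(|hit| {
            let weight = hit.engine_weight * num_engines_found;
            // Clamp to 1 so a zero position cannot cause a division by zero.
            weight / f64::from(hit.position.max(1))
        })
        .sum()
}
```

For instance, a result ranked first by a weight-1.0 engine and second by a weight-1.5 engine scores (1.0 × 2) / 1 + (1.5 × 2) / 2 = 3.5, while the same result seen only by the first engine scores 1.0. The consensus factor is what lifts results that several engines agree on.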
Components
┌─────────────────────────────────────────────────────┐
│ Search │
│ ┌───────────────────────────────────────────────┐ │
│ │ Engine Registry │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │DuckDuck │ │ Brave │ │Wikipedia│ ... │ │
│ │ │ Go │ │ │ │ │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────────────────────────────┐ │ │
│ │ │ Google (headless browser) │ │ │
│ │ │ └─ PageFetcher → BrowserPool │ │ │
│ │ └─────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────┘ │
│ ↓ parallel search │
│ ┌───────────────────────────────────────────────┐ │
│ │ Aggregator │ │
│ │ • Deduplicate by normalized URL │ │
│ │ • Merge results from multiple engines │ │
│ │ • Calculate consensus-based scores │ │
│ │ • Sort by score (descending) │ │
│ └───────────────────────────────────────────────┘ │
│ ↓ │
│ SearchResults │
└─────────────────────────────────────────────────────┘
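The aggregator's first two steps (deduplicate by normalized URL, merge results from multiple engines) might look like the sketch below. The document does not spell out the normalization rules, so the ones used here (ignore scheme, a leading "www.", the fragment, and a trailing slash; lowercase only the host) are assumptions, as are the names normalize_url, Merged, and merge.

```rust
use std::collections::HashMap;

/// Reduce a URL to a comparison key. The rules are assumptions for this sketch.
pub fn normalize_url(url: &str) -> String {
    let trimmed = url.trim();
    // Drop the fragment: it never changes which page is meant.
    let without_fragment = trimmed.split('#').next().unwrap_or(trimmed);
    // Ignore the scheme so http and https variants compare equal.
    let rest = match without_fragment.split_once("://") {
        Some((_scheme, rest)) => rest,
        None => without_fragment,
    };
    // Split host from path; only the host is case-insensitive.
    let (host, path) = match rest.find('/') {
        Some(index) => rest.split_at(index),
        None => (rest, ""),
    };
    let host_lower = host.to_lowercase();
    let host_key = host_lower.strip_prefix("www.").unwrap_or(host_lower.as_str());
    format!("{}{}", host_key, path.trim_end_matches('/'))
}

/// A result after merging sightings from several engines.
#[derive(Debug)]
pub struct Merged {
    /// The first URL seen for this key.
    pub url: String,
    /// Engines that returned the result, in arrival order.
    pub engines: Vec<String>,
    /// Position of the result in each engine's list, parallel to `engines`.
    pub positions: Vec<u32>,
}

/// Merge (engine, url, position) sightings that share a normalized URL.
pub fn merge(hits: &[(&str, &str, u32)]) -> Vec<Merged> {
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut merged: Vec<Merged> = Vec::new();
    for &(engine, url, position) in hits {
        // Look up the slot for this key, creating it on first sight.
        let slot = *index.entry(normalize_url(url)).or_insert_with(|| {
            merged.push(Merged {
                url: url.to_string(),
                engines: Vec::new(),
                positions: Vec::new(),
            });
            merged.len() - 1
        });
        merged[slot].engines.push(engine.to_string());
        merged[slot].positions.push(position);
    }
    merged
}
```

The merged engines and positions are exactly the inputs the scoring step needs, which is why deduplication runs before ranking.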
PageFetcher (trait)
├── HttpFetcher (reqwest, plain HTTP, single proxy)
├── PooledHttpFetcher (reqwest, proxy pool rotation)
└── BrowserFetcher (chromiumoxide, headless Chrome)
└── BrowserPool (shared process, tab semaphore)
Quick Start
Installation
Add to your Cargo.toml:
[dependencies]
a3s-search = "0.6"
tokio = { version = "1", features = ["full"] }
# To disable headless browser support:
# a3s-search = { version = "0.6", default-features = false }
Basic Search
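use a3s_search::{Search, SearchQuery, engines::DuckDuckGo};
let mut search = Search::new();
search.add_engine(DuckDuckGo::new());
let query = SearchQuery::new("rust programming");
let results = search.search(query).await?;
println!("Found {} results", results.count);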
Chinese Search (中文搜索)
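use a3s_search::{Search, SearchQuery, engines::{Sogou, So360}};
let mut search = Search::new();
search.add_engine(Sogou::new()); // Sogou (搜狗)
search.add_engine(So360::new()); // 360 Search (360搜索)
let query = SearchQuery::new("Rust 编程"); // example query
let results = search.search(query).await?;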
Query Options
use ;
let query = new
.with_categories
.with_language
.with_safesearch
.with_page
.with_time_range;
Custom Engine Weights
use ;
// Wikipedia results will have 1.5x weight
let wiki = new.with_config;
let mut search = new;
search.add_engine;
Using Proxy Pool (Anti-Crawler Protection)
use Arc;
use ;
use ;
use ;
// Create a proxy pool with multiple proxies
let pool = new;
// PooledHttpFetcher rotates proxies per request
let fetcher: = new;
let mut search = new;
search.add_engine;
let query = new;
let results = search.search.await?;
// Toggle proxy pool at runtime (thread-safe via AtomicBool)
pool.set_enabled(false); // direct connection
pool.set_enabled(true); // re-enable rotation
Dynamic Proxy Provider
use Arc;
use ;
use async_trait;
use Duration;
// Implement custom proxy provider (e.g., from API, Redis, database)
// Use with auto-refresh background task
let pool = new;
let _refresh_handle = spawn_auto_refresh;
// Pool now auto-refreshes every 60 seconds
Implementing Custom Engines
use ;
use async_trait;
API Reference
Search
| Method | Description |
|---|---|
| new() | Create a new search instance |
| with_health_config(config) | Create with health monitoring |
| add_engine(engine) | Add a search engine |
| set_timeout(duration) | Set default search timeout |
| engine_count() | Get number of configured engines |
| search(query) | Perform a search |
SearchQuery
| Method | Description |
|---|---|
| new(query) | Create a new query |
| with_categories(cats) | Set target categories |
| with_language(lang) | Set language/locale |
| with_safesearch(level) | Set safe search level |
| with_page(page) | Set page number |
| with_time_range(range) | Set time range filter |
| with_engines(engines) | Limit to specific engines |
SearchResult
| Field | Type | Description |
|---|---|---|
| url | String | Result URL |
| title | String | Result title |
| content | String | Result snippet |
| result_type | ResultType | Type of result |
| engines | HashSet<String> | Engines that found this |
| positions | Vec<u32> | Positions in each engine |
| score | f64 | Calculated ranking score |
| thumbnail | Option<String> | Thumbnail URL |
| published_date | Option<String> | Publication date |
SearchResults
| Method/Field | Description |
|---|---|
| items() | Get result slice |
| suggestions() | Get query suggestions |
| answers() | Get direct answers |
| count | Number of results |
| duration_ms | Search duration in ms |
Engine Trait
EngineConfig
| Field | Type | Default | Description |
|---|---|---|---|
| name | String | - | Display name |
| shortcut | String | - | Short identifier |
| categories | Vec<EngineCategory> | [General] | Categories |
| weight | f64 | 1.0 | Ranking weight |
| timeout | u64 | 5 | Timeout in seconds |
| enabled | bool | true | Is enabled |
| paging | bool | false | Supports pagination |
| safesearch | bool | false | Supports safe search |
ProxyPool
| Method | Description |
|---|---|
| new() | Create empty proxy pool (disabled) |
| with_proxies(proxies) | Create with static proxy list |
| with_provider(provider) | Create with dynamic provider |
| with_strategy(strategy) | Set selection strategy |
| set_enabled(bool) | Enable/disable proxy pool (thread-safe, &self) |
| is_enabled() | Check if enabled |
| refresh() | Refresh proxies from provider |
| get_proxy() | Get next proxy (based on strategy) |
| add_proxy(proxy) | Add a proxy to pool |
| remove_proxy(host, port) | Remove a proxy |
| len() | Number of proxies in pool |
| create_client(user_agent) | Create HTTP client with proxy |
PooledHttpFetcher
| Method | Description |
|---|---|
| new(pool) | Create with Arc<ProxyPool>; rotates proxy per request |
| with_timeout(duration) | Set request timeout (default: 30s) |
spawn_auto_refresh
Spawns a background task that periodically calls pool.refresh() based on the provider's refresh_interval(). Returns a handle that can be aborted to stop refreshing.
HealthMonitor / HealthConfig
| Field/Method | Description |
|---|---|
| HealthConfig { max_failures, suspend_duration } | Configure failure threshold and suspension time |
| Search::with_health_config(config) | Create search with health monitoring |
Engines are automatically suspended after max_failures consecutive failures and re-enabled after suspend_duration.
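The suspension rule can be illustrated with a small state machine. The types below are local stand-ins that borrow the documented field names max_failures and suspend_duration; they are not the crate's HealthMonitor, and the EngineHealth type with its method names is an assumption for the example.

```rust
use std::time::{Duration, Instant};

/// Stand-in for the documented health settings.
pub struct HealthConfig {
    /// Consecutive failures that trigger a suspension.
    pub max_failures: u32,
    /// How long a suspended engine is skipped.
    pub suspend_duration: Duration,
}

/// Per-engine health state (hypothetical type for this sketch).
#[derive(Default)]
pub struct EngineHealth {
    consecutive_failures: u32,
    suspended_until: Option<Instant>,
}

impl EngineHealth {
    /// A success breaks the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Count a failure and suspend the engine once the threshold is reached.
    pub fn record_failure(&mut self, config: &HealthConfig, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= config.max_failures {
            self.suspended_until = Some(now + config.suspend_duration);
            // Start a fresh streak for when the engine comes back.
            self.consecutive_failures = 0;
        }
    }

    /// An engine is usable unless it is inside its suspension window.
    pub fn is_available(&self, now: Instant) -> bool {
        self.suspended_until.map_or(true, |until| now >= until)
    }
}
```

Passing the current time in as a parameter keeps the logic deterministic, so the suspension window can be checked without sleeping.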
SearchConfig (HCL)
| Method | Description |
|---|---|
| SearchConfig::load(path) | Load config from .hcl file |
| SearchConfig::parse(content) | Parse HCL string |
| health_config() | Get HealthConfig from config |
| enabled_engines() | Get list of enabled engine shortcuts |
Example HCL config:
timeout = 10
health {
max_failures = 5
suspend_seconds = 120
}
engine "ddg" {
enabled = true
weight = 1.0
}
engine "bing" {
enabled = true
weight = 1.2
}
ProxyConfig
| Method | Description |
|---|---|
| new(host, port) | Create HTTP proxy config |
| with_protocol(protocol) | Set protocol (Http/Https/Socks5) |
| with_auth(user, pass) | Set authentication |
| url() | Get proxy URL string |
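To show how these builder methods could fit together, here is a self-contained stand-in that mirrors the table. It is a sketch rather than the crate's ProxyConfig: the ProxyProtocol enum name, the u16 port type, and the scheme://user:pass@host:port output format are assumptions, and credentials are not percent-encoded.

```rust
/// Proxy protocols named in the table above (enum name is assumed).
#[derive(Clone, Copy)]
pub enum ProxyProtocol {
    Http,
    Https,
    Socks5,
}

/// Local stand-in mirroring the documented builder methods.
pub struct ProxyConfig {
    host: String,
    port: u16,
    protocol: ProxyProtocol,
    auth: Option<(String, String)>,
}

impl ProxyConfig {
    /// Create an HTTP proxy config without authentication.
    pub fn new(host: &str, port: u16) -> Self {
        Self {
            host: host.to_string(),
            port,
            protocol: ProxyProtocol::Http,
            auth: None,
        }
    }

    /// Switch the protocol, consuming and returning the builder.
    pub fn with_protocol(mut self, protocol: ProxyProtocol) -> Self {
        self.protocol = protocol;
        self
    }

    /// Attach a username and password.
    pub fn with_auth(mut self, user: &str, pass: &str) -> Self {
        self.auth = Some((user.to_string(), pass.to_string()));
        self
    }

    /// Render the proxy as a URL string.
    pub fn url(&self) -> String {
        let scheme = match self.protocol {
            ProxyProtocol::Http => "http",
            ProxyProtocol::Https => "https",
            ProxyProtocol::Socks5 => "socks5",
        };
        match &self.auth {
            Some((user, pass)) => {
                format!("{}://{}:{}@{}:{}", scheme, user, pass, self.host, self.port)
            }
            None => format!("{}://{}:{}", scheme, self.host, self.port),
        }
    }
}
```

With this stand-in, `ProxyConfig::new("10.0.0.3", 1080).with_protocol(ProxyProtocol::Socks5).url()` yields the same `socks5://10.0.0.3:1080` form used in the SDK examples.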
ProxyStrategy
| Variant | Description |
|---|---|
| RoundRobin | Rotate through proxies sequentially |
| Random | Select random proxy each time |
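As a rough illustration of the RoundRobin strategy combined with the thread-safe runtime toggle, the sketch below uses atomics so that both operations work through a shared reference. RoundRobinPool is a hypothetical type, not the crate's ProxyPool, and the Random strategy is left out because it would need an external random number generator.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Minimal round-robin pool (hypothetical type for this sketch).
pub struct RoundRobinPool {
    proxies: Vec<String>,
    /// Index of the next proxy to hand out, shared across threads.
    next: AtomicUsize,
    /// Runtime on/off switch, changeable through &self.
    enabled: AtomicBool,
}

impl RoundRobinPool {
    /// An empty pool starts disabled, matching the documented new() behavior.
    pub fn new(proxies: Vec<String>) -> Self {
        let enabled = !proxies.is_empty();
        Self {
            proxies,
            next: AtomicUsize::new(0),
            enabled: AtomicBool::new(enabled),
        }
    }

    /// Toggle the pool without needing a mutable reference.
    pub fn set_enabled(&self, enabled: bool) {
        self.enabled.store(enabled, Ordering::Relaxed);
    }

    /// Return the next proxy in rotation, or None for a direct connection.
    pub fn get_proxy(&self) -> Option<&str> {
        if self.proxies.is_empty() || !self.enabled.load(Ordering::Relaxed) {
            return None;
        }
        // fetch_add hands each caller a distinct counter value, even under contention.
        let index = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        Some(self.proxies[index].as_str())
    }
}
```

Because neither method takes `&mut self`, the pool can sit behind an `Arc` and be toggled while searches are in flight.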
Development
Dependencies
| Dependency | Install | Purpose |
|---|---|---|
| cargo-llvm-cov | cargo install cargo-llvm-cov | Code coverage (optional) |
| lcov | brew install lcov / apt install lcov | Coverage report formatting (optional) |
| Chrome/Chromium | Auto-installed | For headless browser engines (auto-downloaded if not found) |
Build Commands
# Build (default, 9 engines including headless)
# Build without headless browser support (6 engines)
# Build release
# Test (with colored progress display)
# Test subsets
# Coverage (requires cargo-llvm-cov)
# Format & Lint
# Utilities
Project Structure
search/
├── Cargo.toml
├── justfile
├── README.md
├── examples/
│ ├── basic_search.rs # Basic usage example
│ └── chinese_search.rs # Chinese engines example
├── tests/
│ └── integration.rs # Integration tests (network-dependent)
├── sdk/
│ ├── node/ # TypeScript SDK (NAPI-RS)
│ │ ├── Cargo.toml # Rust cdylib crate
│ │ ├── src/ # Rust NAPI bindings
│ │ ├── lib/ # TypeScript wrappers
│ │ ├── tests/ # vitest tests (49 tests)
│ │ └── package.json
│ └── python/ # Python SDK (PyO3)
│ ├── Cargo.toml # Rust cdylib crate
│ ├── src/ # Rust PyO3 bindings
│ ├── a3s_search/ # Python wrappers
│ ├── tests/ # pytest tests (54 tests)
│ └── pyproject.toml
└── src/
├── main.rs # CLI entry point
├── lib.rs # Library entry point
├── engine.rs # Engine trait and config
├── error.rs # Error types
├── query.rs # SearchQuery
├── result.rs # SearchResult, SearchResults
├── aggregator.rs # Result aggregation and ranking
├── search.rs # Search orchestrator with HealthMonitor
├── config.rs # HCL configuration loading
├── health.rs # HealthMonitor, HealthConfig
├── proxy.rs # ProxyPool, ProxyProvider, spawn_auto_refresh
├── fetcher.rs # PageFetcher trait, WaitStrategy
├── fetcher_http.rs # HttpFetcher + PooledHttpFetcher
├── html_engine.rs # HtmlEngine<P> generic engine framework
├── browser.rs # BrowserPool, BrowserFetcher (headless browser)
├── browser_setup.rs # Chrome auto-detection and download
└── engines/
├── mod.rs # Engine exports
├── duckduckgo.rs # DuckDuckGo
├── brave.rs # Brave Search
├── bing.rs # Bing International
├── google.rs # Google (headless browser)
├── wikipedia.rs # Wikipedia
├── baidu.rs # Baidu (百度, headless browser)
├── bing_china.rs # Bing China (必应中国, headless browser)
├── sogou.rs # Sogou (搜狗)
└── so360.rs # 360 Search (360搜索)
A3S Ecosystem
A3S Search is a utility component of the A3S ecosystem.
┌──────────────────────────────────────────────────────┐
│ A3S Ecosystem │
│ │
│ Infrastructure: a3s-box (MicroVM sandbox) │
│ │ │
│ Application: a3s-code (AI coding agent) │
│ / \ │
│ Utilities: a3s-lane a3s-context a3s-search │
│ (queue) (memory) (search) │
│ ▲ │
│ │ │
│ You are here │
└──────────────────────────────────────────────────────┘
Standalone Usage: a3s-search works independently for any meta search needs:
- AI agents needing web search capabilities
- Privacy-focused search aggregation
- Research tools requiring multi-source results
- Any application needing unified search across engines
Roadmap
Phase 1: Core ✅ (Complete)
- Engine trait abstraction
- Result deduplication by URL
- Consensus-based ranking algorithm
- Parallel async search execution
- Per-engine timeout handling
- 9 built-in engines (5 international + 4 Chinese)
- Bing International engine (HTTP, no headless required)
- Headless browser support for JS-rendered engines (Google, Baidu, Bing China — enabled by default)
- PageFetcher abstraction (HttpFetcher + PooledHttpFetcher + BrowserFetcher)
- BrowserPool with tab concurrency control
- Dynamic proxy pool with pluggable ProxyProvider trait and spawn_auto_refresh
- PooledHttpFetcher for per-request proxy IP rotation
- Runtime proxy pool toggle via AtomicBool (set_enabled(&self))
- Health monitoring with automatic engine suspension and recovery
- HCL configuration file loading for engines and health settings
- CLI tool with Homebrew distribution
- Automatic Chrome detection and download (Chrome for Testing)
- Proxy support for all engines via -p flag (HTTP/HTTPS/SOCKS5)
- UTF-8 safe content truncation for CJK/emoji
- Native SDKs: TypeScript (NAPI-RS) and Python (PyO3) with dynamic proxy pool management
- SDK proxy pool: setProxyPool(), setProxyPoolEnabled(), per-request proxyPool option
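The "UTF-8 safe content truncation" item refers to cutting snippets without splitting a multi-byte character. One common way to do that in Rust is sketched below; truncate_utf8 is a hypothetical helper, and the crate's own implementation may differ (for example by counting characters instead of bytes).

```rust
/// Cut `text` to at most `max_bytes` bytes without splitting a character.
pub fn truncate_utf8(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    // Walk back from the byte limit to the nearest character boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Slicing a &str in the middle of a character panics, which is why the boundary check matters for CJK text (3 bytes per character) and emoji (4 bytes).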
License
MIT