Core engine for the Adler OSINT username-search tool — runtime-agnostic, embed-friendly.
The CLI lives in adler-cli; this crate is what you reach for to
drive username detection from your own Rust code (a Discord bot
that checks usernames, a security tool that flags exposed
identities across a watchlist, a CI gate that asserts a name
isn’t claimed elsewhere, …).
§Quick start
Scan the embedded ~439-site registry for one username and print the hits:
use adler_core::{Client, ExecutorOptions, MatchKind, Registry, Username, executor};
let registry = Registry::default_embedded()?;
// filter(include, exclude, tags, exclude_tags, include_nsfw)
// — empty slices = no name/tag filter; `false` keeps the
// default NSFW auto-exclusion (matches Sherlock's `--nsfw`
// opt-in). Pass `true` (or `&["nsfw".into()]` as tags) to
// scan adult-content sites.
let sites = registry.filter(&[], &[], &[], &[], false);
let username = Username::new("torvalds")?;
let client = Client::builder().build()?;
let outcomes =
    executor::run(&client, &sites, &username, ExecutorOptions::default()).await;
for outcome in outcomes.iter().filter(|o| o.kind == MatchKind::Found) {
    println!("{} → {}", outcome.site, outcome.url);
}
§Map of the public API
Detection plumbing:
- Registry: loaded, validated collection of sites. Build from the embedded registry (default_embedded), from a JSON string (from_json_str), or from disk (load_from_path).
- Site, Signal, UrlTemplate, Extractor, KnownPresent: site-registry value types. Site is serde-(de)serialisable; the JSON Schema lives in docs/sites.schema.json.
- Username: validated search target. Constructed via Username::new; invalid characters and overlong names are rejected at construction time.
- Client, ClientBuilder: reqwest-backed probe issuer. Knobs the builder exposes: timeout, redirect limit, per-host / global throttle, retry policy, user-agent rotation pool, proxy, robots.txt cache, browser backend, browser budget.
- CheckOutcome, MatchKind, UncertainReason: verdict types. The signal pipeline is negative-priority: any NotFound vote wins over Found; no votes → Uncertain. A per-site regex_check mismatch short-circuits with UncertainReason::UsernameNotAllowed before any HTTP request.
- executor: bounded-concurrency fan-out runner. Pass an ExecutorOptions to control concurrency, deadline, and progress callback.
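The negative-priority rule can be illustrated with a small standalone sketch. It does not use the crate's types: Vote, Verdict, and combine are hypothetical names introduced here, and the crate's actual pipeline may carry more information per signal.

```rust
/// Hypothetical per-signal vote; illustrative only, not a type from the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Vote {
    Found,
    NotFound,
}

/// Hypothetical combined verdict with the three outcomes named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Found,
    NotFound,
    Uncertain,
}

/// Negative-priority combination: a single NotFound vote overrides any
/// number of Found votes, and an empty vote list yields Uncertain.
pub fn combine(votes: &[Vote]) -> Verdict {
    if votes.contains(&Vote::NotFound) {
        Verdict::NotFound
    } else if votes.contains(&Vote::Found) {
        Verdict::Found
    } else {
        Verdict::Uncertain
    }
}
```

Checking NotFound first is what makes the rule "negative-priority": one page that says "no such user" is treated as stronger evidence than any number of weaker positive hints.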
Optional analysis:
- correlate: group accounts that look like the same person across sites via enriched profile fields.
- permute: generate username variants (alice → alice1, alice.dev, …), capped at MAX_VARIANTS, with the expansion level set by PermuteLevel.
- doctor: registry health check (check_site), signature derivation (suggest_fix), known-present discovery (discover_known_present), site scaffolding (scaffold_site).
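To make the "deduplicated, capped list of variants" idea concrete, here is a minimal standalone sketch. It is not the crate's permute: the suffix list, the cap value SKETCH_MAX_VARIANTS, and the function name permute_sketch are all assumptions made for illustration, and the real rules per PermuteLevel are not described in this page.

```rust
use std::collections::HashSet;

/// Hypothetical cap for this sketch; the crate's real MAX_VARIANTS value
/// is not stated in this page.
const SKETCH_MAX_VARIANTS: usize = 3;

/// Expand `username` into a deduplicated, capped list of variants.
/// The original name comes first and counts toward the cap.
pub fn permute_sketch(username: &str) -> Vec<String> {
    // Illustrative suffixes only. The repeated "1" exercises deduplication.
    let suffixes = ["1", "1", ".dev", "_"];
    let mut seen = HashSet::new();
    let mut out = Vec::new();

    let candidates = std::iter::once(username.to_string())
        .chain(suffixes.iter().map(|s| format!("{}{}", username, s)));

    for candidate in candidates {
        if out.len() >= SKETCH_MAX_VARIANTS {
            break; // hard cap reached, drop the remaining candidates
        }
        if seen.insert(candidate.clone()) {
            out.push(candidate); // first time we see this variant
        }
    }
    out
}
```

With the cap set to 3, permute_sketch("alice") keeps the original, skips the duplicate alice1, and stops before alice_ is considered.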
Bot-protected sites (Instagram, X/Twitter today):
- BrowserBackend trait: abstract real-Chrome driver. Configurable on the Client via ClientBuilder::browser. Built-in implementations: browser::local::LocalBackend (free, via chromiumoxide) and browser::browserbase::BrowserbaseBackend (cloud, residential IPs, in-tree raw async CDP client).
- BrowserBudget caps browser-routed fetches per scan to keep cost predictable.
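A per-scan budget of this kind can be sketched as a shared atomic counter. BudgetSketch and try_acquire below are hypothetical names, not the BrowserBudget API, and what the crate does once the budget is spent is not stated here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical per-scan budget; not the crate's BrowserBudget type.
pub struct BudgetSketch {
    remaining: AtomicUsize,
}

impl BudgetSketch {
    pub fn new(limit: usize) -> Self {
        Self {
            remaining: AtomicUsize::new(limit),
        }
    }

    /// Try to reserve one browser-routed fetch. Returns false once the
    /// budget is spent, so the caller can skip the browser for this probe.
    pub fn try_acquire(&self) -> bool {
        self.remaining
            // checked_sub returns None at zero, which aborts the update.
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok()
    }
}
```

An atomic counter lets concurrent probes share one budget without a lock, which fits a fan-out runner where many sites are checked at once.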
§Cache
Cache persists per-(site, username, signal-signature) verdicts
between runs. Compose with Client via the builder or skip
entirely for one-shot scans.
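The per-(site, username, signal-signature) keying can be sketched with a plain HashMap. Everything below is hypothetical: CacheKey, CachedVerdict, and CacheSketch are illustrative names, the signature is assumed to be some digest of the site's signals, and persistence to disk is left out.

```rust
use std::collections::HashMap;

/// Hypothetical cache key: one entry per (site, username, signal signature).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CacheKey {
    pub site: String,
    pub username: String,
    /// Assumed to be a digest of the site's signals, so that editing a
    /// site's signature stops old verdicts from matching.
    pub signal_signature: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CachedVerdict {
    Found,
    NotFound,
}

/// In-memory only; this sketch does not write anything to disk.
#[derive(Default)]
pub struct CacheSketch {
    entries: HashMap<CacheKey, CachedVerdict>,
}

impl CacheSketch {
    pub fn get(&self, key: &CacheKey) -> Option<CachedVerdict> {
        self.entries.get(key).copied()
    }

    pub fn put(&mut self, key: CacheKey, verdict: CachedVerdict) {
        self.entries.insert(key, verdict);
    }
}
```

Including the signature in the key means a verdict recorded under an old detection rule is simply never looked up again after the rule changes.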
§Error model
Result is a Result<T, Error> alias; Error is a single
crate-level thiserror enum. The probe path never surfaces
errors — transient network failures become
MatchKind::Uncertain with a typed UncertainReason, so
you get a partial result for every site even when the network is
flaky. Loader errors (malformed registry JSON, invalid CSS
selectors, regex compile failures) come back as Err.
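The "probe path never surfaces errors" idea can be illustrated with a standalone sketch that folds a transport failure into an inconclusive verdict. ProbeFailure, ReasonSketch, KindSketch, and fold_probe are hypothetical names; how the crate actually attaches an UncertainReason to an outcome is not shown in this page.

```rust
/// Hypothetical transport failures for this sketch.
#[derive(Debug)]
pub enum ProbeFailure {
    Timeout,
    ConnectionReset,
}

/// Hypothetical typed reason for an inconclusive probe.
#[derive(Debug, PartialEq, Eq)]
pub enum ReasonSketch {
    Timeout,
    Network,
}

/// Hypothetical verdict; the Uncertain variant carries its reason here.
#[derive(Debug, PartialEq, Eq)]
pub enum KindSketch {
    Found,
    NotFound,
    Uncertain(ReasonSketch),
}

/// Fold a probe result into a verdict so the caller always receives one
/// outcome per site instead of an Err.
pub fn fold_probe(result: Result<bool, ProbeFailure>) -> KindSketch {
    match result {
        Ok(true) => KindSketch::Found,
        Ok(false) => KindSketch::NotFound,
        Err(ProbeFailure::Timeout) => KindSketch::Uncertain(ReasonSketch::Timeout),
        Err(ProbeFailure::ConnectionReset) => KindSketch::Uncertain(ReasonSketch::Network),
    }
}
```

The caller then iterates over verdicts without any error handling in the loop, and can still inspect the reason to decide whether a retry is worthwhile.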
§Version history
Pre-1.0 SemVer. Breaking changes since 0.1:
- 0.2.0: added Site::request_headers (BTreeMap<String, String>); BrowserBackend::fetch gained the headers parameter; browser module became pub.
- 0.3.0: Site::known_present changed from Option<String> to Option<KnownPresent> (the new enum accepts string-or-array via untagged serde); DoctorReport::Healthy::present and Unhealthy::present changed from Option<CheckOutcome> to Vec<(String, CheckOutcome)> (one entry per probed candidate).
- 0.4.0: Registry::filter gained a fifth include_nsfw: bool parameter (default-exclude adult sites); UncertainReason gained UsernameNotAllowed; Site::regex_check field added (per-site username regex).
Each change has a migration block in the CHANGELOG.
Re-exports§
pub use browser::BrowserBackend;
pub use browser::BrowserBudget;
pub use browser::RenderedPage;
pub use doctor::DoctorReport;
pub use doctor::FixSuggestion;
pub use executor::ExecutorOptions;
Modules§
- browser: Browser backend for pages that are unusable from raw HTTP.
- doctor: Site signature health check.
- executor: Concurrent fan-out runner for site probes.
Structs§
- Cache: In-memory cache backed by a JSON file.
- CheckOutcome: Result of probing a single site for a username.
- Client: HTTP client used to probe sites.
- ClientBuilder: Builder for Client.
- Cluster: A group of accounts that likely belong to the same person.
- CorrelationReport: Result of correlating a scan’s outcomes.
- Engine: Shared detection signature template for a family of sites that run the same forum / blog / wiki software (Discourse, vBulletin, XenForo, MediaWiki, …). Referenced from Site::engine.
- Extractor: A rule for extracting one profile field from a page.
- RawResponse: Raw response data returned by Client::fetch for diagnostics.
- Registry: A loaded, validated collection of site definitions.
- Site: One site we can probe for the existence of an account.
- UrlTemplate: URL template containing a {username} placeholder.
- Username: A validated username.
Enums§
- Error: Errors produced by the Adler engine.
- KnownPresent: Known-present declaration on a Site.
- MatchKind: Outcome of a single site probe.
- PermuteLevel: How aggressively to expand a username into variants.
- Signal: A single piece of evidence about whether an account exists.
- UncertainReason: Why a probe was inconclusive.
Constants§
- DEFAULT_BROWSER_BUDGET: Default ceiling on browser-backed probes per scan when no other value is specified.
- LINK_THRESHOLD: Minimum pairwise score to link two accounts.
- MAX_VARIANTS: Hard cap on the number of variants returned (including the original).
Functions§
- correlate: Correlate Found accounts by their enrichment fields.
- permute: Expand username into a deduplicated list of variants per level.
Type Aliases§
- Result: Result alias used throughout the engine.