//! Core engine for the [Adler](https://github.com/commit3296/adler)
//! OSINT username-search tool — runtime-agnostic, embed-friendly.
//!
//! The CLI lives in `adler-cli`; this crate is what you reach for to
//! drive username detection from your own Rust code (a Discord bot
//! that checks usernames, a security tool that flags exposed
//! identities across a watchlist, a CI gate that asserts a name
//! isn't claimed elsewhere, …).
//!
//! ## Quick start
//!
//! Scan the embedded ~439-site registry for one username and print
//! the hits:
//!
//! ```no_run
//! use adler_core::{Client, ExecutorOptions, MatchKind, Registry, Username, executor};
//!
//! # async fn run() -> adler_core::Result<()> {
//! let registry = Registry::default_embedded()?;
//!
//! // filter(include, exclude, tags, exclude_tags, include_nsfw)
//! // — empty slices = no name/tag filter; `false` keeps the
//! // default NSFW auto-exclusion (matches Sherlock's `--nsfw`
//! // opt-in). Pass `true` (or `&["nsfw".into()]` as tags) to
//! // scan adult-content sites.
//! let sites = registry.filter(&[], &[], &[], &[], false);
//!
//! let username = Username::new("torvalds")?;
//! let client = Client::builder().build()?;
//!
//! let outcomes =
//!     executor::run(&client, &sites, &username, ExecutorOptions::default()).await;
//!
//! for outcome in outcomes.iter().filter(|o| o.kind == MatchKind::Found) {
//!     println!("{} → {}", outcome.site, outcome.url);
//! }
//! # Ok(())
//! # }
//! ```
//!
//! ## Map of the public API
//!
//! Detection plumbing:
//!
//! - [`Registry`] — loaded, validated collection of sites. Build from
//! the embedded [`default_embedded`](Registry::default_embedded),
//! from a JSON string ([`from_json_str`](Registry::from_json_str)),
//! or from disk ([`load_from_path`](Registry::load_from_path)).
//! - [`Site`], [`Signal`], [`UrlTemplate`], [`Extractor`],
//! [`KnownPresent`] — site-registry value types. `Site` is
//! serde-(de)serialisable; the JSON Schema lives in `docs/sites.schema.json`.
//! - [`Username`] — validated search target. Constructed via
//! [`Username::new`](Username::new); invalid characters / overlong
//! names are rejected at construction time.
//! - [`Client`], [`ClientBuilder`] — `reqwest`-backed probe issuer.
//! Knobs the builder exposes: timeout, redirect limit, per-host /
//! global throttle, retry policy, user-agent rotation pool, proxy,
//! `robots.txt` cache, browser backend, browser budget.
//! - [`CheckOutcome`], [`MatchKind`], [`UncertainReason`] — verdict
//! types. The signal pipeline is *negative-priority*: any
//! `NotFound` vote wins over `Found`; no votes → `Uncertain`. A
//! per-site `regex_check` mismatch short-circuits with
//! [`UncertainReason::UsernameNotAllowed`] before any HTTP request.
//! - [`executor`] — bounded-concurrency fan-out runner. Pass an
//! [`ExecutorOptions`] to control concurrency, deadline, and
//! progress callback.
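//!
//! The negative-priority rule described for the verdict types can be
//! sketched in isolation. This is a minimal, self-contained illustration
//! and not the crate's implementation: `Vote`, `Verdict`, and `resolve`
//! are hypothetical names introduced only for this example.
//!
//! ```rust
//! /// Hypothetical per-signal vote (not an `adler_core` type).
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Vote {
//!     Found,
//!     NotFound,
//! }
//!
//! /// Hypothetical aggregate verdict (not an `adler_core` type).
//! #[derive(Debug, PartialEq, Eq)]
//! enum Verdict {
//!     Found,
//!     NotFound,
//!     Uncertain,
//! }
//!
//! /// Any `NotFound` vote wins; otherwise any `Found` vote wins;
//! /// an empty vote list yields `Uncertain`.
//! fn resolve(votes: &[Vote]) -> Verdict {
//!     if votes.contains(&Vote::NotFound) {
//!         Verdict::NotFound
//!     } else if votes.contains(&Vote::Found) {
//!         Verdict::Found
//!     } else {
//!         Verdict::Uncertain
//!     }
//! }
//! ```
//!
//! Under these assumptions, `resolve(&[Vote::Found, Vote::NotFound])`
//! yields `Verdict::NotFound`, and `resolve(&[])` yields
//! `Verdict::Uncertain`.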
//!
//! Optional analysis:
//!
//! - [`correlate`] — group accounts that look like the same person
//! across sites via [`enriched`](crate::correlate::correlate)
//! profile fields.
//! - [`permute`] — generate username variants
//! (alice → alice1, alice.dev, …), capped at [`MAX_VARIANTS`] and
//! controlled by [`PermuteLevel`].
//! - [`doctor`] — registry health check
//! ([`check_site`](crate::doctor::check_site)), signature
//! derivation ([`suggest_fix`](crate::doctor::suggest_fix)),
//! known-present discovery
//! ([`discover_known_present`](crate::doctor::discover_known_present)),
//! site scaffolding ([`scaffold_site`](crate::doctor::scaffold_site)).
//!
//! Bot-protected sites (Instagram, X/Twitter today):
//!
//! - [`BrowserBackend`] trait — abstract real-Chrome driver.
//! Configurable on the [`Client`] via
//! [`ClientBuilder::browser`](ClientBuilder::browser). Built-in
//! implementations: [`browser::local::LocalBackend`] (free, via
//! `chromiumoxide`) and
//! [`browser::browserbase::BrowserbaseBackend`] (cloud, residential
//! IPs, in-tree raw async CDP client). [`BrowserBudget`] caps
//! browser-routed fetches per scan to keep cost predictable.
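//!
//! The idea of capping browser-routed fetches per scan can be sketched
//! with an atomic counter. This is only an illustration under assumed
//! names: `FetchBudget` and `try_acquire` are hypothetical and do not
//! describe how [`BrowserBudget`] is actually implemented.
//!
//! ```rust
//! use std::sync::atomic::{AtomicUsize, Ordering};
//!
//! /// Hypothetical budget: at most `limit` browser fetches per scan.
//! struct FetchBudget {
//!     remaining: AtomicUsize,
//! }
//!
//! impl FetchBudget {
//!     fn new(limit: usize) -> Self {
//!         Self { remaining: AtomicUsize::new(limit) }
//!     }
//!
//!     /// Claims one fetch slot; returns `false` once the budget is spent.
//!     /// `checked_sub` returns `None` at zero, which aborts the update.
//!     fn try_acquire(&self) -> bool {
//!         self.remaining
//!             .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
//!             .is_ok()
//!     }
//! }
//! ```
//!
//! A caller in this sketch would check `try_acquire()` before each
//! browser fetch and fall back to a non-browser path (or skip the site)
//! when it returns `false`.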
//!
//! ## Cache
//!
//! [`Cache`] persists per-(site, username, signal-signature) verdicts
//! between runs. Attach it to a [`Client`] through the builder, or skip
//! it entirely for one-shot scans.
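//!
//! To illustrate the keying scheme only, here is an in-memory sketch of
//! a store keyed by (site, username, signal-signature). The names
//! `VerdictCache`, `Verdict`, and the `u64` signature are assumptions
//! made for this example; the real [`Cache`] persists to disk and its
//! API may differ.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical cached verdict (not an `adler_core` type).
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! enum Verdict {
//!     Found,
//!     NotFound,
//! }
//!
//! /// Hypothetical key: site name, username, signature of the site's signals.
//! type Key = (String, String, u64);
//!
//! #[derive(Default)]
//! struct VerdictCache {
//!     entries: HashMap<Key, Verdict>,
//! }
//!
//! impl VerdictCache {
//!     fn put(&mut self, site: &str, user: &str, signature: u64, verdict: Verdict) {
//!         self.entries
//!             .insert((site.to_owned(), user.to_owned(), signature), verdict);
//!     }
//!
//!     fn get(&self, site: &str, user: &str, signature: u64) -> Option<Verdict> {
//!         self.entries
//!             .get(&(site.to_owned(), user.to_owned(), signature))
//!             .copied()
//!     }
//! }
//! ```
//!
//! In this sketch a lookup with a different signature misses, which is
//! one plausible reason to include the signature in the key: an entry
//! recorded under old signals is not reused after the signals change.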
//!
//! ## Error model
//!
//! [`Result`] is a `Result<T, Error>` alias; [`Error`] is a single
//! crate-level `thiserror` enum. The probe path *never* surfaces
//! errors — transient network failures become
//! [`MatchKind::Uncertain`] with a typed [`UncertainReason`], so
//! you get a partial result for every site even when the network is
//! flaky. Loader errors (malformed registry JSON, invalid CSS
//! selectors, regex compile failures) come back as `Err`.
//!
//! ## Version history
//!
//! Pre-1.0 `SemVer`. Breaking changes since 0.1:
//!
//! - **0.2.0** — added [`Site::request_headers`] (`BTreeMap<String,
//! String>`); [`BrowserBackend::fetch`] gained the `headers`
//! parameter; [`browser`] module became `pub`.
//! - **0.3.0** — [`Site::known_present`] changed from
//! `Option<String>` to `Option<KnownPresent>` (the new enum
//! accepts string-or-array via untagged serde);
//! [`DoctorReport::Healthy::present`] and
//! `Unhealthy::present` changed from `Option<CheckOutcome>` to
//! `Vec<(String, CheckOutcome)>` (one entry per probed candidate).
//! - **0.4.0** — [`Registry::filter`] gained a fifth
//! `include_nsfw: bool` parameter (default-exclude adult sites);
//! [`UncertainReason`] gained `UsernameNotAllowed`;
//! [`Site::regex_check`] field added (per-site username regex).
//!
//! Each change has a migration block in [the
//! CHANGELOG](https://github.com/commit3296/adler/blob/main/CHANGELOG.md).
pub use ;
pub use Cache;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ExecutorOptions;
pub use ;
pub use Registry;
pub use ;
pub use Username;