pub struct Registry { /* private fields */ }
A loaded, validated collection of site definitions.
Engines (shared signature templates referenced by Site::engine)
are resolved into sites at load time — by the time you call
Registry::sites every entry already has its inherited
signals / request_headers / regex_check materialised. The original
Engine objects are kept on the registry for re-export and
inspection via Registry::engines.
Implementations
impl Registry
pub fn default_embedded() -> Result<Self>
Load the default site list embedded into the crate at build time.
pub fn default_embedded_with_wmn() -> Result<Self>
Load the default site list plus the WhatsMyName-derived
supplementary set. WhatsMyName data is licensed CC BY-SA 4.0
(see LICENSE-CC-BY-SA-4.0 at the repo root); enabling this
path means downstream redistribution of the merged scan data
must respect the ShareAlike obligation. Sites contributed by
the WhatsMyName tranche carry the source:wmn tag for
provenance.
Engines from the WMN tranche merge with the MIT tranche; case-insensitive site-name collisions resolve in favour of the MIT-tranche entry (the hand-curated Sherlock/Maigret-derived signature wins; the WMN duplicate is dropped). Returns an error only if either tranche fails its own validation — engine references are checked across the merged set.
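The collision rule can be illustrated with a minimal sketch. The `Site` struct and the `merge_tranches` helper below are hypothetical stand-ins written for this example, not the crate's real types or internals; the sketch only assumes that names are compared after lowercasing and that the first (MIT) tranche always wins.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Site`; only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
struct Site {
    name: String,
    tags: Vec<String>,
}

/// Merge two tranches. When a supplementary (WMN) site name collides
/// case-insensitively with a primary (MIT) one, the primary entry is kept
/// and the supplementary duplicate is dropped.
fn merge_tranches(primary: Vec<Site>, supplementary: Vec<Site>) -> Vec<Site> {
    // Lowercased names already claimed by the primary tranche.
    let mut seen: HashSet<String> = primary.iter().map(|s| s.name.to_lowercase()).collect();
    let mut merged = primary;
    for site in supplementary {
        // `insert` returns false when the lowercased name is already present.
        if seen.insert(site.name.to_lowercase()) {
            merged.push(site);
        }
    }
    merged
}
```

With a primary entry named "GitHub" and a supplementary entry named "github", only the primary entry survives in the merged list.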
pub fn from_json_str(json: &str) -> Result<Self>
Parse and validate a registry from a JSON string. Engine
references on each site are resolved before validation;
a site that names an engine which doesn’t exist in the
engines block fails loading with Error::InvalidSite.
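As a rough illustration of load-time resolution, the sketch below copies inherited fields from an engine into each site that references it and fails on an unknown engine name. All types here (`Engine`, `Site`, `ResolveError`) are trimmed-down, hypothetical stand-ins, and the "site value wins, engine fills the gaps" inheritance rule is an assumption; the crate's actual precedence may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down engine template.
#[derive(Debug, Clone, Default)]
struct Engine {
    signals: Vec<String>,
    regex_check: Option<String>,
}

/// Hypothetical, trimmed-down site entry.
#[derive(Debug, Clone, Default)]
struct Site {
    name: String,
    engine: Option<String>,
    signals: Vec<String>,
    regex_check: Option<String>,
}

/// Stand-in for the crate's `Error::InvalidSite`.
#[derive(Debug, PartialEq)]
enum ResolveError {
    InvalidSite(String),
}

/// Materialise inherited fields on every site that names an engine.
fn resolve_engines(
    sites: &mut [Site],
    engines: &BTreeMap<String, Engine>,
) -> Result<(), ResolveError> {
    for site in sites.iter_mut() {
        let engine_name = match site.engine.as_deref() {
            Some(name) => name,
            None => continue, // site does not inherit from any engine
        };
        // An unknown engine name is a hard load error.
        let engine = engines.get(engine_name).ok_or_else(|| {
            ResolveError::InvalidSite(format!("{}: unknown engine `{}`", site.name, engine_name))
        })?;
        // Assumption: values set on the site itself take precedence.
        if site.signals.is_empty() {
            site.signals = engine.signals.clone();
        }
        if site.regex_check.is_none() {
            site.regex_check = engine.regex_check.clone();
        }
    }
    Ok(())
}
```

After a successful call, callers can read `signals` and `regex_check` straight off each site without consulting the engine map again.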
pub fn engines(&self) -> &BTreeMap<String, Engine>
Inheritable engine templates, keyed by name. Useful for introspection and for serialising the registry back out; detection paths read the resolved fields off the sites directly and don’t need to consult this map.
pub fn load_from_path(path: impl AsRef<Path>) -> Result<Self>
Read a registry from a JSON file.
pub fn is_empty(&self) -> bool
True if the registry has no sites. Always false after a successful load, since loading rejects an empty registry.
pub fn filter(
    &self,
    include: &[String],
    exclude: &[String],
    tags: &[String],
    exclude_tags: &[String],
    include_nsfw: bool,
) -> Vec<Site>
Apply include/exclude name filters and a tag filter.
- If include is non-empty, only sites whose name contains at least one include term (case-insensitive substring) are kept.
- Sites whose name contains any exclude term are dropped.
- If tags is non-empty, only sites carrying at least one of the requested tags are kept (case-insensitive). A site with no tags is therefore dropped by a tag filter: asking for --tag social means “only social-tagged sites”.
- Sites carrying any tag in exclude_tags are dropped (e.g. --exclude-tag bot-protected for a fast clean run).
- NSFW sites (the nsfw tag) are auto-excluded unless include_nsfw is true or tags explicitly asks for nsfw. This matches Sherlock’s --nsfw opt-in pattern and prevents the default adler <username> from surfacing adult-site URLs the user didn’t ask for.
- Sites are returned by value (cloned) so the result is independent of the registry’s lifetime, which is convenient for handing to the executor.
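The rules above can be sketched as a single predicate over a slice of sites. This is an illustrative reimplementation, not the crate's code: `Site`, `site`, `has_tag`, and `filter_sites` are hypothetical names, and the sketch assumes that exclude terms and exclude_tags are matched case-insensitively, which the description states only for include terms and tags.

```rust
/// Hypothetical stand-in for the crate's `Site`.
#[derive(Debug, Clone, PartialEq)]
struct Site {
    name: String,
    tags: Vec<String>,
}

/// Convenience constructor for examples.
fn site(name: &str, tags: &[&str]) -> Site {
    Site {
        name: name.to_string(),
        tags: tags.iter().map(|t| t.to_string()).collect(),
    }
}

/// Case-insensitive tag membership test.
fn has_tag(site: &Site, wanted: &str) -> bool {
    site.tags.iter().any(|t| t.eq_ignore_ascii_case(wanted))
}

fn filter_sites(
    sites: &[Site],
    include: &[String],
    exclude: &[String],
    tags: &[String],
    exclude_tags: &[String],
    include_nsfw: bool,
) -> Vec<Site> {
    // NSFW is opt-in: either the flag or an explicit `nsfw` tag request.
    let nsfw_allowed = include_nsfw || tags.iter().any(|t| t.eq_ignore_ascii_case("nsfw"));
    sites
        .iter()
        .filter(|s| {
            let name = s.name.to_lowercase();
            // Case-insensitive substring match on the site name.
            let name_has = |term: &String| name.contains(term.to_lowercase().as_str());
            (include.is_empty() || include.iter().any(|t| name_has(t)))
                && !exclude.iter().any(|t| name_has(t))
                && (tags.is_empty() || tags.iter().any(|t| has_tag(s, t)))
                && !exclude_tags.iter().any(|t| has_tag(s, t))
                && (nsfw_allowed || !has_tag(s, "nsfw"))
        })
        .cloned() // return owned copies, independent of the input slice
        .collect()
}
```

Note how an untagged site passes when `tags` is empty but is dropped as soon as any tag is requested, which mirrors the "only social-tagged sites" behaviour described in the list.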
pub fn tag_counts(&self) -> Vec<(String, usize)>
Distinct tags across all sites, sorted, with the count of sites
carrying each. Powers --list-tags.
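One plausible way to compute such counts is with a `BTreeMap`, which keeps its keys ordered. The free function below is a hypothetical sketch that takes each site's tag list directly; it assumes "sorted" means sorted by tag name (the description does not say whether the order is by name or by count) and that a site does not list the same tag twice.

```rust
use std::collections::BTreeMap;

/// Count how many sites carry each tag. `site_tags` holds one tag list per site.
/// The result is ordered by tag name because `BTreeMap` iterates in key order.
fn tag_counts(site_tags: &[Vec<String>]) -> Vec<(String, usize)> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for tags in site_tags {
        for tag in tags {
            // Insert the tag with a zero count on first sight, then increment.
            *counts.entry(tag.clone()).or_insert(0) += 1;
        }
    }
    counts.into_iter().collect()
}
```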