Stargazer screening — bulk-fetch a repo’s stargazers and their public repo portfolios over cheap REST, classify each person into an archetype, and render a compact overview the agent can drill into.
Cost model (the whole point): one REST request per stargazer-page + one per stargazer for their owned-repos list. No GraphQL, no per-repo calls, no READMEs in the bulk pass. The raw payloads (~550 KB for a prolific user) are projected down to a handful of fields before anything leaves Rust; the agent-facing overview stays ~2 KB regardless of how many people starred the repo.
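As a rough illustration of the cost model above, the request count for the bulk pass can be estimated as follows. This is a sketch, not the crate's code: bulk_request_count and its per_page parameter are hypothetical names, and it assumes each stargazer's owned-repos list costs exactly one request, as the description states.

```rust
/// Estimated REST requests for one bulk pass (hypothetical helper, not part of the crate).
///
/// One request per page of stargazers, plus one request per stargazer for
/// their owned-repos list. `per_page` is the stargazer page size requested.
fn bulk_request_count(stargazers: u64, per_page: u64) -> u64 {
    assert!(per_page > 0, "per_page must be positive");
    // Ceiling division: a partially filled last page still costs a request.
    let pages = (stargazers + per_page - 1) / per_page;
    pages + stargazers
}
```

With 250 stargazers and a page size of 100, this gives 3 page requests plus 250 portfolio requests, 253 in total.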
The value-bearing logic (project_repo, profile_user,
build_overview) is pure over already-fetched JSON, so it unit-tests
without the network. screen_repo is the orchestrator that fetches.
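The sketch below illustrates why a pure projection step tests without the network: it takes an already-parsed repo object and returns only the cheap fields. It is an assumption-laden stand-in, not the real project_repo: RawRepo, RepoLiteSketch, and project_repo_sketch are hypothetical names, the kept fields are a guess at what "cheap" means, and the actual function presumably works on JSON values rather than a typed struct.

```rust
/// Stand-in for a parsed GitHub repo object. Real payloads carry many more
/// fields (URLs, owner blob, permissions); only a few are modelled here.
#[derive(Debug, Clone)]
struct RawRepo {
    name: String,
    html_url: String, // present in the payload, dropped by the projection
    stargazers_count: u64,
    language: Option<String>,
    fork: bool,
    pushed_at: String,
    size: u64, // GitHub reports repository size in kilobytes
}

/// Hypothetical projected form: the handful of fields kept per repo.
#[derive(Debug, Clone, PartialEq)]
struct RepoLiteSketch {
    name: String,
    stars: u64,
    language: Option<String>,
    fork: bool,
    pushed_at: String,
    size_kb: u64,
}

/// Pure function: no I/O, so a test only needs a hand-built `RawRepo`.
fn project_repo_sketch(raw: &RawRepo) -> RepoLiteSketch {
    RepoLiteSketch {
        name: raw.name.clone(),
        stars: raw.stargazers_count,
        language: raw.language.clone(),
        fork: raw.fork,
        pushed_at: raw.pushed_at.clone(),
        size_kb: raw.size,
    }
}
```

Because the function never touches the network, a unit test can construct a RawRepo literal and assert on the projected fields directly.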
Structs
- CachedScreen - One cached screen: the enriched profiles, run metadata, and the effective config (with any auto-derived keywords/stack) so a re-rank can reuse them.
- Filters - Conjunctive (AND) filters on the metric axes. Every predicate that is set must pass. Absolute thresholds apply where the agent has intuition (keywords, stars, dates); percentile thresholds apply on the normalized axes.
- RepoLite - The cheap fields we keep from a repo object; everything else (URLs, owner blob, permissions) is dropped at projection time.
- Scores - Normalized 0–1 metric vector for a person, percentile-ranked within the screened set. The basis axes for ranking and fan-out gating: they disagree with each other (effort ≠ popularity ≠ recency ≠ relatedness), which is what makes each a real axis rather than a derived view.
- ScreenConfig - Tuning knobs for a screen. Defaults are chosen for the kglite-scale case (dozens to low hundreds of stargazers).
- ScreenMeta - Run metadata returned alongside the profiles: what was auto-derived, how much was screened, and whether the result is partial. Drives the overview header and the scale/cost story.
- ScreenStore - In-memory store of screened profile sets, keyed by seed repo. Lives for the server’s lifetime; the stargazer-screen analogue of ElementCache.
- Selection - A complete selection: the filter predicate set, the ranking axis, and a human label. Drives both a focused overview view and the fan-out gate.
- UserProfile - One classified stargazer.
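The percentile ranking that Scores describes can be sketched as below. This is one of several common conventions and is not taken from the crate: percentile_ranks is a hypothetical name, ties share the lower rank, and a set of zero or one values maps to 0.0. The quadratic loop keeps the sketch short; a sort-based version would suit larger sets.

```rust
/// Percentile-rank each raw value within the set, yielding values in 0.0..=1.0.
///
/// rank(v) = (number of values strictly below v) / (n - 1), so the minimum
/// maps to 0.0 and a unique maximum maps to 1.0. Tied values get equal ranks.
fn percentile_ranks(values: &[f64]) -> Vec<f64> {
    let n = values.len();
    if n <= 1 {
        // No peers to compare against; an assumption of this sketch.
        return vec![0.0; n];
    }
    values
        .iter()
        .map(|v| {
            let below = values.iter().filter(|other| **other < *v).count();
            below as f64 / (n - 1) as f64
        })
        .collect()
}
```

Ranking each axis this way puts stars, repo size, and push recency on the same 0–1 scale, which is what lets percentile thresholds be compared across axes.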
Enums
- Archetype
- RankBy
- Scale - Disclosure scale, borrowed from kglite’s GraphScale. The detail level adapts to how many stargazers there are: a small set shows every notable person inline; a huge set collapses to statistics + drill.
- Seed - What to screen: a repo (screen its stargazers) or an explicit set of users (screen them directly). The people-set differs; everything downstream (projection, classification, enrichment, scoring, selection) is identical.
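To make the scale-adaptive disclosure idea concrete, here is a minimal sketch. Everything in it is hypothetical: the variant names, the middle tier, the thresholds (50 and 500), and the inline cap of 10 are illustrative assumptions, not the values or variants the real Scale enum uses.

```rust
/// Hypothetical disclosure tiers; the real `Scale` variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScaleSketch {
    Small,
    Medium,
    Large,
}

/// Pick a tier from the stargazer count (thresholds are assumed, not sourced).
fn scale_for(stargazers: usize) -> ScaleSketch {
    match stargazers {
        0..=50 => ScaleSketch::Small,
        51..=500 => ScaleSketch::Medium,
        _ => ScaleSketch::Large,
    }
}

/// How many notable people to list inline at a given tier.
/// Small: everyone; Medium: a capped shortlist; Large: statistics only.
fn inline_limit(scale: ScaleSketch, notable: usize) -> usize {
    match scale {
        ScaleSketch::Small => notable,
        ScaleSketch::Medium => notable.min(10),
        ScaleSketch::Large => 0,
    }
}
```

The point of the design is that output size is bounded by the tier rather than by the stargazer count, so the overview stays compact as the set grows.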
Functions
- archetype_from_key - Resolve a cohort:<key> handle back to its archetype.
- build_overview - Build the compact overview, the agent-facing digest. Scale-adaptive: inventory first (the cohort counts), then detail whose depth shrinks as the stargazer count grows.
- derive_config - Auto-derive relevance keywords + stack from the seed repo itself, so a caller can call screen_stargazers(repo) with no hand-tuned config. Topics and significant description words become keywords; the top languages (by bytes) become the stack. Returns (keywords, stack); empty on error.
- drill - Resolve a drill element_id against an already-screened profile set. Mirrors github_issues’ element_id convention: cohort:<key> · user:<login> · user:<login>/repo:<name> · user:<login>/repo:<name>/readme (the only drill that costs a request).
- fetch_contributions - [item 7] Repos a user contributes to but does not own, from their recent public events; surfaces relevance the owned-repo list can’t see.
- fetch_portfolio - Fetch + project one user’s owned repos. Returns (repos, capped).
- fetch_readme - Fetch a repo’s README and compact it to a short headline gist. This is the only drill that costs a request, and it’s shortlist-only by design.
- fetch_stargazer_logins - Fetch a repo’s stargazer logins, most-recent first (owner excluded). Uncapped; the caller samples [item 9].
- fetch_user_reach - [item 6] Follower count for one user; a reach signal for outreach.
- find_adopters - [item 2] Find which of logins actually depend on the seed package. Code-searches dependency manifests for the package name, intersects the owners with the stargazer set, and verifies a bounded (whole-word) match against the real manifest line to reject substring collisions (e.g. pkglite ⊃ kglite). Returns login → evidence line.
- normalize_scores - Compute the normalized score vector for every profile from cheap raw signals: relatedness (keyword score), popularity (stars+followers), effort (repo size + breadth), recency (latest push). Percentile-ranked across the set so axes are comparable. Call after enrichment.
- preset - Named preset = a bundled (filters, rank) for a common goal. Keeps the pipeline usable cold; raw filters are the power layer.
- probe_colocation - [item 4] Count a dev’s original repos that combine all stack languages in one repo (the true PyO3/maturin co-location signal). Probes /languages on stack-language repos, bounded to keep it cheap.
- profile_user - Classify a user from their (already projected) owned repos.
- project_repo - Project a raw GitHub repo object down to the cheap fields.
- render_cohort - Members of a cohort, rendered in full (the cohort:<name> drill).
- render_repo - Render a single repo profile (drill-down level 2). The cheap version: everything here is already in the cache, no new request.
- render_selection - Render a focused selection view (filter → rank → take), with explicit empty-result guidance rather than a silent empty section.
- render_user - Render a single user’s full portfolio (drill-down level 1).
- run_screen - Full screen: derive config if needed [item 1], sample stargazers [item 9], fetch portfolios, classify, then enrich the shortlist [items 2/4/6/7]. Returns the profiles, run metadata, and the effective config (so a cached re-rank can reuse the derived keywords/stack).
- screen_dispatch - One-call entry point for the screen_stargazers MCP tool. Without element_id: build (or reuse) the screen and return the overview, re-classified against the current keywords/stack so the agent can re-key a cached fetch for free. With element_id: drill the cached set.
- select - Filter → rank → take. The selection function, reused for the overview view and the fan-out gate (expand only these top-K).
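The bounded (whole-word) match that find_adopters relies on to reject substring collisions can be illustrated as follows. This is a sketch under stated assumptions, not the crate's implementation: contains_whole_word and is_name_char are hypothetical names, the comparison is case-sensitive, and ASCII letters, digits, underscores, and hyphens are assumed to be the characters that can extend a package name.

```rust
/// Characters assumed to be part of a package name (an assumption of this sketch).
fn is_name_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_' || c == '-'
}

/// True if `name` occurs in `line` with no name character directly before
/// or after it, so "pkglite" does not count as a match for "kglite".
fn contains_whole_word(line: &str, name: &str) -> bool {
    if name.is_empty() {
        return false;
    }
    line.match_indices(name).any(|(start, matched)| {
        let end = start + matched.len();
        // The neighbours must be absent (line edge) or non-name characters.
        let before_ok = line[..start]
            .chars()
            .next_back()
            .map_or(true, |c| !is_name_char(c));
        let after_ok = line[end..]
            .chars()
            .next()
            .map_or(true, |c| !is_name_char(c));
        before_ok && after_ok
    })
}
```

Checking the real manifest line this way filters out hits where code search matched a longer package name that merely contains the seed package's name.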