Module screen

Stargazer screening — bulk-fetch a repo’s stargazers and their public repo portfolios over cheap REST, classify each person into an archetype, and render a compact overview the agent can drill into.

Cost model (the whole point): one REST request per stargazer-page + one per stargazer for their owned-repos list. No GraphQL, no per-repo calls, no READMEs in the bulk pass. The raw payloads (~550 KB for a prolific user) are projected down to a handful of fields before anything leaves Rust; the agent-facing overview stays ~2 KB regardless of how many people starred the repo.
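As a rough illustration of the cost model above, the request count for one bulk pass can be estimated as follows. This is a minimal sketch, not the module's code: `estimated_requests` and its `per_page` parameter are hypothetical names, and it assumes each stargazer's owned-repos list is fetched in a single request.

```rust
/// Rough REST request count for one bulk screen: one request per
/// stargazer page plus one per stargazer (their owned-repos list).
/// `per_page` is an assumed page size, not a value taken from the module.
pub fn estimated_requests(stargazers: u64, per_page: u64) -> u64 {
    assert!(per_page > 0, "page size must be positive");
    // Ceiling division: a partially filled last page still costs a request.
    let pages = (stargazers + per_page - 1) / per_page;
    pages + stargazers
}
```

Under these assumptions the cost grows linearly with the stargazer count, and the per-stargazer term dominates the page term.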

The value-bearing logic (project_repo, profile_user, build_overview) is pure over already-fetched JSON, so it unit-tests without the network. screen_repo is the orchestrator that fetches.
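The purity claim above can be sketched with a projection that takes an already-fetched payload and returns only a few fields. Everything named here is an assumption for illustration: `RepoLiteSketch` and `project_repo_sketch` are hypothetical, the flat string map is a stand-in for the parsed JSON object, and the actual fields kept by `RepoLite` are not listed on this page.

```rust
use std::collections::HashMap;

/// Hypothetical projected repo; the real `RepoLite` fields may differ.
#[derive(Debug, PartialEq)]
pub struct RepoLiteSketch {
    pub name: String,
    pub language: Option<String>,
    pub stars: u64,
    pub fork: bool,
}

/// Pure projection: reads an already-fetched payload and touches no network.
/// Keys not read here (URLs, owner blob, permissions) are simply dropped.
pub fn project_repo_sketch(raw: &HashMap<&str, &str>) -> Option<RepoLiteSketch> {
    Some(RepoLiteSketch {
        // A repo without a name is unusable, so projection fails.
        name: raw.get("name")?.to_string(),
        language: raw.get("language").map(|s| s.to_string()),
        // Missing or malformed counts fall back to zero.
        stars: raw
            .get("stargazers_count")
            .and_then(|s| s.parse::<u64>().ok())
            .unwrap_or(0),
        fork: raw.get("fork").map_or(false, |s| *s == "true"),
    })
}
```

Because the function only reads its argument, a unit test can hand it a fixture map and compare the result directly, with no HTTP client involved.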

Structs

CachedScreen
One cached screen: the enriched profiles, run metadata, and the effective config (with any auto-derived keywords/stack) so a re-rank can reuse them.
Filters
Conjunctive (AND) filters on the metric axes: every predicate that is set must pass. Absolute thresholds apply where the agent has intuition (keywords, stars, dates); percentile thresholds apply on the normalized axes.
RepoLite
The cheap fields we keep from a repo object — everything else (URLs, owner blob, permissions) is dropped at projection time.
Scores
Normalized 0–1 metric vector for a person, percentile-ranked within the screened set. The basis axes for ranking and fan-out gating: they disagree with each other (effort ≠ popularity ≠ recency ≠ relatedness), which is what makes each a real axis rather than a derived view.
ScreenConfig
Tuning knobs for a screen. Defaults are chosen for the kglite-scale case (dozens-to-low-hundreds of stargazers).
ScreenMeta
Run metadata returned alongside the profiles — what was auto-derived, how much was screened, and whether the result is partial. Drives the overview header and the scale/cost story.
ScreenStore
In-memory store of screened profile sets, keyed by seed repo. Lives for the server’s lifetime — the stargazer-screen analogue of ElementCache.
Selection
A complete selection: the filter predicate set, the ranking axis, and a human label. Drives both a focused overview view and the fan-out gate.
UserProfile
One classified stargazer.
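The percentile ranking that `Scores` relies on can be sketched as below. This is one plausible reading, not the module's implementation: `percentile_rank` is a hypothetical helper, and the tie rule (equal raw values share a rank) is an assumption.

```rust
/// Percentile-rank raw values into 0.0..=1.0 within the screened set.
/// Each value maps to the fraction of the other values strictly below it,
/// so equal raw values share a rank (an assumed tie rule).
pub fn percentile_rank(raw: &[f64]) -> Vec<f64> {
    let n = raw.len();
    if n <= 1 {
        // A single profile has nobody to be ranked against.
        return vec![0.0; n];
    }
    raw.iter()
        .map(|&x| {
            // Quadratic in the set size, which is fine for a sketch.
            let below = raw.iter().filter(|&&y| y < x).count();
            below as f64 / (n - 1) as f64
        })
        .collect()
}
```

Ranking within the set puts axes with different raw units (star counts, push dates, keyword scores) on the same 0 to 1 scale, so percentile thresholds behave the same way on each axis.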

Enums

Archetype
RankBy
Scale
Disclosure scale — borrowed from kglite’s GraphScale. The detail level adapts to how many stargazers there are: a small set shows every notable person inline; a huge set collapses to statistics + drill.
Seed
What to screen: a repo (screen its stargazers) or an explicit set of users (screen them directly). The people-set differs; everything downstream — projection, classification, enrichment, scoring, selection — is identical.
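The scale-adaptive disclosure described for `Scale` might look roughly like this. The variant names, thresholds, and inline caps are all placeholders chosen for illustration; the real `Scale` variants are not listed on this page.

```rust
/// Hypothetical disclosure tiers; not the real `Scale` variants.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ScaleSketch {
    Small,
    Medium,
    Huge,
}

impl ScaleSketch {
    /// Pick a tier from the stargazer count. Thresholds are illustrative only.
    pub fn from_count(stargazers: usize) -> Self {
        match stargazers {
            0..=50 => ScaleSketch::Small,
            51..=500 => ScaleSketch::Medium,
            _ => ScaleSketch::Huge,
        }
    }

    /// How many notable people the overview lists inline at this tier.
    /// `None` means everyone; `Some(0)` means statistics plus drill only.
    pub fn inline_limit(self) -> Option<usize> {
        match self {
            ScaleSketch::Small => None,
            ScaleSketch::Medium => Some(10),
            ScaleSketch::Huge => Some(0),
        }
    }
}
```

Deriving the detail level from the count alone is what would keep the overview size roughly constant as the stargazer set grows.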

Functions

archetype_from_key
Resolve a cohort:<key> handle back to its archetype.
build_overview
Build the compact overview — the agent-facing digest. Scale-adaptive: inventory first (the cohort counts), then detail whose depth shrinks as the stargazer count grows.
derive_config
Auto-derive relevance keywords + stack from the seed repo itself, so a caller can screen_stargazers(repo) with no hand-tuned config. Topics and significant description words become keywords; the top languages (by bytes) become the stack. Returns (keywords, stack); empty on error.
drill
Resolve a drill element_id against an already-screened profile set. Mirrors the github_issues element_id convention: cohort:<key> · user:<login> · user:<login>/repo:<name> · user:<login>/repo:<name>/readme (the only drill that costs a request).
fetch_contributions
[item 7] Repos a user contributes to but does not own, from their recent public events — surfaces relevance the owned-repo list can’t see.
fetch_portfolio
Fetch + project one user’s owned repos. Returns (repos, capped).
fetch_readme
Fetch a repo’s README and compact it to a short headline gist. This is the only drill that costs a request, and it’s shortlist-only by design.
fetch_stargazer_logins
Fetch a repo’s stargazer logins, most-recent first (owner excluded). Uncapped — the caller samples [item 9].
fetch_user_reach
[item 6] Follower count for one user — reach signal for outreach.
find_adopters
[item 2] Find which of logins actually depend on the seed package. Code-searches dependency manifests for the package name, intersects the owners with the stargazer set, and verifies a bounded (whole-word) match against the real manifest line to reject substring collisions (e.g. pkglite vs. kglite). Returns login → evidence line.
normalize_scores
Compute the normalized score vector for every profile from cheap raw signals: relatedness (keyword score), popularity (stars+followers), effort (repo size + breadth), recency (latest push). Percentile-ranked across the set so axes are comparable. Call after enrichment.
preset
Named preset = a bundled (filters, rank) for a common goal. Keeps the pipeline usable cold; raw filters are the power layer.
probe_colocation
[item 4] Count a dev’s original repos that combine all stack languages in one repo (the true PyO3/maturin co-location signal). Probes /languages on stack-language repos, bounded to keep it cheap.
profile_user
Classify a user from their (already projected) owned repos.
project_repo
Project a raw GitHub repo object down to the cheap fields.
render_cohort
Members of a cohort, rendered in full (the cohort:<name> drill).
render_repo
Render a single repo profile (drill-down level 2). The cheap version — everything here is already in the cache, no new request.
render_selection
Render a focused selection view (filter → rank → take), with explicit empty-result guidance rather than a silent empty section.
render_user
Render a single user’s full portfolio (drill-down level 1).
run_screen
Full screen: derive config if needed [item 1], sample stargazers [item 9], fetch portfolios, classify, then enrich the shortlist [items 2/4/6/7]. Returns the profiles, run metadata, and the effective config (so a cached re-rank can reuse the derived keywords/stack).
screen_dispatch
One-call entry point for the screen_stargazers MCP tool. Without element_id: build (or reuse) the screen and return the overview — re-classified against the current keywords/stack so the agent can re-key a cached fetch for free. With element_id: drill the cached set.
select
Filter → rank → take. The selection function, reused for the overview view and the fan-out gate (expand only these top-K).
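The filter → rank → take pipeline above, combined with conjunctive filters, can be sketched as follows. All types and names here (`Person`, `FiltersSketch`, `RankBySketch`, `select_sketch`) are hypothetical stand-ins; the real `Filters`, `RankBy`, and `select` signatures are not shown on this page.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Person {
    pub login: String,
    pub stars: u64,
    pub relatedness: f64,
    pub recency: f64,
}

/// Hypothetical filter set: `None` means the predicate is not set.
#[derive(Default)]
pub struct FiltersSketch {
    pub min_stars: Option<u64>,
    pub min_relatedness: Option<f64>,
}

impl FiltersSketch {
    /// Conjunctive: every predicate that is set must pass.
    pub fn passes(&self, p: &Person) -> bool {
        self.min_stars.map_or(true, |m| p.stars >= m)
            && self.min_relatedness.map_or(true, |m| p.relatedness >= m)
    }
}

#[derive(Clone, Copy)]
pub enum RankBySketch {
    Relatedness,
    Recency,
}

impl RankBySketch {
    /// The score this axis ranks on.
    fn key(self, p: &Person) -> f64 {
        match self {
            RankBySketch::Relatedness => p.relatedness,
            RankBySketch::Recency => p.recency,
        }
    }
}

/// Filter, then rank descending on the chosen axis, then keep the top `k`.
pub fn select_sketch<'a>(
    people: &'a [Person],
    filters: &FiltersSketch,
    rank: RankBySketch,
    k: usize,
) -> Vec<&'a Person> {
    let mut kept: Vec<&Person> = people.iter().filter(|p| filters.passes(p)).collect();
    // `total_cmp` gives a total order on f64; the stable sort keeps input order on ties.
    kept.sort_by(|a, b| rank.key(b).total_cmp(&rank.key(a)));
    kept.truncate(k);
    kept
}
```

Returning references into the cached slice means the same function can serve both a focused overview view and a top-K fan-out gate without copying profiles.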