repo-trust 0.1.1

A command-line tool that tells you whether an open-source repository deserves your trust — beyond the star count.
# Product Requirements Document — Repo Trust

> **Version:** 1.0 (initial draft, May 2026)
> **Status:** Pre-build, Phase 0 (Research Foundation)
> **Owner:** @Dmitrze
> **License plan:** Apache-2.0 (project), CC-BY-4.0 (methodology docs)

---

## 1. Executive Summary

GitHub stars are the de facto popularity signal for open-source repositories, but they are an incomplete and increasingly distorted proxy for project quality. A 2024–2026 academic study (StarScout / ICSE 2026) identified approximately **6 million suspected fake stars across 18,617 repositories**, with fake-star activity surging in 2024 to the point where roughly **16% of repositories with 50+ stars showed fake-star campaign signals**. At the same time, real signals of project trustworthiness — maintainer concentration, release hygiene, downstream adoption, security posture — remain scattered across separate tools (OpenSSF Scorecard, deps.dev, Snyk Advisor, Socket.dev, libraries.io, ecosyste.ms).

**Repo Trust** is a developer-first, open-source command-line tool that produces a single, explainable, multi-dimensional **Trust Report** for any public GitHub repository. It is not a security scanner, not a star-fraud novelty detector, and not a SaaS dashboard. It is the missing **diligence layer** for engineers, analysts, scouts, and maintainers who need to answer one question quickly and defensibly:

> *Can this repository be trusted — for evaluation, for adoption as a dependency, for investment, or for inclusion in a curated list?*

The product calculates an overall Trust Score (0–100) and exposes five module-level scores with full evidence, confidence bands, and caveats. It runs locally, caches aggressively, respects API rate limits, writes JSON, Markdown, and CSV reports (SARIF is planned for v1.1+), and is intentionally conservative when data is partial.

---

## 2. Problem Statement

### 2.1 The four problems we observe

1. **Popularity is gameable.** Fake-star marketplaces sell stars at $0.06–0.50 each; campaigns succeed in pushing repositories onto GitHub Trending. Discovery surfaces and VC-style "GitHub-as-traction" heuristics are systematically polluted.
2. **Evaluation is fragmented.** A serious evaluator currently consults at minimum: GitHub UI (commits, issues, releases, contributors), OpenSSF Scorecard (security), deps.dev or libraries.io (dependents), npm/PyPI download stats, OSV (vulnerabilities), and either Snyk Advisor or Socket.dev. There is no unified, scriptable, free, self-hostable view.
3. **Existing tools are point solutions.** OpenSSF Scorecard is excellent but security-only. Snyk Advisor and Socket.dev are SaaS-gated. deps.dev is API-only with no opinionated scoring or report. StarScout is a research artifact requiring BigQuery. Dagster's `fake-star-detector` is narrow and BigQuery-bound.
4. **Diligence is not reproducible.** Two evaluators looking at the same repo on different days, using ad-hoc heuristics, will reach different conclusions. There is no versioned scoring model that produces comparable outputs over time.

### 2.2 Who has this problem

| User segment | Current pain | What they need |
| --- | --- | --- |
| Application developers picking a dependency | Manual triage across 5+ tools, slow | One CLI invocation that gives "should I trust this?" |
| OSS maintainers benchmarking themselves or peers | Vanity metrics dominate, real strengths invisible | Module breakdown showing where they're strong/weak |
| Analysts at funds, accelerators, scouts | Need repeatable diligence at scale | Batch mode with CSV/JSON output for spreadsheets and CRMs |
| Security and platform engineers | Need supply-chain risk signal beyond CVEs | Trust signal that combines OpenSSF Scorecard + activity + adoption |
| Researchers and ecosystem curators | Need reproducible, versioned metrics | Scoring version pinning, snapshot mode, public methodology |
| Tech journalists, OSS directory builders | Need defensible, citable claims | Evidence-backed reports with caveats |

### 2.3 Out of scope

We are explicitly **not** building:

- A vulnerability scanner (use OSV, Trivy, Snyk, Socket).
- A code-quality static analyzer (use SonarQube, DeepSource, CodeQL).
- A license compliance tool (use FOSSA, ScanCode).
- A naming-and-shaming or fraud-adjudication system. Our language is probabilistic ("suspicious pattern", "weak readiness"), never definitive ("fraud").
- A SaaS dashboard. The CLI and local web viewer are the entire product surface in v1.

---

## 3. Competitive Landscape

| Tool | Type | Coverage | Open methodology | CLI-first | Free for individuals | Multi-dimensional trust |
| --- | --- | --- | --- | --- | --- | --- |
| **Repo Trust** *(this project)* | OSS CLI | GitHub public repos + ecosystem signals | ✅ Full | ✅ Yes | ✅ Yes | ✅ 5 modules |
| OpenSSF Scorecard | OSS CLI / GH Action | Security health (18 checks) | ✅ Full | ✅ Yes | ✅ Yes | ❌ Security only |
| Snyk Advisor | SaaS | Package + repo | ⚠️ Partial | ❌ Web-first | ⚠️ Limited tier | ⚠️ Health + popularity |
| deps.dev | Free API + Web | Packages, vulns, Scorecard | ✅ Yes | ❌ API-only | ✅ Yes | ❌ Aggregator, no opinion |
| Socket.dev | SaaS | npm/PyPI behavioral | ⚠️ Partial | ⚠️ CLI exists | ⚠️ Limited tier | ❌ Supply-chain only |
| StarScout (research) | Academic | GHArchive scale | ✅ Yes (paper) | ❌ Pipeline | ✅ Yes | ❌ Stars only |
| Dagster fake-star-detector | OSS | Single repo, BQ-bound | ✅ Yes | ⚠️ Dagster-bound | ⚠️ BQ free tier | ❌ Stars only |
| libraries.io | SaaS / OSS DB | Cross-package | ⚠️ Partial | ❌ Web | ✅ Yes | ⚠️ SourceRank only |

**Our positioning:** *The only locally runnable, fully open-source, CLI-first tool that combines fake-star signals, repo activity, maintainer concentration, ecosystem adoption, and security readiness into a single explainable Trust Report.*

We are complementary, not competitive, with OpenSSF Scorecard and deps.dev — we **consume their data** as inputs to our Adoption and Security modules.

---

## 4. Vision and Goals

### 4.1 Three-year vision

Repo Trust becomes the default `npm audit`-style command-line utility for repository diligence. Engineers run `repo-trust scan owner/repo` before adding a new dependency the same way they run `npm audit` after. Curators, scouts, and journalists cite Trust Reports the way they currently cite Scorecard scores.

### 4.2 Product goals (v1)

1. **One command, full report.** A single CLI invocation produces a complete, evidence-backed Trust Report for any public GitHub repository in under 30 seconds (Standard mode, warm cache).
2. **Explainable scoring.** Every module score is accompanied by ≥3 evidence items and an explicit confidence band (Low / Medium / High).
3. **Reproducibility.** Identical inputs and the same scoring version always produce identical outputs (modulo upstream API state). All scoring versions are pinned and migration-noted.
4. **Conservatism.** Where data is partial, the tool reports lower confidence rather than guessing. False positives in fake-star flagging are treated as worse than false negatives.
5. **Free and self-hostable.** No paid tier, no telemetry by default, no required server-side component.

### 4.3 Non-goals (v1)

- We will not provide a binary verdict ("safe" / "unsafe"). The closest we offer is a five-bucket category (Strong / Good / Mixed / Weak / High Risk).
- We will not add dedicated support for private repositories in v1. Users who need it can supply their own GitHub token with appropriate scopes; the tool applies no special handling.
- We will not auto-publish reports anywhere. All outputs are local files until the user shares them.

---

## 5. Trust Model — Five Modules

We compute one **Repo Trust Score** (0–100) as a weighted aggregate of five module scores. The aggregate is useful for orientation; the module breakdown is the real product value.

### 5.1 Module weights (v1, illustrative)

| # | Module | Weight | Why this weight |
| --- | --- | --- | --- |
| 1 | Star Authenticity | 20% | Most-asked question; most novel value |
| 2 | Activity Health | 25% | Strongest single predictor of long-term project survival |
| 3 | Maintainer Health | 20% | Bus-factor risk is real and underweighted by popularity-only views |
| 4 | Adoption Signals | 20% | Real-world usage is the antidote to vanity metrics |
| 5 | Security & Readiness | 15% | Critical but well-served by OpenSSF Scorecard; we federate, not replicate |

Weights are configurable via the `--weights` flag, which takes a path to a TOML file (for example `weights.toml`). Default weights are versioned (v1.0.0).

### 5.2 Module 1: Star Authenticity

**Question:** Are the popularity signals organic?

**Inputs (heuristic-driven, transparent):**
- Fork-to-star ratio (low ratio in a popular repo is suspicious).
- Watcher-to-star ratio.
- Median stargazer account age and account-creation distribution.
- Share of stargazer accounts matching the StarScout / Dagster "low-activity profile" (created recently, ≤1 follower, ≤4 public repos, default avatar, empty bio, star date == account creation date).
- Bursty / lockstep star timing (z-score of starring rate vs trailing baseline).
- Co-starring overlap with known campaign-cluster fingerprints (deep mode only, opt-in via `--deep`).

**Method:** A weighted evidence model in v1, not a black-box ML classifier. We follow the heuristic-first approach validated by Dagster (March 2023) and StarScout (ICSE 2026). Confidence drops sharply when stargazer sample size is below 100 or when GitHub API rate limits truncate the sample.

**Output:** Score 0–100, evidence list, confidence band, sample size disclosure.

**Anti-misuse:** Score language is always conditional ("X% of sampled stargazers match a low-activity profile"). We never publish a binary "fake / real" verdict.
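
The two most mechanical inputs above, the low-activity share and the burst z-score, can be sketched as follows. This is a minimal illustration, not the module's implementation: the `Stargazer` struct and its field names are hypothetical, the profile is assumed to require every listed criterion at once, and "created recently" is assumed to mean under 30 days old at the time of the star.

```rust
/// Hypothetical, simplified view of one sampled stargazer account.
struct Stargazer {
    followers: u32,
    public_repos: u32,
    account_age_days_at_star: u32,
    has_default_avatar: bool,
    has_empty_bio: bool,
}

/// Assumed reading of the profile: every criterion must hold, and
/// "created recently" means under 30 days old when the star was given.
fn matches_low_activity_profile(s: &Stargazer) -> bool {
    s.followers <= 1
        && s.public_repos <= 4
        && s.has_default_avatar
        && s.has_empty_bio
        && s.account_age_days_at_star < 30
}

/// Share of the sample that matches the profile; None for an empty sample.
fn low_activity_share(sample: &[Stargazer]) -> Option<f64> {
    if sample.is_empty() {
        return None;
    }
    let hits = sample.iter().filter(|s| matches_low_activity_profile(s)).count();
    Some(hits as f64 / sample.len() as f64)
}

/// z-score of the latest daily star count against a trailing baseline.
/// None when the baseline has fewer than two days or no variation.
fn burst_z_score(baseline: &[f64], latest: f64) -> Option<f64> {
    if baseline.len() < 2 {
        return None;
    }
    let n = baseline.len() as f64;
    let mean = baseline.iter().sum::<f64>() / n;
    // Sample variance (n - 1 denominator) of the baseline days.
    let variance = baseline.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    if variance == 0.0 {
        return None;
    }
    Some((latest - mean) / variance.sqrt())
}

fn main() {
    let quiet = Stargazer {
        followers: 0,
        public_repos: 0,
        account_age_days_at_star: 0,
        has_default_avatar: true,
        has_empty_bio: true,
    };
    let active = Stargazer {
        followers: 40,
        public_repos: 12,
        account_age_days_at_star: 900,
        has_default_avatar: false,
        has_empty_bio: false,
    };
    if let Some(share) = low_activity_share(&[quiet, active]) {
        // Conditional wording, in line with the anti-misuse rule.
        println!("{:.1}% of sampled stargazers match a low-activity profile", share * 100.0);
    }
    println!("burst z-score: {:?}", burst_z_score(&[10.0, 12.0, 8.0, 10.0], 50.0));
}
```

Returning `Option` rather than a default value keeps the "report lower confidence rather than guessing" principle visible in the types: an empty sample or a flat baseline yields no number at all.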

### 5.3 Module 2: Activity Health

**Question:** Is the repository alive and operationally active?

**Inputs:**
- Commit frequency and recency (30 / 90 / 180 / 365-day windows).
- Release cadence and most-recent-release age.
- Issue first-response time (median, p90).
- Pull-request merge rate and review latency.
- Active contributors per 90-day window.
- Continuity score (variance of monthly commit count over 18 months).

**Method:** Threshold-based scoring with ecosystem-aware baselines (a Rust crate with monthly releases scores differently from a stable Python utility with quarterly releases).
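
As an illustration of two of the inputs above, the sketch below counts commits per trailing window and computes the variance of monthly commit counts. It assumes commit ages are already available in whole days, and it uses population variance; the function names are hypothetical and the PRD does not specify how variance is turned into a continuity score, so that step is left out.

```rust
/// Trailing windows named in the inputs list, in days.
const WINDOWS_DAYS: [u32; 4] = [30, 90, 180, 365];

/// Count commits that fall inside each trailing window.
/// `commit_ages_days` holds the age of each commit in whole days.
fn commits_per_window(commit_ages_days: &[u32]) -> [usize; 4] {
    let mut counts = [0usize; 4];
    for &age in commit_ages_days {
        for (slot, &window) in counts.iter_mut().zip(WINDOWS_DAYS.iter()) {
            if age <= window {
                *slot += 1;
            }
        }
    }
    counts
}

/// Population variance of monthly commit counts; None for an empty series.
fn monthly_variance(monthly_commits: &[u32]) -> Option<f64> {
    if monthly_commits.is_empty() {
        return None;
    }
    let n = monthly_commits.len() as f64;
    let mean = monthly_commits.iter().map(|&c| f64::from(c)).sum::<f64>() / n;
    let squared_dev: f64 = monthly_commits
        .iter()
        .map(|&c| (f64::from(c) - mean).powi(2))
        .sum();
    Some(squared_dev / n)
}

fn main() {
    let ages = [1, 45, 100, 200, 400];
    println!("commits per window: {:?}", commits_per_window(&ages));
    println!("monthly variance: {:?}", monthly_variance(&[12, 9, 15, 11]));
}
```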

### 5.4 Module 3: Maintainer Health

**Question:** Is stewardship sustainable, or does this project sit on one person's shoulders?

**Inputs:**
- Number of active maintainers in the last 365 days.
- Commit concentration (Gini coefficient of commits-per-author).
- Review concentration (Gini coefficient of PR-review actions).
- Bus factor proxy (minimum number of authors required to cover 50% of commits in last 365 days).
- Contributor retention (percent of contributors active in two consecutive 180-day windows).
- Maintainer responsiveness (median response time on issues / PRs by maintainer-flagged users).
- Ownership signals (CODEOWNERS file, MAINTAINERS.md, governance docs).
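
The concentration inputs above can be sketched with the standard Gini formula over sorted counts and a greedy bus-factor count. This is a minimal sketch assuming commits-per-author counts have already been collected; the function names are hypothetical.

```rust
/// Gini coefficient of per-author counts (0 = evenly spread, close to 1 = concentrated).
/// None when there are no authors or no commits.
fn gini(counts: &[u64]) -> Option<f64> {
    let total: u64 = counts.iter().sum();
    if counts.is_empty() || total == 0 {
        return None;
    }
    let mut sorted = counts.to_vec();
    sorted.sort_unstable();
    let n = sorted.len() as f64;
    // Sum of rank * value over ascending values, with ranks starting at 1.
    let weighted: f64 = sorted
        .iter()
        .enumerate()
        .map(|(i, &x)| (i as f64 + 1.0) * x as f64)
        .sum();
    Some(2.0 * weighted / (n * total as f64) - (n + 1.0) / n)
}

/// Minimum number of authors whose commits cover `coverage` (e.g. 0.5) of the total.
fn bus_factor(counts: &[u64], coverage: f64) -> Option<usize> {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return None;
    }
    let mut sorted = counts.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // largest contributors first
    let target = coverage * total as f64;
    let mut covered = 0u64;
    for (i, &c) in sorted.iter().enumerate() {
        covered += c;
        if covered as f64 >= target {
            return Some(i + 1);
        }
    }
    Some(sorted.len())
}

fn main() {
    let commits = [70, 20, 10];
    println!("gini = {:?}", gini(&commits));
    println!("bus factor (50%) = {:?}", bus_factor(&commits, 0.5));
}
```

With this formula a single author holding all commits among `n` authors gives `(n - 1) / n`, so small teams never reach exactly 1; any threshold on the Gini value would need to account for that.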

### 5.5 Module 4: Adoption Signals

**Question:** Is this repository actually used in the wild?

**Inputs (gracefully degrades when sources are unavailable):**
- GitHub `dependents` count and trend.
- Package-registry downloads (npm, PyPI, crates.io, RubyGems, Maven Central, NuGet) via deps.dev API.
- Docker Hub pulls (where applicable).
- Cited in well-known awesome-lists (configurable list).
- Documentation maturity score (presence and length of README, docs/ folder, examples/).
- Real-world reference signals (mentions in deps.dev's package-to-repo mapping graph).

This module **federates** existing public datasets rather than re-collecting them. We are explicit about this in the report.

### 5.6 Module 5: Security & Readiness

**Question:** Is this repository in a state that supports responsible adoption?

**Inputs:**
- OpenSSF Scorecard score (federated via the public API where available; we do not re-implement Scorecard's checks).
- OSV vulnerability count for the repository's published packages.
- Presence and recency of `SECURITY.md`, `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, `LICENSE`, `CODEOWNERS`.
- CI workflow presence and basic shape.
- Release-tagging consistency (semver adherence).
- Branch-protection signals where observable via the API.

We do **not** replicate Scorecard. We import its output and weight it.

---

## 6. Score, Confidence, and Categories

### 6.1 Trust Score

A single integer 0–100, computed as the average of module scores weighted by both module weight and per-module confidence.

```
trust_score = Σ (module_weight_i × module_score_i × module_confidence_i) / Σ (module_weight_i × module_confidence_i)
```

This means a module with very low confidence contributes less to the overall score, preventing partial-data modules from dominating.
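
A minimal sketch of the formula follows. The formula needs a numeric confidence, while confidence is reported as a band, so the sketch assumes a mapping of Low = 0.4, Medium = 0.7, High = 1.0. That mapping, the type names, and the rounding to the nearest integer are illustrative assumptions, not part of the scoring model.

```rust
#[derive(Clone, Copy)]
enum Confidence {
    Low,
    Medium,
    High,
}

impl Confidence {
    /// Assumed numeric factor per band (hypothetical values).
    fn factor(self) -> f64 {
        match self {
            Confidence::Low => 0.4,
            Confidence::Medium => 0.7,
            Confidence::High => 1.0,
        }
    }
}

struct ModuleResult {
    weight: f64,
    score: f64, // 0.0 to 100.0
    confidence: Confidence,
}

/// Confidence-weighted average of module scores, rounded to an integer.
/// None when no module carries any weight.
fn trust_score(modules: &[ModuleResult]) -> Option<u8> {
    let mut numerator = 0.0;
    let mut denominator = 0.0;
    for m in modules {
        let effective_weight = m.weight * m.confidence.factor();
        numerator += effective_weight * m.score;
        denominator += effective_weight;
    }
    if denominator == 0.0 {
        return None;
    }
    Some((numerator / denominator).round().clamp(0.0, 100.0) as u8)
}

fn main() {
    let modules = [
        ModuleResult { weight: 0.25, score: 82.0, confidence: Confidence::High },
        ModuleResult { weight: 0.20, score: 40.0, confidence: Confidence::Low },
    ];
    println!("trust score: {:?}", trust_score(&modules));
}
```

Because the same factor appears in the numerator and the denominator, a low-confidence module is down-weighted rather than penalized: its score counts for less, but it does not drag the result toward zero.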

### 6.2 Confidence

Three bands: **Low / Medium / High**. Per-module confidence is determined by:

- **Data completeness** — what fraction of expected inputs were collected.
- **Sample size** — for sample-based modules (Star Authenticity), the size of the sampled population vs the configured target.
- **Cross-signal agreement** — when multiple sub-signals agree, confidence rises.
- **Staleness** — how old the cached data is relative to repo activity.

The overall report confidence is the lowest confidence band among the modules that each contributed more than 10% of the final score.
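
The rule can be sketched as a filter followed by a minimum. The sketch assumes each module's contribution has already been expressed as a share of the final score (for example, its term in the numerator of the Trust Score formula divided by the whole numerator); that definition of "contributed" and the type names are assumptions.

```rust
/// Bands ordered from weakest to strongest so that `min` picks the weakest.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Confidence {
    Low,
    Medium,
    High,
}

struct Contribution {
    share: f64, // fraction of the final score, 0.0 to 1.0
    confidence: Confidence,
}

/// Lowest band among modules contributing more than 10% of the score.
/// None when no module passes the 10% cut.
fn overall_confidence(modules: &[Contribution]) -> Option<Confidence> {
    modules
        .iter()
        .filter(|m| m.share > 0.10)
        .map(|m| m.confidence)
        .min()
}

fn main() {
    let modules = [
        Contribution { share: 0.50, confidence: Confidence::High },
        Contribution { share: 0.45, confidence: Confidence::Medium },
        Contribution { share: 0.05, confidence: Confidence::Low },
    ];
    // The Low module is ignored because it contributes only 5%.
    println!("{:?}", overall_confidence(&modules));
}
```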

### 6.3 Categories

We bucket the score for human readability — but the bucket is always presented alongside the numeric score and the confidence band.

| Range | Category | Meaning |
| --- | --- | --- |
| 85–100 | Strong | Multiple modules score high, no significant warnings |
| 70–84 | Good | Generally healthy, watch any flagged module |
| 50–69 | Mixed | Notable strengths and notable weaknesses |
| 30–49 | Weak | Significant concerns across multiple modules |
| 0–29 | High Risk | Strong negative signals; treat as suspicious until reviewed by a human |

The category is **never** the only thing we display. A "Strong" report with low confidence is presented as `Strong (Low confidence)` and the user is told why.
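
The bucketing in the table maps directly onto a range match. The sketch below is illustrative; the function names are hypothetical, and passing the confidence band as a string is a simplification.

```rust
/// Map a 0-100 score to its category label; None for out-of-range input.
fn category(score: u8) -> Option<&'static str> {
    match score {
        85..=100 => Some("Strong"),
        70..=84 => Some("Good"),
        50..=69 => Some("Mixed"),
        30..=49 => Some("Weak"),
        0..=29 => Some("High Risk"),
        _ => None,
    }
}

/// Category and confidence band are always rendered together.
fn headline(score: u8, confidence: &str) -> Option<String> {
    category(score).map(|label| format!("{} ({} confidence)", label, confidence))
}

fn main() {
    if let Some(line) = headline(73, "Medium") {
        println!("{} / 100: {}", 73, line);
    }
}
```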

---

## 7. Functional Requirements

### FR-1: Repository intake
The CLI shall accept a repository specifier in any of the following forms:
- `owner/repo` (e.g. `octocat/Hello-World`)
- Full GitHub URL (`https://github.com/octocat/Hello-World`, with or without trailing path or `.git`)
- Path to a newline-delimited file for batch mode

The CLI shall normalize, validate, and resolve renames (HTTP 301) before scanning.
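
The normalization step for the first two specifier forms might look like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, the character rules are a simplified approximation of GitHub's naming rules, and rename resolution is not shown because it requires a network request.

```rust
/// Parse `owner/repo` or a github.com URL into (owner, repo).
/// Returns None for anything that does not look like a repository specifier,
/// which lets the caller fall back to treating the input as a batch file path.
fn parse_repo_spec(input: &str) -> Option<(String, String)> {
    let trimmed = input.trim();
    let path = trimmed
        .strip_prefix("https://github.com/")
        .or_else(|| trimmed.strip_prefix("http://github.com/"))
        .unwrap_or(trimmed);

    // Keep only the first two path segments; any trailing path is ignored.
    let mut segments = path.split('/').filter(|s| !s.is_empty());
    let owner = segments.next()?;
    let repo = segments.next()?;
    let repo = repo.strip_suffix(".git").unwrap_or(repo);

    // Simplified character checks (assumed, not GitHub's exact rules).
    let owner_ok = owner.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
    let repo_ok = !repo.is_empty()
        && repo
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.'));

    if owner_ok && repo_ok {
        Some((owner.to_string(), repo.to_string()))
    } else {
        None
    }
}

fn main() {
    for spec in [
        "octocat/Hello-World",
        "https://github.com/octocat/Hello-World.git",
        "repos.txt",
    ] {
        println!("{spec} -> {:?}", parse_repo_spec(spec));
    }
}
```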

### FR-2: Three execution modes
| Mode | Target latency (warm cache) | API calls (target) | Scope |
| --- | --- | --- | --- |
| `quick` | < 5s | < 30 | Repo metadata, latest activity, headline signals only |
| `standard` (default) | < 30s | < 200 | All modules at default sampling |
| `deep` | < 5min | < 2000 | Larger stargazer sample, longer historical windows, optional graph analysis |

Quick mode skips Star Authenticity (or runs it in headline-only mode) because it is the most expensive module.

### FR-3: Module selection
Users shall be able to enable / disable any module:

```
repo-trust scan owner/repo --modules activity,maintainers,security
repo-trust scan owner/repo --skip-modules stars
```
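
One way the two flags could resolve into a final module list is sketched below. The module names are taken from the examples and the `weights_used` keys in this document; the function name, the error type, and the rule that `--skip-modules` is applied after `--modules` are assumptions.

```rust
const ALL_MODULES: [&str; 5] = ["stars", "activity", "maintainers", "adoption", "security"];

/// Resolve `--modules` and `--skip-modules` values into the modules to run,
/// in canonical order. Unknown names are rejected.
fn select_modules(
    include: Option<&str>,
    skip: Option<&str>,
) -> Result<Vec<&'static str>, String> {
    let parse = |list: &str| -> Result<Vec<&'static str>, String> {
        list.split(',')
            .map(str::trim)
            .filter(|name| !name.is_empty())
            .map(|name| {
                ALL_MODULES
                    .iter()
                    .copied()
                    .find(|m| *m == name)
                    .ok_or_else(|| format!("unknown module: {name}"))
            })
            .collect()
    };

    let included = match include {
        Some(list) => parse(list)?,
        None => ALL_MODULES.to_vec(),
    };
    let skipped = match skip {
        Some(list) => parse(list)?,
        None => Vec::new(),
    };

    Ok(ALL_MODULES
        .iter()
        .copied()
        .filter(|m| included.contains(m) && !skipped.contains(m))
        .collect())
}

fn main() {
    println!("{:?}", select_modules(Some("activity,maintainers,security"), None));
    println!("{:?}", select_modules(None, Some("stars")));
}
```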

### FR-4: Output formats
- **Terminal** (default): colored summary with score, category, top-3 strengths, top-3 concerns, confidence band.
- **JSON**: full machine-readable report with stable schema.
- **Markdown**: human-friendly long-form report.
- **CSV**: tabular form, one row per repo (used for batch).
- **SARIF** (v1.1+): for security-tool integration.
- **HTML** (via `repo-trust serve`): localhost-only viewer.

### FR-5: Deterministic outputs
Same inputs (repo, mode, scoring version, weights, RNG seed) and same upstream API state shall produce byte-identical JSON output (modulo `snapshot_at` and `runtime_seconds` fields).
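
Determinism depends on the sampling seed being stable across runs, machines, and toolchain versions. The sketch below derives a default seed from the repository name with 64-bit FNV-1a. The choice of FNV-1a and the lowercasing step are assumptions made for illustration; the PRD only states that the default seed is derived from the repo.

```rust
/// Derive a stable 64-bit seed from `owner/repo` using FNV-1a.
/// The name is lowercased first so that differently cased spellings
/// of the same repository share a seed.
fn seed_for_repo(full_name: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    let mut hash = OFFSET_BASIS;
    for byte in full_name.to_ascii_lowercase().bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

fn main() {
    println!("seed = {}", seed_for_repo("octocat/Hello-World"));
}
```

A fixed, fully specified hash is used here instead of `std::collections::hash_map::DefaultHasher`, whose algorithm the standard library documents as unspecified and subject to change between releases.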

### FR-6: Caching and rate-limit awareness
- All API responses cached locally in SQLite at `~/.repo-trust/cache.db`.
- Default cache TTL: 24h for repo metadata, 1h for activity, 7d for stargazer pages.
- ETag-aware conditional fetching (`If-None-Match`) on every GitHub request.
- Token-bucket rate limiter coordinating concurrent requests, never exceeding 80% of remaining rate-limit budget.
- Cache management subcommands: `repo-trust cache info`, `repo-trust cache clear`, `repo-trust cache prune`.
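
The TTL defaults and the 80% budget rule can be expressed as two small pure functions. This is a sketch only: the `CacheKind` names are hypothetical, and the budget rule is read here as "plan at most 80% of the remaining requests, rounded down", which is one possible interpretation.

```rust
use std::time::Duration;

#[derive(Clone, Copy)]
enum CacheKind {
    RepoMetadata,
    Activity,
    StargazerPage,
}

/// Default TTLs from the list above: 24h, 1h, and 7d.
fn ttl(kind: CacheKind) -> Duration {
    const HOUR: u64 = 60 * 60;
    match kind {
        CacheKind::RepoMetadata => Duration::from_secs(24 * HOUR),
        CacheKind::Activity => Duration::from_secs(HOUR),
        CacheKind::StargazerPage => Duration::from_secs(7 * 24 * HOUR),
    }
}

/// A cached entry is reusable while it is younger than its TTL.
fn is_fresh(kind: CacheKind, age: Duration) -> bool {
    age < ttl(kind)
}

/// Requests the scan may plan for, given the remaining rate-limit count.
fn request_budget(remaining: u32) -> u32 {
    (u64::from(remaining) * 80 / 100) as u32
}

fn main() {
    let age = Duration::from_secs(30 * 60);
    println!("activity entry fresh after 30 min: {}", is_fresh(CacheKind::Activity, age));
    println!("budget with 5000 remaining: {}", request_budget(5000));
}
```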

### FR-7: Authentication
- Read `GITHUB_TOKEN` env var by default.
- Support a `--token` flag (with a warning that the value will be in shell history).
- Support `~/.repo-trust/config.toml` with token reference (`token_env = "GH_TOKEN"`).
- For unauthenticated runs (no token), gracefully degrade with a clear warning.

### FR-8: Configuration files
Layered configuration loaded via `figment` in priority order:
1. Built-in defaults.
2. User config: `~/.repo-trust/config.toml`.
3. Project config: `./.repo-trust.toml` (in cwd).
4. Environment variables (`REPO_TRUST_*`).
5. CLI flags (highest priority).
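
The precedence order above amounts to "later layers override earlier ones, field by field". The sketch below shows that merge rule using only the standard library; it does not use `figment`, and the `Layer` struct and its two fields are hypothetical stand-ins for the real configuration.

```rust
/// One configuration layer; None means "this layer does not set the field".
#[derive(Clone, Debug, Default, PartialEq)]
struct Layer {
    mode: Option<String>,
    token_env: Option<String>,
}

/// Merge layers given in ascending priority (defaults first, CLI flags last).
fn merge(layers: &[Layer]) -> Layer {
    layers.iter().fold(Layer::default(), |acc, layer| Layer {
        mode: layer.mode.clone().or(acc.mode),
        token_env: layer.token_env.clone().or(acc.token_env),
    })
}

fn main() {
    let defaults = Layer {
        mode: Some("standard".into()),
        token_env: Some("GITHUB_TOKEN".into()),
    };
    let user_config = Layer { token_env: Some("GH_TOKEN".into()), ..Default::default() };
    let cli_flags = Layer { mode: Some("deep".into()), ..Default::default() };

    let effective = merge(&[defaults, user_config, cli_flags]);
    println!("{:?}", effective);
}
```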

### FR-9: Plugin reservation
Reserve a plugin interface for v1.2+. Do not expose it in v1.0 to avoid premature commitment to a stable plugin API.

---

## 8. Non-Functional Requirements

| Requirement | Target |
| --- | --- |
| **Cold-cache p95 latency (Standard mode)** | < 30s |
| **Warm-cache p95 latency (Standard mode)** | < 5s |
| **Memory** | < 200 MB for any single-repo scan; < 1 GB for batch of 100 repos |
| **Disk (cache)** | Default cap 500 MB; user-configurable; LRU eviction |
| **Cross-platform** | Linux (glibc 2.31+), macOS 13+, Windows 11 (via `cargo install`) |
| **Rust toolchain** | stable 1.75+ (MSRV); CI tests latest stable + 1.75 minimum |
| **Test coverage** | `cargo-tarpaulin` ≥ 85% on `src/scoring/` and `src/modules/`; ≥ 70% overall |
| **Documentation** | All public modules have rustdoc with examples; `cargo doc` builds clean |
| **License compliance** | `cargo-deny` allowlist enforced in CI |
| **No telemetry** | Zero outbound calls except to the configured API endpoints |

---

## 9. Distribution

| Channel | Status |
| --- | --- |
| **Cargo + crates.io** (primary): `cargo install repo-trust` | v1.0 launch |
| **Standalone binaries** (Linux x86_64, Linux arm64, macOS arm64, Windows x86_64) via `cross` cross-compilation | v1.0 launch |
| **Docker image** (`ghcr.io/dmitrze/repo-trust:latest`) | v1.0 launch |
| **Homebrew tap** (`brew install dmitrze/tap/repo-trust`) | v1.1 |
| **Winget / Scoop** (Windows) | v1.1 |
| **APT / DNF / Pacman repositories** | post-v1.0, community-maintained |

Releases follow SemVer. Pre-1.0 versions are subject to breaking changes; 1.0 stabilizes the JSON report schema.

---

## 10. CLI Surface (target)

```
repo-trust scan <repo> [options]
repo-trust batch <file> [options]
repo-trust explain <repo>
repo-trust serve [--port N] [--bind ADDR]
repo-trust cache info | clear | prune
repo-trust config show | set <key> <value>
repo-trust version
repo-trust completions <shell>
```

Sample `scan` flags:
```
--mode quick|standard|deep         (default: standard)
--modules <comma-separated>        (default: all)
--skip-modules <comma-separated>
--output <dir>                     (default: ./repo-trust-reports/)
--format <terminal|json|md|csv>... (multi-select; default: terminal)
--weights <path-to-toml>
--scoring-version <semver>         (pin scoring version)
--token <value>                    (or use $GITHUB_TOKEN)
--seed <u64>                       (RNG seed for sampling, default derived from repo)
--refresh                          (invalidate cache)
--refresh-module <name>            (invalidate specific module's cache)
--debug                            (verbose tracing logs)
--quiet                            (no progress output)
--no-color                         (disable terminal colors)
--json                             (alias for --format json --quiet)
```

---

## 11. Reporting Format (target JSON shape)

```json
{
  "schema_version": "1.0.0",
  "repository": {
    "full_name": "owner/repo",
    "url": "https://github.com/owner/repo",
    "default_branch": "main",
    "primary_language": "Rust",
    "stars": 12345,
    "snapshot_at": "2026-05-15T10:00:00Z"
  },
  "overall_score": 73,
  "overall_confidence": "Medium",
  "category": "Good",
  "modules": [
    {
      "module": "stars",
      "score": 81,
      "confidence": "High",
      "sub_scores": { "low_activity_share": 75, "lockstep_timing": 90 },
      "sample_size": 200,
      "missing_data": []
    }
  ],
  "evidence": [
    {
      "module": "stars",
      "code": "low_activity_stargazer_share",
      "label": "Share of low-activity stargazer accounts",
      "value": 0.082,
      "threshold": 0.20,
      "verdict": "Positive",
      "rationale": "8.2% of sampled stargazers match the low-activity profile, well below our 20% concern threshold."
    }
  ],
  "top_strengths": [],
  "top_concerns": [],
  "caveats": ["Stargazer sample limited to 200 due to API rate limit"],
  "scoring_version": "1.0.0",
  "weights_used": { "stars": 0.20, "activity": 0.25, "maintainers": 0.20, "adoption": 0.20, "security": 0.15 },
  "snapshot_at": "2026-05-15T10:00:00Z",
  "runtime_seconds": 12.3
}
```

The schema is **frozen** within a major version. Breaking schema changes require a major version bump and a documented migration path.
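
A consumer of the JSON report could guard against breaking changes by comparing major versions, as in the sketch below. The function names are hypothetical, and the check assumes `schema_version` is a plain `MAJOR.MINOR.PATCH` string as in the example above.

```rust
/// Extract the major component of a `MAJOR.MINOR.PATCH` string.
fn major(version: &str) -> Option<u64> {
    version.split('.').next()?.parse().ok()
}

/// Two schema versions are treated as compatible when their majors match.
fn schema_compatible(reader: &str, report: &str) -> bool {
    matches!((major(reader), major(report)), (Some(a), Some(b)) if a == b)
}

fn main() {
    println!("1.0.0 vs 1.3.2: {}", schema_compatible("1.0.0", "1.3.2"));
    println!("1.0.0 vs 2.0.0: {}", schema_compatible("1.0.0", "2.0.0"));
}
```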

---

## 12. Roadmap

### Phase 0 — Research foundation (✅ this PRD and architecture)
Outputs: PRD, architecture, methodology plan, ADRs 0001-0010.

### Phase 1 — Core CLI MVP (target: 6–8 weeks)
- Cargo workspace, CLI skeleton (`scan` only), configuration loading, SQLite cache.
- Activity Health, Maintainer Health, Security & Readiness modules (the three least research-dependent).
- JSON and Markdown report writers.
- Quick + Standard modes.
- ≥ 70% test coverage.

### Phase 2 — Stars + Adoption (4–6 weeks)
- Star Authenticity module with StarScout-style heuristics.
- Adoption Signals module via deps.dev integration.
- Deep mode for stargazer sampling.
- Property tests on scoring functions.

### Phase 3 — Polish + viewer (4 weeks)
- `repo-trust serve` axum web viewer.
- Terminal report polish (`comfy-table`, color output).
- CSV + SARIF outputs.
- crates.io release of v1.0.0 + Homebrew tap + standalone binaries.

### Phase 4 — Adoption (ongoing)
- Apply to GitHub Secure Open Source Fund.
- Apply to Tidelift.
- GitHub Sponsors page activation.
- Conference talk submissions (FOSDEM, OSSummit, RustConf).
- Category-aware baselines (Rust crate vs Python lib vs JS framework).

---

## 13. Success Metrics

| Metric | 3-month target | 12-month target |
| --- | --- | --- |
| GitHub stars (this repo) | 200 | 2,500 |
| Weekly crates.io downloads | 500 | 5,000 |
| Active GitHub Sponsors | 5 | 50 |
| External contributors | 3 | 25 |
| Citations in blog posts / papers | 5 | 50 |
| Reports filed via the issue tracker | 10 | 100 |
| Mean time to triage a methodology question | < 7 days | < 3 days |

---

## 14. Risks and Mitigations

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| GitHub deprecates an API we depend on | Medium | High | Federate via deps.dev where possible; abstract API layer |
| GitHub tightens rate limits | Medium | Medium | ETag caching; deep mode is opt-in; clear docs on token use |
| False-positive "fake star" flag damages a real project | Low | High | Conservative thresholds; never use the word "fraud"; allow disputes |
| Tool is misused for scoring contests / shaming | Medium | Medium | Methodology and confidence are foregrounded; we publish principles |
| Maintainer burnout (bus factor 1) | High | High | Transparent governance; recruit co-maintainers; sponsor revenue |
| Competing tool with VC funding launches | Medium | Low | Open source is durable; methodology rigor is the moat |

---

## 15. Open Questions

These are deliberately unresolved at the PRD stage; they will be tracked in `docs/adr/` as we make decisions:

1. Should we ship a default benchmark repo set in `examples/`, or only a methodology + curation guide?
2. Should Star Authenticity have a "headline" sub-mode that runs in `quick` without sampling?
3. How should we handle monorepos (`microsoft/vscode-extensions/something`)?
4. Should we publish a public scoreboard site? (Tentative answer: no, not in v1; risk of misuse.)
5. Should we accept anonymous methodology disputes? (Tentative answer: yes, but require evidence.)

---

*This document supersedes any prior README content about scope. Architecture details live in `docs/architecture.md`.*