Module extension_validation

Deterministic classifier and deduplication engine for Pi extension candidates.

This module takes mixed-source research data (GitHub code search, repo search, npm scan, curated lists) and produces a validated, deduplicated candidate set.

Each candidate gets:

  • A ValidationStatus (true-extension, mention-only, unknown)
  • ValidationEvidence (which signals matched)
  • A canonical identity key for deduplication

The classifier is intentionally conservative: a candidate must show clear Pi extension API usage to be classified as TrueExtension.
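The conservative rule above can be sketched as follows. This is a minimal illustration, not the module's implementation: `SketchStatus`, `SketchEvidence`, `collect_evidence`, and `classify` are hypothetical names, and the signal patterns are passed in by the caller because the real signal set is not documented on this page.

```rust
/// Hypothetical mirror of the three statuses listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchStatus {
    TrueExtension,
    MentionOnly,
    Unknown,
}

/// Hypothetical evidence record: which signals matched.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct SketchEvidence {
    /// Patterns that indicate actual extension API usage.
    pub api_signals: Vec<String>,
    /// Patterns that only show the project is mentioned.
    pub mention_signals: Vec<String>,
}

/// Record every pattern that occurs in `source`.
/// API patterns are matched case-sensitively (they stand for code);
/// mention patterns must be given in lowercase and are matched
/// case-insensitively (they stand for prose).
pub fn collect_evidence(
    source: &str,
    api_patterns: &[&str],
    mention_patterns: &[&str],
) -> SketchEvidence {
    let lower = source.to_lowercase();
    let mut evidence = SketchEvidence::default();
    for &pattern in api_patterns {
        if source.contains(pattern) {
            evidence.api_signals.push(pattern.to_string());
        }
    }
    for &pattern in mention_patterns {
        if lower.contains(pattern) {
            evidence.mention_signals.push(pattern.to_string());
        }
    }
    evidence
}

/// Conservative decision: only API-level evidence yields `TrueExtension`.
/// Mentions alone never promote a candidate beyond `MentionOnly`.
pub fn classify(evidence: &SketchEvidence) -> SketchStatus {
    if !evidence.api_signals.is_empty() {
        SketchStatus::TrueExtension
    } else if !evidence.mention_signals.is_empty() {
        SketchStatus::MentionOnly
    } else {
        SketchStatus::Unknown
    }
}
```

Because the decision depends only on the collected evidence, the same input always produces the same status, which is the property the word "deterministic" in the module description suggests.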

Structs

CodeSearchEntry
A candidate from the GitHub code search inventory.
CodeSearchInventory
Wrapper for code search inventory JSON.
CuratedListEntry
A candidate from the curated list sweep.
CuratedListSummary
Wrapper for curated list summary JSON.
NpmScanEntry
A candidate from the npm scan.
NpmScanSummary
Wrapper for npm scan summary JSON.
RepoSearchEntry
A candidate from the GitHub repo search.
RepoSearchSummary
Wrapper for repo search summary JSON.
ValidatedCandidate
A fully validated candidate with classification and dedup info.
ValidationConfig
Configuration for the validation pipeline.
ValidationEvidence
Evidence supporting a validation decision.
ValidationReport
Output of the full validation + dedup pipeline.
ValidationStats
Aggregate statistics.

Enums

ValidationStatus
Validation status for a candidate.

Functions

canonical_id_from_npm
Generate a canonical ID from an npm package name. Prefixed with npm: to distinguish from GitHub repos.
canonical_id_from_repo_slug
Generate a canonical ID from a GitHub repo slug (e.g. “owner/repo”).
canonical_id_from_repo_url
Extract a canonical ID from a GitHub repository URL. Returns owner/repo in lowercase, or None if not a GitHub URL.
chrono_now_iso
Return the current time as a simple ISO timestamp, without pulling in the chrono crate.
classify_from_evidence
Classify a candidate based on code-level evidence.
classify_source_content
Classify extension source content (raw TypeScript/JavaScript).
normalize_github_repo
Normalize a GitHub repo slug to lowercase owner/repo.
run_validation_pipeline
Run the full validation + dedup pipeline on all research sources.
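
As a rough illustration of how the canonical ID helpers and deduplication could fit together, here is a minimal sketch. The actual signatures are not shown on this page, so every function below uses a hypothetical `sketch_` name. The handling of `.git` suffixes and extra path segments, the lowercasing of npm names, and the choice to reject non-HTTP URL forms are all assumptions.

```rust
use std::collections::BTreeMap;

/// Lowercase an `owner/repo` slug. Returns `None` unless the slug has
/// exactly two non-empty parts. Stripping a trailing `.git` is an assumption.
pub fn sketch_normalize_slug(slug: &str) -> Option<String> {
    let trimmed = slug.trim().trim_end_matches('/');
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    let mut parts = trimmed.split('/');
    let owner = parts.next()?;
    let repo = parts.next()?;
    if parts.next().is_some() || owner.is_empty() || repo.is_empty() {
        return None;
    }
    Some(format!("{}/{}", owner.to_lowercase(), repo.to_lowercase()))
}

/// Extract `owner/repo` from an http(s) GitHub URL, or `None` otherwise.
pub fn sketch_id_from_repo_url(url: &str) -> Option<String> {
    // The result is lowercase anyway, so lowercase once up front.
    let lower = url.trim().to_lowercase();
    let rest = lower
        .strip_prefix("https://")
        .or_else(|| lower.strip_prefix("http://"))?;
    let rest = rest.strip_prefix("www.").unwrap_or(rest);
    let path = rest.strip_prefix("github.com/")?;
    // Drop any query string or fragment, then keep the first two segments.
    let path = path
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or(path);
    let mut segments = path.split('/').filter(|s| !s.is_empty());
    let owner = segments.next()?;
    let repo = segments.next()?;
    sketch_normalize_slug(&format!("{owner}/{repo}"))
}

/// Prefix an npm package name with `npm:` so it cannot collide with a
/// GitHub slug. Lowercasing the name is an assumption of this sketch.
pub fn sketch_id_from_npm(package: &str) -> String {
    format!("npm:{}", package.trim().to_lowercase())
}

/// Group candidate sources by canonical ID. Each input pair is
/// (source label, canonical ID); candidates sharing an ID are merged.
pub fn sketch_dedup<'a>(
    candidates: &[(&'a str, String)],
) -> BTreeMap<String, Vec<&'a str>> {
    let mut merged: BTreeMap<String, Vec<&'a str>> = BTreeMap::new();
    for (source, id) in candidates {
        merged.entry(id.clone()).or_default().push(*source);
    }
    merged
}
```

A `BTreeMap` is used here so that iteration order over the merged candidates is stable across runs, which fits a pipeline described as deterministic. Whether the real module uses an ordered map is not stated.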