§doiget-core
Core library for doiget: an Open-Access-first paper fetcher with strict capability gating, fail-closed provenance logging, and a BiblioFetch.jl-compatible store layout.
Phase 0 ships only this skeleton. Real implementations land in Phase 1.
See docs/PUBLIC_API.md for the semver-locked surface and docs/ARCHITECTURE.md
for the high-level design.
Re-exports§
pub use crate::canonical::CanonicalRef;
pub use crate::canonical::SourceType;
Modules§
- canonical
- Canonical-tuple audit identity for fetched papers (ADR-0021 §1, ADR-0024).
- dry_run
- Dry-run preview shape for --dry-run CLI fetches and the doiget_metadata_only / doiget_fetch_paper MCP tools.
- http
- Centralized HTTP client wrapper. All Source impls fetch through here.
- orchestrator
- Cross-source orchestrators that compose multiple Source impls into a single user-facing operation.
- provenance
- JSON Lines + SHA-256 hash-chained provenance log.
- rate_limiter
- Process-wide rate limiter for HTTP fetches across all Source impls.
- source
- Source abstraction. Each Tier 1/2/3 fetcher implements this trait.
- sources
- Source implementations.
- store
- Filesystem-backed metadata store.
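The provenance module is described as a hash-chained log. As a rough illustration of what hash chaining means, here is a minimal sketch. It is not doiget's implementation: the `Entry` type and the `append` and `verify` functions are hypothetical, and FNV-1a (64-bit) stands in for SHA-256 only so the example needs no external crate.

```rust
// Sketch only: FNV-1a (64-bit) stands in for SHA-256 so the example needs no
// external crate. It is NOT cryptographically secure.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical log entry: a payload plus the hash of the previous entry.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    prev_hash: u64,
    payload: String,
    hash: u64,
}

/// Hash of an entry covers both the previous hash and the payload.
fn entry_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut buf = prev_hash.to_be_bytes().to_vec();
    buf.extend_from_slice(payload.as_bytes());
    fnv1a64(&buf)
}

/// Append a payload, linking it to the last entry (or to 0 for the first).
fn append(log: &mut Vec<Entry>, payload: &str) {
    let prev_hash = log.last().map_or(0, |e| e.hash);
    let hash = entry_hash(prev_hash, payload);
    log.push(Entry { prev_hash, payload: payload.to_string(), hash });
}

/// Recompute every link; returns the index of the first broken entry, if any.
fn verify(log: &[Entry]) -> Option<usize> {
    let mut expected_prev = 0u64;
    for (i, e) in log.iter().enumerate() {
        if e.prev_hash != expected_prev || e.hash != entry_hash(e.prev_hash, &e.payload) {
            return Some(i);
        }
        expected_prev = e.hash;
    }
    None
}
```

Because each hash covers the previous one, editing or removing an earlier entry invalidates every link after it. That is the property a fail-closed log can check before appending.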
Structs§
- AlwaysOn
- Marker for the always-on Open Access tier. See docs/CAPABILITY.md.
- ArxivId
- A validated arXiv id string.
- CapabilityProfile
- Runtime gate for which sources may be invoked. See docs/CAPABILITY.md.
- DenialContext
- Structured machine-parseable companion to error.message for recoverable denials.
- Doi
- A validated DOI string.
- MetadataAccess
- Which Tier 2 metadata sources are enabled this session. See docs/CAPABILITY.md.
- RateLimits
- Process-wide rate limits. Hard-coded; not configurable.
- Safekey
- A filesystem-safe key derived deterministically from a Ref.
- TdmGrant
- A successful TDM grant.
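Several of the structs above (Doi, ArxivId) are described as validated strings. One common way to get that guarantee in Rust is a newtype that can only be built through a parsing constructor, sketched below. Everything here is an assumption for illustration: `DoiSketch`, `ParseErrorSketch`, the 256-byte cap, and the validation rules are hypothetical and are not taken from doiget's actual Doi::parse.

```rust
/// Hypothetical cap; the real DOI_SUFFIX_MAX_LEN value is not shown in these docs.
const SUFFIX_MAX_LEN_SKETCH: usize = 256;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ParseErrorSketch {
    MissingPrefix,
    MissingSuffix,
    SuffixTooLong,
}

/// Newtype that is only constructed through `parse`, so holding a value
/// implies the string already passed validation.
#[derive(Debug, Clone, PartialEq)]
struct DoiSketch(String);

impl DoiSketch {
    /// Assumed shape: "10.<registrant>/<suffix>".
    fn parse(input: &str) -> Result<Self, ParseErrorSketch> {
        let trimmed = input.trim();
        let (prefix, suffix) = trimmed
            .split_once('/')
            .ok_or(ParseErrorSketch::MissingSuffix)?;
        let registrant = prefix
            .strip_prefix("10.")
            .ok_or(ParseErrorSketch::MissingPrefix)?;
        if registrant.is_empty()
            || !registrant.chars().all(|c| c.is_ascii_digit() || c == '.')
        {
            return Err(ParseErrorSketch::MissingPrefix);
        }
        if suffix.is_empty() {
            return Err(ParseErrorSketch::MissingSuffix);
        }
        if suffix.len() > SUFFIX_MAX_LEN_SKETCH {
            return Err(ParseErrorSketch::SuffixTooLong);
        }
        Ok(DoiSketch(trimmed.to_string()))
    }

    /// Read-only access to the validated string.
    fn as_str(&self) -> &str {
        &self.0
    }
}
```

Checking the suffix length at parse time means downstream code, such as filesystem key derivation, never has to handle an unbounded string.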
Enums§
- CapabilityError
- Errors that can arise during CapabilityProfile::from_env.
- DenialReason
- Closed-set reasons a denial-class error envelope can carry on its optional denial_context.reason field.
- ErrorCode
- The closed set of error codes doiget surfaces.
- Ref
- A reference to a paper, either by DOI or arXiv id.
- RefParseError
- Reasons a Doi::parse / ArxivId::parse / Ref::parse call can fail.
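Ref is described as either a DOI or an arXiv id. The sketch below shows one plausible way such a two-variant enum could classify its input. The names `RefSketch`, `UnrecognizedRef`, and `parse_ref` are hypothetical, and the matching rules are simplified assumptions: only new-style arXiv ids without a version suffix are handled, and the DOI check is deliberately shallow. The real Ref::parse may behave differently.

```rust
/// Hypothetical stand-in for a DOI-or-arXiv reference.
#[derive(Debug, PartialEq)]
enum RefSketch {
    Doi(String),
    Arxiv(String),
}

/// Hypothetical error: the input matched neither shape.
#[derive(Debug, PartialEq)]
struct UnrecognizedRef;

/// Assumed new-style arXiv shape: four digits, a dot, then four or five digits.
fn looks_like_arxiv(s: &str) -> bool {
    match s.split_once('.') {
        Some((yymm, seq)) => {
            yymm.len() == 4
                && (4..=5).contains(&seq.len())
                && yymm.bytes().all(|b| b.is_ascii_digit())
                && seq.bytes().all(|b| b.is_ascii_digit())
        }
        None => false,
    }
}

/// Try the DOI shape first, then the arXiv shape (with an optional "arXiv:" prefix).
fn parse_ref(input: &str) -> Result<RefSketch, UnrecognizedRef> {
    let s = input.trim();
    if s.starts_with("10.") && s.contains('/') {
        return Ok(RefSketch::Doi(s.to_string()));
    }
    let id = s.strip_prefix("arXiv:").unwrap_or(s);
    if looks_like_arxiv(id) {
        return Ok(RefSketch::Arxiv(id.to_string()));
    }
    Err(UnrecognizedRef)
}
```

Returning a typed error rather than guessing a variant keeps malformed input from reaching any fetcher.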
Constants§
- CITATION_CACHE_TTL_DAYS
- Time-to-live for entries in ~/.cache/doiget/citations/. See docs/CACHE.md §3.
- DOI_SUFFIX_MAX_LEN
- Maximum DOI suffix length accepted at validation. See docs/SECURITY.md §1.1.
- MAX_BATCH_REFS
- Slice 2 alias for MCP_BATCH_MAX_SIZE using the spec-language name (docs/MCP_TOOLS.md §1 / Slice 2 plan). The numeric value MUST equal MCP_BATCH_MAX_SIZE; an internal test pins the equivalence so the two constants cannot drift.
- MAX_CONCURRENT_FETCHES
- Hard-coded rate limit. See docs/LEGAL.md §6 safeguard 8.
- MAX_FETCHES_PER_SECOND
- Hard-coded rate limit. See docs/LEGAL.md §6 safeguard 8.
- MCP_BATCH_MAX_SIZE
- Maximum batch size for doiget batch and doiget_batch_fetch.
- MCP_QUEUE_DEPTH_MAX
- Maximum queued MCP requests beyond MAX_CONCURRENT_FETCHES. Excess returns ErrorCode::RateLimited. See docs/SECURITY.md §1.4 / docs/MCP_TOOLS.md.
- MCP_STDIN_EOF_SHUTDOWN_SEC
- MCP server stdin-EOF graceful-shutdown deadline, in seconds. See ADR-0001 and docs/MCP_TOOLS.md §8.
- PDF_MAX_BYTES
- Maximum PDF body size accepted by the fetcher, in bytes. See docs/SECURITY.md §1.2 (Oversized PDF).
- RESOLVER_CACHE_TTL_DAYS
- Time-to-live for entries in ~/.cache/doiget/resolver/. See docs/CACHE.md §3.
- SCHEMA_VERSION
- TOML schema version this build writes. See docs/STORE.md §3.
- VERSION
- Crate version. Used by doiget-cli --version and doiget_health.
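Two of the constants above describe behaviour that is easier to see in code: MAX_BATCH_REFS must always equal MCP_BATCH_MAX_SIZE, and requests queued beyond MCP_QUEUE_DEPTH_MAX are rejected as rate limited. The sketch below illustrates both under stated assumptions. Every numeric value is a placeholder, since the real values are not given here. The `Admission` enum and `admit` function are hypothetical, and the compile-time assertion is just one way to pin the alias alongside the internal test the docs mention.

```rust
// All values below are placeholders; the real constants' values are not shown here.
const MCP_BATCH_MAX_SIZE: usize = 50;
const MAX_BATCH_REFS: usize = MCP_BATCH_MAX_SIZE;
const MAX_CONCURRENT_FETCHES: usize = 4;
const MCP_QUEUE_DEPTH_MAX: usize = 16;

// Compile-time pin: the build fails if the alias ever diverges.
const _: () = assert!(MAX_BATCH_REFS == MCP_BATCH_MAX_SIZE);

/// Hypothetical admission decision for an incoming request.
#[derive(Debug, PartialEq)]
enum Admission {
    /// A fetch slot is free; run immediately.
    RunNow,
    /// All slots are busy, but the queue has room.
    Queued,
    /// Slots and queue are both full; the caller gets a rate-limit error.
    RateLimited,
}

/// Decide what to do with a new request given current load.
fn admit(in_flight: usize, queued: usize) -> Admission {
    if in_flight < MAX_CONCURRENT_FETCHES {
        Admission::RunNow
    } else if queued < MCP_QUEUE_DEPTH_MAX {
        Admission::Queued
    } else {
        Admission::RateLimited
    }
}
```

Defining the alias in terms of the original constant already prevents drift in this sketch; the assertion matters when the two values are written out independently.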