indexkit – index constituent service for Rust.
Daily / monthly snapshots of the S&P 500, S&P MidCap 400, S&P SmallCap 600, Nasdaq-100, and Dow Jones Industrial Average, served from bundled parquet files with runtime GitHub fetch and local cache. No API keys. Offline after the first successful fetch.
Data sources are layered by priority (see types::DataSource):
sponsor CDNs (iShares / Invesco / SPDR) > OSS GitHub mirrors
(fja05680/sp500, yfiua/index-constituents, hanshof/sp500_constituents)
> Internet Archive Wayback > SEC EDGAR N-PORT.
The S&P 500 has daily rows from 1996-01-02 onward via the GitHub
mirrors; the other indices still start at 2019-11 via N-PORT, with
sponsor-CDN data extending coverage going forward.
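The priority ordering above can be illustrated with a small, self-contained sketch. It does not use the crate's own types: `Source`, `Row`, and `pick_by_priority` are hypothetical names invented for illustration, and the real merge logic (see the coalesce module) may work quite differently, for example per snapshot rather than per ticker.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `types::DataSource`; variant names are assumed.
/// Variants are declared from highest to lowest priority, so the derived
/// `Ord` treats earlier variants as "smaller", that is, preferred.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Source {
    SponsorCdn,
    GithubMirror,
    Wayback,
    SecNport,
}

/// One row as seen by the merge step: a ticker plus where it came from.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    ticker: String,
    source: Source,
}

/// Keep exactly one row per ticker, preferring the highest-priority source.
/// Output is sorted by ticker because a `BTreeMap` is used internally.
fn pick_by_priority(rows: Vec<Row>) -> Vec<Row> {
    let mut best: BTreeMap<String, Row> = BTreeMap::new();
    for row in rows {
        // An existing row from an equal or better source wins.
        let keep_existing = best
            .get(&row.ticker)
            .map_or(false, |existing| existing.source <= row.source);
        if !keep_existing {
            best.insert(row.ticker.clone(), row);
        }
    }
    best.into_values().collect()
}
```

Encoding the priority in the enum's declaration order keeps the comparison to a single `<=`, at the cost of making variant order significant.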
§Quick start – one-off scripts
    use indexkit::{ym, IndexId};

    #[tokio::main]
    async fn main() -> indexkit::Result<()> {
        // Free functions -- no client setup needed
        let sp500 = indexkit::sp500_latest().await?;
        let ndx = indexkit::constituents_for(IndexId::Ndx, ym!(2024, 1)).await?;
        println!("S&P 500 latest: {} holdings", sp500.len());
        println!("Top: {} at {:.2}%", sp500[0].name, sp500[0].weight * 100.0);
        println!("NDX Jan 2024: {} holdings", ndx.len());
        Ok(())
    }

§Client pattern – connection pool + cache reuse
    use indexkit::{Indexkit, ym, YearMonth};

    #[tokio::main]
    async fn main() -> indexkit::Result<()> {
        let client = Indexkit::new(); // infallible, no ?
        // Any month form works -- no chrono import needed
        let a = client.sp500("2024-01").await?;
        let b = client.sp500(202401u32).await?;
        let c = client.sp500((2024i32, 1u32)).await?;
        let d = client.sp500(ym!(2024, 1)).await?;
        let e = client.sp500(YearMonth::new(2024, 1)?).await?;
        // All equivalent
        assert_eq!(a.len(), b.len());
        assert_eq!(c.len(), d.len());
        let _ = e;
        Ok(())
    }

§Major types
- Indexkit – stateful client; create once, call many times.
- YearMonth – year-month newtype; accepts strings, integers, tuples.
- Constituent – one holding.
- IndexSnapshot – constituents + metadata for one month.
- IndexId – typed index identifier (Sp500, Sp400, Sp600, Ndx, Dji).
- Error – unified error type; match on this, never on sub-types.
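As a rough illustration of how one method can accept strings, integers, and tuples as month inputs, the sketch below uses a conversion trait. `Month`, `IntoMonth`, and `normalize` are hypothetical names for this example only; the crate's actual `YearMonth` and `IntoYearMonth` definitions, and their error type, are not shown in this page and may differ.

```rust
/// Hypothetical year-month pair; the crate's `YearMonth` may be laid out differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Month {
    year: i32,
    month: u32,
}

impl Month {
    /// Fallible constructor: rejects months outside 1..=12.
    fn new(year: i32, month: u32) -> Result<Self, String> {
        if (1..=12).contains(&month) {
            Ok(Month { year, month })
        } else {
            Err(format!("month out of range: {}", month))
        }
    }
}

/// Hypothetical conversion trait playing the role of `IntoYearMonth`.
trait IntoMonth {
    fn into_month(self) -> Result<Month, String>;
}

impl IntoMonth for Month {
    fn into_month(self) -> Result<Month, String> {
        Ok(self)
    }
}

impl IntoMonth for (i32, u32) {
    fn into_month(self) -> Result<Month, String> {
        Month::new(self.0, self.1)
    }
}

impl IntoMonth for u32 {
    // Interprets 202401 as year 2024, month 01.
    fn into_month(self) -> Result<Month, String> {
        Month::new((self / 100) as i32, self % 100)
    }
}

impl IntoMonth for &str {
    // Parses the "YYYY-MM" form.
    fn into_month(self) -> Result<Month, String> {
        let (y, m) = self
            .split_once('-')
            .ok_or_else(|| format!("expected YYYY-MM, got {}", self))?;
        let year = y.parse::<i32>().map_err(|e| e.to_string())?;
        let month = m.parse::<u32>().map_err(|e| e.to_string())?;
        Month::new(year, month)
    }
}

/// A function generic over the trait accepts every form above.
fn normalize(input: impl IntoMonth) -> Result<Month, String> {
    input.into_month()
}
```

With these impls, `normalize("2024-01")`, `normalize(202401u32)`, and `normalize((2024i32, 1u32))` all resolve to the same `Month` value, which is the property the client example relies on.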
§Environment overrides
| Variable | Effect |
|---|---|
| INDEXKIT_BASE_URL | Replace the GitHub raw origin URL |
| INDEXKIT_CACHE_DIR | Override ~/.cache/indexkit/ |
| INDEXKIT_MIRROR_URL | CDN mirror fallback URL (default: jsDelivr) |
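A minimal sketch of how an override such as INDEXKIT_CACHE_DIR might be resolved against the default location follows. The function names are hypothetical, and the handling of empty values and of a missing home directory are assumptions of this sketch, not documented behaviour of the crate.

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the cache directory: an explicit override wins, otherwise fall
/// back to `<home>/.cache/indexkit`. Taking both inputs as arguments keeps
/// the logic testable without mutating the process environment.
fn resolve_cache_dir(override_dir: Option<String>, home: Option<String>) -> Option<PathBuf> {
    match override_dir {
        // Treating an empty value as "unset" is an assumption of this sketch.
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        _ => home.map(|h| PathBuf::from(h).join(".cache").join("indexkit")),
    }
}

/// Read the real environment and delegate to the pure function above.
fn cache_dir_from_env() -> Option<PathBuf> {
    resolve_cache_dir(
        env::var("INDEXKIT_CACHE_DIR").ok(),
        env::var("HOME").ok(),
    )
}
```

From a shell, the override would then be supplied in the usual way, for example by exporting INDEXKIT_CACHE_DIR before launching the program.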
§Field coverage per source
Which columns a given Constituent carries depends on the row’s
DataSource. Sponsor-CDN / Wayback / N-PORT rows are full-field
(weight, shares, market value, CUSIP). GitHub mirror rows
(DataSource::GithubFja05680, DataSource::GithubYfiua,
DataSource::GithubHanshof) are ticker-only: weight is
f64::NAN, cusip is empty, shares / market_value_usd are 0.0.
Use Constituent::weight_opt for an Option<f64> accessor that
returns None on NaN.
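The sketch below shows why an Option-returning accessor is convenient when full-field and ticker-only rows are mixed. `Holding` is a hypothetical, trimmed-down stand-in for Constituent, and its `weight_opt` only mirrors the behaviour described above (None on NaN); `total_known_weight` is an illustrative helper, not part of the crate.

```rust
/// Hypothetical, trimmed-down stand-in for `Constituent`.
struct Holding {
    ticker: String,
    /// `f64::NAN` when the source carries no weight.
    weight: f64,
}

impl Holding {
    /// Mirrors the documented behaviour of `weight_opt`: `None` on NaN.
    fn weight_opt(&self) -> Option<f64> {
        if self.weight.is_nan() {
            None
        } else {
            Some(self.weight)
        }
    }
}

/// Sum the weights that are present and list the tickers that lack one.
/// Summing `weight` directly would instead poison the total with NaN.
fn total_known_weight(rows: &[Holding]) -> (f64, Vec<&str>) {
    let mut total = 0.0;
    let mut missing = Vec::new();
    for row in rows {
        match row.weight_opt() {
            Some(w) => total += w,
            None => missing.push(row.ticker.as_str()),
        }
    }
    (total, missing)
}
```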
§Limitations (v1.0.x)
- No ticker from N-PORT: SEC N-PORT does not include ticker symbols. SecNport rows set Constituent::ticker to None. GitHub mirror rows populate ticker.
- No weight/shares from GitHub mirrors: the three GitHub OSS mirrors are ticker-only. They provide composition history but no per-holding weight vector.
- No GICS sector: reserved for v1.1 via SIC -> GICS cross-walk.
- 60-90 day filing lag for N-PORT: unavoidable regulatory constraint. GitHub mirrors and sponsor-CDN close the recency gap.
§Modules
- client – Indexkit async client with blocking wrappers.
- date – YearMonth newtype for month inputs.
- types – Constituent, IndexSnapshot, IndexId, types::DataSource.
- github_mirror – OSS GitHub CSV fetchers (fja05680, yfiua, hanshof) with ticker parsers and forward-fill helper.
- nport – N-PORT primary_doc.xml parser.
- sponsor – sponsor-CDN CSV parsers.
- wayback – Internet Archive CDX + snapshot client.
- cik – ETF -> CIK / series mapping (verified against live SEC).
- parquet_io – parquet writer + reader.
- sec – SEC EDGAR client used by the CLI for backfill.
- coalesce – merge rows from multiple sources into one snapshot.
- error – unified Error enum and Result alias.
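The github_mirror entry mentions a forward-fill helper. As a generic illustration of that technique (not the crate's implementation, whose signature is not shown here), the sketch below fills each requested day with the most recent snapshot dated on or before it. The `forward_fill` name and the use of plain integers as day keys are assumptions made to keep the example dependency-free.

```rust
use std::collections::BTreeMap;

/// Forward-fill sparse snapshots: for each requested day, use the most
/// recent snapshot dated on or before that day. Days before the first
/// snapshot yield `None`.
fn forward_fill(
    snapshots: &BTreeMap<u32, Vec<String>>,
    days: &[u32],
) -> Vec<(u32, Option<Vec<String>>)> {
    days.iter()
        .map(|&day| {
            // `range(..=day).next_back()` finds the entry with the latest key <= day.
            let latest = snapshots
                .range(..=day)
                .next_back()
                .map(|(_, tickers)| tickers.clone());
            (day, latest)
        })
        .collect()
}
```

A sorted map makes the "latest key not after this day" lookup logarithmic per query; a real implementation would likely use a proper date type as the key.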
Re-exports§
pub use client::Indexkit;
pub use date::IntoYearMonth;
pub use date::YearMonth;
pub use date::YearMonthError;
pub use error::Error;
pub use error::IndexkitError;
pub use error::Result;
pub use nport::holdings_to_constituents;
pub use nport::parse_nport;
pub use nport::NportFiling;
pub use nport::NportHeader;
pub use nport::RawHolding;
pub use sec::FilingRef;
pub use sec::SecClient;
pub use sponsor::parse_invesco_csv;
pub use sponsor::SponsorClient;
pub use types::Constituent;
pub use types::DailySnapshot;
pub use types::DataSource;
pub use types::IndexId;
pub use types::IndexSnapshot;
pub use types::Resolution;
pub use types::Sector;
pub use wayback::WaybackClient;
pub use wayback::WaybackMatch;
Modules§
- cik: ETF -> CIK / series mapping for the five supported indices.
- client: Stateful Indexkit client – flat async endpoint methods.
- coalesce: Merge rows from multiple sources into a single coherent snapshot.
- date: YearMonth – month resolution for index constituent snapshots.
- error: Unified error type for indexkit.
- github_mirror: GitHub-mirror historical constituent ingestion.
- nport: SEC N-PORT primary_doc.xml parser.
- parquet_io: Parquet reader / writer for monthly index constituent snapshots.
- sec: SEC EDGAR client for N-PORT filings.
- sponsor: Sponsor-CDN holdings-file parsers (iShares, Invesco, SPDR) and the Internet Archive Wayback Machine bridge.
- types: Core domain types – Constituent, IndexSnapshot, IndexId, DataSource, Resolution.
- wayback: Internet Archive Wayback Machine client.
Macros§
Functions§
- constituents_for: Constituents for any index at any month (uses shared global client).
- dji_latest: Latest DJIA snapshot (uses shared global client).
- ndx_latest: Latest Nasdaq-100 snapshot (uses shared global client).
- sp500_latest: Latest S&P 500 snapshot (uses shared global client).
- sp500_tickers_latest: Latest S&P 500 ticker list (uses shared global client).
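Each free function above is described as using a shared global client. One common way to get that behaviour in Rust is a lazily initialised static; the sketch below shows the pattern with `std::sync::OnceLock`. `Client`, `CONSTRUCTED`, `global_client`, and `client_id` are hypothetical names for illustration, and whether the crate actually uses `OnceLock` is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a client that is expensive to construct.
struct Client {
    id: usize,
}

/// Counts constructions so the single-initialisation property can be observed.
static CONSTRUCTED: AtomicUsize = AtomicUsize::new(0);

/// Lazily create one process-wide client and hand out a shared reference.
/// `get_or_init` runs the closure at most once, even across threads.
fn global_client() -> &'static Client {
    static CLIENT: OnceLock<Client> = OnceLock::new();
    CLIENT.get_or_init(|| Client {
        id: CONSTRUCTED.fetch_add(1, Ordering::SeqCst),
    })
}

/// A free function in the style of `sp500_latest`: no client argument needed.
fn client_id() -> usize {
    global_client().id
}
```

Every call path goes through the same instance, so connection pools and caches held by the client are reused without the caller managing any state.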