commonmeta — a Rust port of front-matter/commonmeta.
Convert scholarly metadata between formats. The native model is Data;
format modules read into it and write out of it.
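The read-into / write-out-of arrangement described above can be sketched with a toy hub model. This is only an illustration under loud assumptions: `Record`, the `kv` and `tsv` formats, and `convert_toy` are all hypothetical stand-ins invented here, not the crate's `Data` type, its format modules, or its `convert` signature.

```rust
/// Hypothetical stand-in for a native model; the crate's real `Data` is far richer.
#[derive(Debug, Clone, PartialEq, Default)]
struct Record {
    id: String,
    title: String,
}

/// Toy reader for `key=value` lines (an invented format, for illustration only).
fn read_kv(input: &str) -> Record {
    let mut rec = Record::default();
    for line in input.lines() {
        if let Some((key, value)) = line.split_once('=') {
            match key.trim() {
                "id" => rec.id = value.trim().to_string(),
                "title" => rec.title = value.trim().to_string(),
                _ => {} // unknown keys are ignored
            }
        }
    }
    rec
}

/// Toy reader for a single `id<TAB>title` line (also invented).
fn read_tsv(input: &str) -> Record {
    let mut parts = input.trim_end().splitn(2, '\t');
    Record {
        id: parts.next().unwrap_or("").to_string(),
        title: parts.next().unwrap_or("").to_string(),
    }
}

fn write_kv(rec: &Record) -> String {
    format!("id={}\ntitle={}\n", rec.id, rec.title)
}

fn write_tsv(rec: &Record) -> String {
    format!("{}\t{}\n", rec.id, rec.title)
}

/// Every conversion goes through `Record`, so N formats need N readers and
/// N writers rather than N * N pairwise converters.
fn convert_toy(input: &str, from: &str, to: &str) -> Result<String, String> {
    let rec = match from {
        "kv" => read_kv(input),
        "tsv" => read_tsv(input),
        other => return Err(format!("unknown input format: {other}")),
    };
    match to {
        "kv" => Ok(write_kv(&rec)),
        "tsv" => Ok(write_tsv(&rec)),
        other => Err(format!("unknown output format: {other}")),
    }
}

fn main() {
    let out = convert_toy("id=10.1234/abc\ntitle=Example\n", "kv", "tsv");
    println!("{out:?}");
}
```

The point of the hub design is that adding a format means writing one reader and one writer against the shared model, without touching the other formats.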
Re-exports§
Modules§
- author_utils
- constants
- Controlled vocabularies and cross-format type/role translation tables.
- crockford
- Generate, encode and decode random base32 identifiers. This encoder/decoder:
- crossref
- data
- Core Commonmeta data model.
- doi_utils
- Utilities for working with DOIs
- error
- file_utils
- progress
- schema_utils
- JSON Schema and XSD validation utilities.
- spdx
- SPDX license vocabulary lookup.
- utils
- vocabularies
- Embedded controlled vocabulary data files.
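The crockford module in the list above is described as encoding and decoding base32 identifiers. As a rough sketch of what Crockford-style base32 involves, the following encodes and decodes a `u64` with the standard Crockford alphabet (no I, L, O, or U). The function names `encode` and `decode` are assumptions made for this example, not the module's actual API, and random generation and checksums are left out.

```rust
/// Crockford's base32 alphabet: digits plus letters, excluding I, L, O and U.
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Encode an unsigned integer as Crockford base32 (most significant digit first).
fn encode(mut n: u64) -> String {
    if n == 0 {
        return "0".to_string();
    }
    let mut out = Vec::new();
    while n > 0 {
        out.push(ALPHABET[(n % 32) as usize]);
        n /= 32;
    }
    out.reverse();
    String::from_utf8(out).expect("alphabet is ASCII")
}

/// Decode Crockford base32. Input is case-insensitive, hyphens are skipped,
/// and the look-alikes I/L and O are read as 1 and 0.
/// Returns None for an empty string, an invalid symbol, or overflow.
fn decode(s: &str) -> Option<u64> {
    if s.is_empty() {
        return None;
    }
    let mut n: u64 = 0;
    for c in s.chars() {
        if c == '-' {
            continue; // hyphens are allowed as visual separators
        }
        let c = match c.to_ascii_uppercase() {
            'I' | 'L' => '1',
            'O' => '0',
            other => other,
        };
        let digit = ALPHABET.iter().position(|&b| b as char == c)? as u64;
        n = n.checked_mul(32)?.checked_add(digit)?;
    }
    Some(n)
}

fn main() {
    let id = encode(1_234_567);
    println!("{id}");
    assert_eq!(decode(&id), Some(1_234_567));
}
```

Dropping the visually ambiguous letters is what makes this alphabet suitable for identifiers that people read aloud or retype.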
Structs§
- AffiliationMatch
- A single match result from the ROR affiliation API.
- PushResult
- The outcome of pushing a single record to InvenioRDM.
Constants§
Functions§
- convert
- Read from one format and write to another in a single call.
- convert_citation
- Like `convert`, but passes CSL `style` and `locale` through to the citation writer.
- fetch_vraix_dump
- Fetch commonmeta records from a VRAIX daily dump for `from` (“crossref” or “datacite”) and `date` (YYYY-MM-DD).
- match_ror_affiliation
- Match a free-text affiliation string against ROR organizations using the ROR v2 affiliation endpoint.
- push_inveniordm
- Create-or-update, then publish, a list of records in InvenioRDM.
- put_inveniordm
- Create-or-update, then publish, a single record in InvenioRDM.
- read
- Read a single record from `from` format, without writing it back out.
- read_parquet
- Read a list of commonmeta records back from the Parquet schema written by `write_parquet`. Lossless: each record is restored from its `json` column, the complete original serialization.
- read_sqlite_commonmeta
- Read records from a commonmeta SQLite database written by `write_sqlite`.
- read_vraix_sqlite
- Read commonmeta records from a VRAIX daily dump SQLite file already on disk at `sqlite_path`, e.g. an already-downloaded `crossref-2026-06-14.sqlite3`.
- stream_vraix_to_sqlite
- Stream a VRAIX daily dump at `input_path` directly to a commonmeta SQLite database at `output_path` in batches of 10 000 rows, converting with the `from`-specific parser and writing each batch in a single transaction. `limit` caps total records written; pass `0` for all rows. Returns the number of records written. No `Vec<Data>` is held for the whole file; peak memory is proportional to one batch, not the whole dump.
- write
- Write an already-loaded record to `to` format.
- write_archive
- Render `list` to `to` format, split into entries of at most `batch_size` records each, suitable for packing into an archive via `file_utils::write_zip_archive` / `file_utils::write_tar_gz_archive`. `base_name` (e.g. `"out.json"`) names the single entry directly when there’s only one batch, or gets a numbered suffix (`"out-00000.json"`, `"out-00001.json"`, …) when there are several.
- write_archive_citation
- Like `write_archive`, but passes CSL `style`/`locale` through to the citation writer when `to == "citation"`.
- write_list
- Render a list of records to `to` format as a single buffer: a JSON array for object-shaped formats (`commonmeta`, `csl`, `datacite`, `inveniordm`, `schemaorg`, `ror`), or newline-joined output for line/document-shaped formats (e.g. `bibtex`, `ris`, `crossref_xml`).
- write_list_citation
- Like `write_list`, but passes CSL `style`/`locale` through to the citation writer when `to == "citation"` (ignored for every other format, same as `convert_citation`/`write_citation`).
- write_parquet
- Write a list of commonmeta records as a single Parquet file. Alongside a flattened tabular projection of each record’s fields (for filtering in tools like DuckDB without parsing JSON), every row also carries a `json` column with the record’s complete serialization, so `read_parquet` round-trips losslessly.
- write_ror_json
- Write a ROR-derived record as raw ROR-shaped JSON (as opposed to `write("ror", data)`, which produces InvenioRDM vocabulary YAML).
- write_sqlite
- Write `list` as a SQLite3 database with a `works` table whose columns mirror the commonmeta v1.0 schema. Simple string fields are stored as TEXT; complex fields are stored as compact JSON TEXT.
- write_vraix_table_parquet
- Write a VRAIX dump’s transport table (e.g. `pid_records`) to a single Parquet file’s bytes, using its raw columns (`pid`, `source_id`, `raw_metadata`, …) as-is rather than converting them to commonmeta `Data` as `read_vraix_sqlite` does. For analytics over the dump itself (e.g. via DataFusion/Polars/DuckDB), not for ingesting it as commonmeta records. `batch_size` controls how many rows land in each internal Parquet row group (see [`formats::commonmeta::write_parquet_all`]’s analogous `ROW_GROUP_SIZE` for why this matters for large dumps).
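The entry-naming rule described for `write_archive` above (the base name used directly for a single batch, a numbered suffix for several) can be sketched as follows. This is a minimal standalone illustration, not the crate's code: the helper `entry_names` is hypothetical, and the handling of an empty list, a zero `batch_size`, and a base name without an extension are assumptions the documentation does not confirm.

```rust
/// Hypothetical helper: compute archive entry names for `record_count` records
/// split into batches of at most `batch_size`.
fn entry_names(base_name: &str, record_count: usize, batch_size: usize) -> Vec<String> {
    // Assumption: a batch size of 0 is treated as 1 to avoid dividing by zero.
    let batch_size = batch_size.max(1);
    // Ceiling division. Assumption: an empty list still yields one entry.
    let batches = ((record_count + batch_size - 1) / batch_size).max(1);

    if batches == 1 {
        // A single batch keeps the base name unchanged.
        return vec![base_name.to_string()];
    }

    // Split "out.json" into ("out", ".json"); names without a dot get no extension.
    let (stem, ext) = match base_name.rfind('.') {
        Some(i) => (&base_name[..i], &base_name[i..]),
        None => (base_name, ""),
    };

    // Zero-padded five-digit index, matching the "out-00000.json" pattern.
    (0..batches)
        .map(|i| format!("{stem}-{i:05}{ext}"))
        .collect()
}

fn main() {
    for name in entry_names("out.json", 25, 10) {
        println!("{name}");
    }
}
```

Zero-padding the index keeps the entries in batch order when an archive tool lists them alphabetically.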