This crate provides the means to parse MARC21 records. It supports normal MARC21 records encoded in either MARC-8 (for Latin-script languages) or Unicode, and tries to transform as much as possible into strings. It doesn’t interpret the field data much, so you will need to look up the meaning of each tag number yourself.
Info about the format can be found here: https://www.loc.gov/marc/bibliographic/
The general structure of a MARC record is as follows:
A file can contain many MARC records. Each record has the following parts:
- a leader: a header that contains info about the structure of the record;
- a directory: an index of the various fields;
- fields, which can either be control fields or data fields.
All the fields have an identifying tag.
Control fields simply contain ASCII data.
Each data field can have a two-character set of indicators, whose meaning depends on the field’s tag.
Data fields also contain a list of subfields, each identified by a single ASCII character.
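To make the layout above concrete, here is a minimal sketch of how the leader and directory of one raw record could be decoded by hand. It is not this crate's implementation: `RawEntry`, `ascii_number` and `read_directory` are hypothetical names introduced for illustration. The byte offsets are assumptions taken from the MARC 21 record structure: a 24-byte leader, the base address of data in leader positions 12-16, and 12-byte directory entries followed by a single 0x1E field terminator.

```rust
/// Hypothetical illustration type, not part of this crate's API.
#[derive(Debug, PartialEq)]
struct RawEntry {
    tag: String,   // three-character field tag, e.g. "245"
    length: usize, // field length in bytes, including its terminator
    start: usize,  // offset of the field relative to the base address
}

const LEADER_LEN: usize = 24;
const ENTRY_LEN: usize = 12;
const FIELD_TERMINATOR: u8 = 0x1E;

/// Parse a run of ASCII digits into a number, or return None.
fn ascii_number(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).ok()?.parse().ok()
}

/// Read the directory of a single raw record.
/// Returns None if the leader or directory is malformed.
fn read_directory(record: &[u8]) -> Option<Vec<RawEntry>> {
    // Leader positions 12-16 hold the base address of the data.
    let base = ascii_number(record.get(12..17)?)?;
    // The directory sits between the leader and the base address and
    // is closed by one field terminator byte.
    let directory = record.get(LEADER_LEN..base.checked_sub(1)?)?;
    if record.get(base - 1) != Some(&FIELD_TERMINATOR) {
        return None;
    }
    if directory.len() % ENTRY_LEN != 0 {
        return None;
    }
    directory
        .chunks(ENTRY_LEN)
        .map(|entry| {
            Some(RawEntry {
                tag: std::str::from_utf8(&entry[0..3]).ok()?.to_string(),
                length: ascii_number(&entry[3..7])?,
                start: ascii_number(&entry[7..12])?,
            })
        })
        .collect()
}
```

Each entry then locates one field: its bytes start at `base + start` and span `length` bytes, the last of which is the field terminator.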
The only entrypoint to the library is the parse_records function:
use marc_record::parse_records;
let binary_data = include_bytes!("../samples/marc8_multiple.mrc");
let records = parse_records(binary_data).unwrap();
assert_eq!(records.len(), 109);
Modules§
- marc8
- MARC-8 support for MARC records
Structs§
- ControlField
- The first fields of a MARC record represent control data. Unlike the variable data fields, they are simple blobs of ASCII content, although some of them are encoded with a specific scheme (for example, some of them are pipe-separated values).
- DataField
- One of the variable data fields representing the bulk of the information found in the MARC record. The tag along with the indicators (of which there are typically 2) help figure out the specific meaning of the data found within the subfields or blocks of content.
- DirectoryEntry
- FieldTag
- Leader
- The leader is MARC’s equivalent of a header. It contains internal bookkeeping info about the record as well as some information of interest to the applications reading it.
- Record
- A MARC record describes a piece of content, or part of one, using a series of fields. Fields are identified by a three-digit ASCII code (e.g. 001) and contain either control data or variable data. Control fields are identified by being in the 000-099 range while data fields are all the others. Control fields contain a single piece of information.
- Subfield
- SubfieldTag
- A type representing the function of a block of content within a variable data field. Typically a single character.
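As an illustration of how indicators and subfields sit inside a data field, the sketch below splits the raw bytes of one field by hand. It does not use or reproduce this crate's `DataField` or `Subfield` types: `split_data_field` is a hypothetical helper, and the delimiter values (0x1F before each subfield code, 0x1E at the end of the field) are assumptions taken from the MARC 21 record structure.

```rust
const SUBFIELD_DELIMITER: u8 = 0x1F;
const FIELD_TERMINATOR: u8 = 0x1E;

/// Split the raw bytes of one data field into its two indicators and
/// a list of (subfield code, subfield bytes) pairs.
/// Hypothetical helper for illustration only.
fn split_data_field(raw: &[u8]) -> Option<([u8; 2], Vec<(u8, &[u8])>)> {
    // Drop the trailing field terminator if it is present.
    let raw = match raw.split_last() {
        Some((&FIELD_TERMINATOR, head)) => head,
        _ => raw,
    };
    if raw.len() < 2 {
        return None; // not enough room for the two indicators
    }
    let (indicators, rest) = raw.split_at(2);
    let mut subfields = Vec::new();
    // Every subfield starts with the delimiter followed by a one-byte code.
    // The piece before the first delimiter is skipped.
    for chunk in rest.split(|&b| b == SUBFIELD_DELIMITER).skip(1) {
        // A delimiter with no code after it makes the field invalid.
        let (&code, data) = chunk.split_first()?;
        subfields.push((code, data));
    }
    Some(([indicators[0], indicators[1]], subfields))
}
```

The subfield data is returned as bytes rather than strings because decoding depends on whether the record uses MARC-8 or Unicode.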
Enums§
- BibliographicalLevel
- CatalogingForm
- CodingScheme
- ControlType
- EncodingLevel
- Error
- Field
- MultipartResourceRecordLevel
- RecordType
- Status
Functions§
- parse_records
- Parse a set of MARC records from bytes. This requires bytes because a MARC record can have various encodings, UTF-8 and MARC-8 being the most common ones. It assumes all the records are valid and complete and will also fail if any extra content is found.
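The following sketch shows, under stated assumptions, why the input has to be bytes and what "extra content" can mean in practice. It is not how `parse_records` is implemented: `Coding`, `coding_of` and `split_records` are hypothetical names, the sketch finds record boundaries by scanning for the 0x1D record terminator instead of reading the record length from the leader, and it relies on the MARC 21 convention that leader position 9 is blank for MARC-8 and `a` for Unicode.

```rust
const RECORD_TERMINATOR: u8 = 0x1D;

/// Hypothetical stand-in for illustration, not the crate's CodingScheme.
#[derive(Debug, PartialEq)]
enum Coding {
    Marc8,
    Unicode,
    Unknown(u8),
}

/// Look at leader position 9 to find the character coding scheme.
fn coding_of(record: &[u8]) -> Option<Coding> {
    match *record.get(9)? {
        b' ' => Some(Coding::Marc8),
        b'a' => Some(Coding::Unicode),
        other => Some(Coding::Unknown(other)),
    }
}

/// Split a buffer into raw records, each ending with the record
/// terminator. Returns None if bytes remain after the last terminator.
fn split_records(data: &[u8]) -> Option<Vec<&[u8]>> {
    let mut records = Vec::new();
    let mut rest = data;
    while !rest.is_empty() {
        // No terminator in the remaining bytes means trailing content.
        let end = rest.iter().position(|&b| b == RECORD_TERMINATOR)?;
        let (record, tail) = rest.split_at(end + 1);
        records.push(record);
        rest = tail;
    }
    Some(records)
}
```

Because the coding scheme is only known after the leader has been read, the text of a record cannot be treated as a Rust `str` up front, which is consistent with the function taking bytes.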