Crate marc_record

This crate provides a way to parse MARC21 records. It supports normal MARC21 records encoded in either MARC-8 (for Latin-script languages) or Unicode, and tries to convert as much as possible into strings. It does little interpretation of the field data, so you will need to look up what each tag number means.

Info about the format can be found here: https://www.loc.gov/marc/bibliographic/

The general structure of a MARC record is as follows:

A file can contain many MARC records. Each record has the following parts:

  • a leader: a header that contains info about the structure of the record;
  • a directory: an index of the various fields;
  • fields, which can be either control fields or data fields.

All the fields have an identifying tag.
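The layout above can be sketched with a small, self-contained directory reader. This is not the crate's implementation: `Entry`, `digits`, `read_directory` and `SAMPLE` are hypothetical names used only for illustration. The byte offsets follow the MARC 21 record structure (a 24-byte leader, 12-byte directory entries, and `0x1E` as the field terminator).

```rust
/// One directory entry: a tag plus the location of its field in the data area.
#[derive(Debug, PartialEq)]
struct Entry {
    tag: String,
    length: usize,
    start: usize,
}

/// A tiny hand-built record (not one of the crate's sample files):
/// leader, two directory entries, then fields 001 and 245.
const SAMPLE: &[u8] =
    b"00066nam a2200049   4500001000600000245001000006\x1e12345\x1e10\x1faTitle\x1e\x1d";

/// Parse a run of ASCII digits into a number.
fn digits(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).ok()?.parse().ok()
}

/// Read the directory that sits between the leader and the data area.
fn read_directory(record: &[u8]) -> Option<Vec<Entry>> {
    const LEADER_LEN: usize = 24;
    const FIELD_TERMINATOR: u8 = 0x1E;

    // Leader bytes 12..17 hold the base address of the data area.
    let base = digits(record.get(12..17)?)?;
    let directory = record.get(LEADER_LEN..base)?;

    // The directory ends with a field terminator; the rest is 12-byte entries.
    let (&last, entries) = directory.split_last()?;
    if last != FIELD_TERMINATOR || entries.len() % 12 != 0 {
        return None;
    }

    entries
        .chunks_exact(12)
        .map(|e| {
            Some(Entry {
                tag: String::from_utf8(e[0..3].to_vec()).ok()?, // 3-character tag
                length: digits(&e[3..7])?,                      // 4-digit field length
                start: digits(&e[7..12])?,                      // 5-digit start offset
            })
        })
        .collect()
}
```

Calling `read_directory(SAMPLE)` on this hand-built record yields two entries, one for tag 001 and one for tag 245, each giving the field's length and its offset from the base address.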

Control fields simply contain ASCII data.

Each data field can have two single-character indicators, whose meaning depends on the field's tag.

Data fields also contain a list of subfields, each identified by a single ASCII character.
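As a rough illustration of how indicators and subfields sit inside a data field, here is a minimal sketch that splits the raw bytes of one field. It is not how the crate does it: `split_data_field` is a hypothetical helper, and it assumes the subfield data is UTF-8 (MARC-8 content would need to be converted first). The delimiters `0x1F` and `0x1E` come from the MARC 21 record structure.

```rust
const SUBFIELD_DELIMITER: u8 = 0x1F;
const FIELD_TERMINATOR: u8 = 0x1E;

/// Split the raw bytes of one data field into its two indicators and its
/// subfields. Returns `None` if the field is too short or not terminated.
fn split_data_field(field: &[u8]) -> Option<([char; 2], Vec<(char, String)>)> {
    // Every field ends with a field terminator.
    let (&last, body) = field.split_last()?;
    if last != FIELD_TERMINATOR || body.len() < 2 {
        return None;
    }

    // The first two bytes are the indicators.
    let (ind, rest) = body.split_at(2);
    let indicators = [ind[0] as char, ind[1] as char];

    // Each subfield starts with a delimiter followed by a one-byte code.
    let subfields = rest
        .split(|&b| b == SUBFIELD_DELIMITER)
        .skip(1) // nothing precedes the first delimiter
        .filter_map(|chunk| {
            let (&code, data) = chunk.split_first()?;
            // Assumption: the data is UTF-8; invalid bytes are replaced.
            Some((code as char, String::from_utf8_lossy(data).into_owned()))
        })
        .collect();

    Some((indicators, subfields))
}
```

For a title field such as `b"10\x1faTitle :\x1fbsubtitle\x1e"`, this returns the indicators `'1'` and `'0'` along with subfields `a` and `b`.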

The only entrypoint to the library is the parse_records function:

use marc_record::parse_records;

let binary_data = include_bytes!("../samples/marc8_multiple.mrc");
let records = parse_records(binary_data).unwrap();
assert_eq!(records.len(), 109);

Modules§

marc8
MARC-8 support for MARC records

Structs§

ControlField
The first fields of a MARC record represent control data. Unlike the variable data fields, they are simple blobs of ASCII content, although some of them are encoded with a specific scheme (for example, some are pipe-separated values).
DataField
One of the variable data fields representing the bulk of the information found in the MARC record. The tag along with the indicators (of which there are typically 2) help figure out the specific meaning of the data found within the subfields or blocks of content.
DirectoryEntry
FieldTag
Leader
The leader is MARC’s equivalent of a header. It contains internal bookkeeping info about the record as well as some information of interest to the applications reading it.
Record
A MARC record describes a piece of content using a series of fields. Fields are identified by a three-digit ASCII code (e.g. 001) and contain either control or field data. Control fields are those whose tag falls in the 000-099 range; all others are data fields. Control fields contain a single piece of information.
Subfield
SubfieldTag
A type representing the function of a block of content within a variable data field. Typically a single character.

Enums§

BibliographicalLevel
CatalogingForm
CodingScheme
ControlType
EncodingLevel
Error
Field
MultipartResourceRecordLevel
RecordType
Status

Functions§

parse_records
Parse a set of MARC records from bytes. This requires bytes because a MARC record can use various encodings, UTF-8 and MARC-8 being the most common ones. It assumes all the records are valid and complete, and it also fails if any extra content is found.
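To illustrate why stray bytes make parsing fail, here is a minimal sketch of splitting a buffer into raw records using the length stored in each leader. It is an assumption about how such strictness could work, not the crate's actual code: `split_records` is a hypothetical helper and reports errors as plain strings rather than the crate's `Error` type.

```rust
const RECORD_TERMINATOR: u8 = 0x1D;
const LEADER_LEN: usize = 24;

/// Split a buffer into raw records. Any bytes that do not form a complete
/// record (including trailing junk) produce an error.
fn split_records(mut data: &[u8]) -> Result<Vec<&[u8]>, String> {
    let mut records = Vec::new();
    while !data.is_empty() {
        // Leader bytes 0..5 hold the total record length as ASCII digits.
        let len: usize = data
            .get(0..5)
            .and_then(|d| std::str::from_utf8(d).ok())
            .and_then(|s| s.parse().ok())
            .ok_or("missing or malformed record length")?;

        if len < LEADER_LEN || len > data.len() {
            return Err(format!(
                "record length {len} does not fit the remaining {} bytes",
                data.len()
            ));
        }

        let (record, rest) = data.split_at(len);
        if record[len - 1] != RECORD_TERMINATOR {
            return Err("record does not end with a record terminator".to_string());
        }

        records.push(record);
        data = rest;
    }
    Ok(records)
}
```

With this approach, a buffer holding two well-formed records back to back splits cleanly, while the same buffer followed by a few extra bytes is rejected because those bytes cannot be read as another leader.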