marc-rs
A Rust library for parsing and writing MARC21, UNIMARC, and MARC XML bibliographic records.
Features
- Support for MARC21, UNIMARC, and MARC XML formats
- Multiple character encodings (UTF-8, MARC-8, ISO-8859-1/2/5/7/15, ISO-5426)
- Parse multiple records from a single buffer
- Write single or multiple records
- Optional Serde support for serialization/deserialization
- Comprehensive field type enums organized by category
Installation
Add this to your Cargo.toml:
[dependencies]
marc-rs = "0.1.0"
# Optional: enable Serde support (use this line instead of the one above)
marc-rs = { version = "0.1.0", features = ["serde"] }
Usage
Parsing MARC21 Records
use ;
let data = b"..."; // MARC binary data
let format_encoding = new;
let records = parse?;
for record in records
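To illustrate what parsing several records from one buffer involves, here is a minimal sketch that splits concatenated binary records using the five-digit record length at the start of each 24-byte leader. The function `split_records` is a hypothetical helper written for this example, not part of the marc-rs API, and it does no validation beyond the length field.

```rust
/// Split a buffer of concatenated binary MARC (ISO 2709) records into slices.
/// Hypothetical helper for illustration only; not the marc-rs API.
fn split_records(buf: &[u8]) -> Result<Vec<&[u8]>, String> {
    let mut records = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        // Every record starts with a 24-byte leader.
        let leader = buf.get(pos..pos + 24).ok_or("truncated leader")?;
        // Leader bytes 0..5 hold the total record length as ASCII digits.
        let len_str = std::str::from_utf8(&leader[..5]).map_err(|e| e.to_string())?;
        let len: usize = len_str
            .parse()
            .map_err(|_| format!("bad record length {:?}", len_str))?;
        if len < 24 {
            return Err(format!("record length {} is shorter than a leader", len));
        }
        // The length counts the whole record, leader included.
        let record = buf.get(pos..pos + len).ok_or("truncated record")?;
        records.push(record);
        pos += len;
    }
    Ok(records)
}
```

Each returned slice covers one complete record, so it could then be handed to a single-record parser.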
Writing MARC XML
use ;
use std::io::stdout;
let records = vec!;
let format_encoding = marc_xml;
write?;
Using Field Enums
use ;
// Tags depend on the format
let format = Marc21;
let main_entry_tag = PersonalName.tag; // "100" in MARC21, "700" in UNIMARC
let title_tag = TitleStatement.tag; // "245" in MARC21, "200" in UNIMARC
let subject_tag = SubjectTopicalTerm.tag; // "650" in MARC21, "606" in UNIMARC
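The idea of a field enum whose tag depends on the record format can be sketched as follows. The types `Format` and `Field` and the `tag` method signature are assumptions made for this example and may differ from the crate's actual definitions. The tag values are the ones listed in the comments above.

```rust
/// Record format (hypothetical stand-in for the crate's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Marc21,
    Unimarc,
}

/// A few logical fields (hypothetical stand-in for the crate's enums).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Field {
    PersonalName,
    TitleStatement,
    SubjectTopicalTerm,
}

impl Field {
    /// Return the three-character tag used for this field in the given format.
    fn tag(self, format: Format) -> &'static str {
        match (self, format) {
            (Field::PersonalName, Format::Marc21) => "100",
            (Field::PersonalName, Format::Unimarc) => "700",
            (Field::TitleStatement, Format::Marc21) => "245",
            (Field::TitleStatement, Format::Unimarc) => "200",
            (Field::SubjectTopicalTerm, Format::Marc21) => "650",
            (Field::SubjectTopicalTerm, Format::Unimarc) => "606",
        }
    }
}
```

Matching on the `(field, format)` pair keeps the mapping exhaustive, so adding a new format or field without its tags becomes a compile error.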
Serde Support
With the serde feature enabled, you can serialize/deserialize directly to/from MARC formats:
use ;
use std::fs::File;
// Deserialize from bytes
let data = b"..."; // MARC binary data
let format = new;
let record = from_slice?;
// Deserialize from reader
let file = open?;
let record = from_reader?;
// Deserialize from string (for XML)
let xml = r#"<?xml version="1.0"?><record>...</record>"#;
let xml_format = marc_xml;
let record = from_str?;
// Serialize to bytes
let bytes = to_vec?;
// Serialize to writer
let mut output = Vec::new();
to_writer?;
// Serialize to string (for XML)
let xml_string = to_string?;
// Multiple records
let records = from_slice_many?;
let bytes = to_records?;
// Or use JSON for cross-format serialization
use serde_json;
let json = serde_json::to_string(&record)?;
let deserialized: Record = serde_json::from_str(&json)?;
Format Support
MARC21
- Binary format parsing and writing
- XML format parsing and writing
- Default encoding: MARC-8
UNIMARC
- Binary format parsing and writing
- XML format parsing and writing
- Default encoding: UTF-8
MARC XML
- Full XML parsing with collection support
- XML writing with automatic collection wrapping for multiple records
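Collection wrapping for multiple records can be pictured with the sketch below. It assumes each record has already been serialized to a `<record>` element string. The function `wrap_records` is hypothetical and only illustrates the wrapping rule; it is not how marc-rs necessarily implements it. The namespace is the standard MARC XML (MARC21 slim) namespace.

```rust
/// Standard MARC XML namespace.
const MARCXML_NS: &str = "http://www.loc.gov/MARC21/slim";

/// Wrap already-serialized <record> elements for output.
/// Hypothetical helper: a single record is returned unchanged,
/// any other count is wrapped in a <collection> element.
fn wrap_records(records: &[&str]) -> String {
    if let [single] = records {
        return single.to_string();
    }
    let mut out = format!("<collection xmlns=\"{}\">\n", MARCXML_NS);
    for record in records {
        out.push_str(record);
        out.push('\n');
    }
    out.push_str("</collection>");
    out
}
```

In this sketch an empty slice also produces an empty `<collection>`, which is one possible choice for the zero-record case.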
Character Encodings
Supported encodings:
- UTF-8
- MARC-8 (falls back to ISO-8859-1)
- ISO-8859-1 (Latin-1)
- ISO-8859-2 (Latin-2)
- ISO-8859-5 (Cyrillic)
- ISO-8859-7 (Greek)
- ISO-8859-15 (Latin-9)
- ISO-5426 (Extension of the Latin alphabet for bibliographic information interchange)
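As an illustration of the simplest of these encodings, the sketch below decodes ISO-8859-1 bytes into a Rust `String`. It relies on the fact that every ISO-8859-1 byte value equals the Unicode code point of the same number. The function name `decode_latin1` is made up for this example and is not a marc-rs function.

```rust
/// Decode ISO-8859-1 (Latin-1) bytes into a UTF-8 `String`.
/// Hypothetical helper for illustration; each byte 0x00..=0xFF maps
/// directly to the Unicode code point with the same value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}
```

The other ISO-8859 variants and ISO-5426 need lookup tables, because their upper byte range does not line up with Unicode code points in this way.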
Field Categories
The library provides enums for different field categories:
- Main Entry (1XX): Personal names, corporate names, meeting names, uniform titles
- Title (20X-24X): Title statements, varying forms, former titles
- Edition (25X): Edition statements, cartographic data, computer file characteristics
- Physical Description (3XX): Physical descriptions, playing time, publication frequency
- Series (4XX): Series statements and added entries
- Notes (5XX): General notes, contents notes, summary, etc.
- Subject Access (6XX): Subject headings, topical terms, geographic names
- Added Entries (70X-75X): Personal names, corporate names, uniform titles
- Linking Entries (76X-78X): Series entries, translation entries, related entries
- Control Fields (00X): Control numbers, fixed-length data elements
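The tag ranges above can be turned into a small classifier. This is a sketch for MARC21 tags only, and the `Category` enum and `category` function are names invented for this example rather than the crate's own enums.

```rust
/// Field categories from the list above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    ControlFields,
    MainEntry,
    Title,
    Edition,
    PhysicalDescription,
    Series,
    Notes,
    SubjectAccess,
    AddedEntries,
    LinkingEntries,
}

/// Classify a three-digit MARC21 tag, or return `None` if the tag is
/// malformed or falls outside the ranges listed above.
fn category(tag: &str) -> Option<Category> {
    if tag.len() != 3 || !tag.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let n: u16 = tag.parse().ok()?;
    match n {
        1..=9 => Some(Category::ControlFields),
        100..=199 => Some(Category::MainEntry),
        200..=249 => Some(Category::Title),
        250..=259 => Some(Category::Edition),
        300..=399 => Some(Category::PhysicalDescription),
        400..=499 => Some(Category::Series),
        500..=599 => Some(Category::Notes),
        600..=699 => Some(Category::SubjectAccess),
        700..=759 => Some(Category::AddedEntries),
        760..=789 => Some(Category::LinkingEntries),
        _ => None,
    }
}
```

For example, `category("245")` yields `Some(Category::Title)`, while a locally defined tag such as `"900"` yields `None`.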
Command Line Tool
The crate includes a command-line viewer for inspecting MARC files:
# Build the viewer (requires serde feature)
# View a MARC file (auto-detect format, plain output)
# Specify format explicitly
# Specify format and encoding
# Output in JSON format
# Output in XML format
# Output in MARC21 binary format
# Output in UNIMARC binary format
The viewer supports five output formats:
- plain (default): Human-readable text format with leader, control fields, and data fields
- json: JSON serialization using serde_json
- xml: MARC XML format using serde_marc
- marc or marc21: MARC21 binary format using serde_marc (outputs to stdout)
- unimarc: UNIMARC binary format using serde_marc (outputs to stdout)
The plain format displays:
- File information and detected format
- Leader information
- All control fields (001-009)
- All data fields with indicators and subfields
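The leader information shown in plain output comes from the fixed 24-byte leader at the start of each binary record. The sketch below pulls out a few MARC21 leader positions (record length, record status, type of record, base address of data). `LeaderSummary` and `summarize_leader` are hypothetical names for this example and do not describe the viewer's actual code.

```rust
/// A few values read from a MARC21 leader (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct LeaderSummary {
    record_length: usize, // positions 00-04
    status: char,         // position 05
    record_type: char,    // position 06
    base_address: usize,  // positions 12-16
}

/// Parse selected positions of a 24-byte leader.
/// Returns `None` if the input is not 24 ASCII bytes or a numeric
/// position does not contain digits.
fn summarize_leader(leader: &[u8]) -> Option<LeaderSummary> {
    if leader.len() != 24 || !leader.is_ascii() {
        return None;
    }
    // Safe to slice by byte index because the input is pure ASCII.
    let text = std::str::from_utf8(leader).ok()?;
    Some(LeaderSummary {
        record_length: text[0..5].parse().ok()?,
        status: leader[5] as char,
        record_type: leader[6] as char,
        base_address: text[12..17].parse().ok()?,
    })
}
```

A viewer could print these values before listing the control and data fields.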
License
MIT OR Apache-2.0