biblib
A Rust library for parsing and deduplicating academic citations.
Installation

    [dependencies]
    biblib = "0.3.0"

For minimal builds:

    [dependencies]
    biblib = { version = "0.3.0", default-features = false, features = ["ris"] }

Supported Formats
| Format | Feature | Description |
|---|---|---|
| RIS | `ris` | Research Information Systems format |
| PubMed | `pubmed` | MEDLINE/PubMed `.nbib` files |
| EndNote XML | `xml` | EndNote XML export format |
| CSV | `csv` | Configurable delimited files |
All format features are enabled by default.
Quick Start
Parsing Citations
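
    // Assumes `RisParser` and the `CitationParser` trait are exported at the crate root.
    use biblib::{CitationParser, RisParser};

    let ris_content = r#"TY  - JOUR
    TI  - Machine Learning in Healthcare
    AU  - Smith, John
    AU  - Doe, Jane
    PY  - 2023
    ER  -"#;

    let parser = RisParser::new();
    let citations = parser.parse(ris_content).unwrap();

    // Field names as listed under "Citation Fields".
    println!("Title: {}", citations[0].title);
    println!("Author count: {}", citations[0].authors.len());
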
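
To make the tagged RIS input above less opaque, here is a minimal, std-only sketch of how such lines can be split into tag/value pairs. It is not biblib's implementation: `split_ris_line` and `collect_fields` are hypothetical helpers, and a real parser also has to handle continuation lines, multiple records, and tag-specific rules.

```rust
use std::collections::HashMap;

/// Splits one RIS line such as "TI  - Some title" into (tag, value).
/// Returns None when the line does not look like a tagged field.
fn split_ris_line(line: &str) -> Option<(&str, &str)> {
    let (tag, rest) = line.split_once('-')?;
    let tag = tag.trim();
    // Accept only two-character ASCII alphanumeric tags (TY, AU, PY, ...).
    if tag.len() != 2 || !tag.chars().all(|c| c.is_ascii_alphanumeric()) {
        return None;
    }
    Some((tag, rest.trim()))
}

/// Collects the fields of the first record in `input`.
/// Repeated tags such as AU accumulate in order of appearance.
fn collect_fields(input: &str) -> HashMap<String, Vec<String>> {
    let mut fields: HashMap<String, Vec<String>> = HashMap::new();
    for line in input.lines() {
        match split_ris_line(line) {
            Some(("ER", _)) => break, // end-of-record marker
            Some((tag, value)) => fields
                .entry(tag.to_string())
                .or_default()
                .push(value.to_string()),
            None => {} // skip blank or malformed lines
        }
    }
    fields
}

fn main() {
    let record = "TY  - JOUR\nTI  - Machine Learning in Healthcare\nAU  - Smith, John\nAU  - Doe, Jane\nPY  - 2023\nER  -";
    let fields = collect_fields(record);
    println!("title: {:?}", fields.get("TI"));
    println!("authors: {:?}", fields.get("AU"));
}
```

Keeping every tag as a list of values is a deliberate simplification: it lets repeated tags such as `AU` coexist with single-valued ones without special cases.
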
Auto-Detecting Format
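
    use biblib::detect_and_parse;

    let content = "TY  - JOUR\nTI  - Example\nER  -";
    // Assumes the result is a (citations, format) pair.
    let (citations, format) = detect_and_parse(content).unwrap();
    println!("Detected format: {}", format); // "RIS"
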
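
For intuition, format detection can be approximated by looking at how the input begins. The sketch below is an assumption-laden illustration, not the logic behind `detect_and_parse`: the `Format` enum and `guess_format` function are hypothetical, and the heuristics (a leading `<` for XML, `PMID` for PubMed, `TY` followed by a hyphen for RIS) are simplifications.

```rust
#[derive(Debug, PartialEq)]
enum Format {
    Ris,
    PubMed,
    EndNoteXml,
    Unknown,
}

/// Guesses the format from the first non-empty line of `content`.
fn guess_format(content: &str) -> Format {
    let first = content
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty())
        .unwrap_or("");

    if first.starts_with('<') {
        Format::EndNoteXml
    } else if first.starts_with("PMID") {
        Format::PubMed
    } else if first.starts_with("TY") && first[2..].trim_start().starts_with('-') {
        // "TY" is ASCII, so slicing at byte 2 is safe here.
        Format::Ris
    } else {
        Format::Unknown
    }
}

fn main() {
    let content = "TY  - JOUR\nTI  - Example\nER  -";
    println!("{:?}", guess_format(content));
}
```

Returning an explicit `Unknown` variant rather than defaulting to one format keeps ambiguous input (for example, CSV with arbitrary headers) from being silently misparsed.
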
Deduplicating Citations
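
    // Assumes the deduplication types live in a `dedupe` module.
    use biblib::dedupe::{Deduplicator, DeduplicatorConfig};

    let config = DeduplicatorConfig {
        // Set options here; see the Deduplication Guide for available fields.
        ..Default::default()
    };

    let deduplicator = Deduplicator::new().with_config(config);
    let groups = deduplicator.find_duplicates(&citations).unwrap();

    for group in groups {
        // Inspect each duplicate group here.
    }
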
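
As a rough mental model of duplicate grouping, the sketch below buckets records by year and a normalized title. It is a simplified stand-in, not biblib's matching algorithm: `Record`, `normalize_title`, and `find_duplicate_groups` are hypothetical names, and it uses exact matching on normalized titles where a real engine would more likely use similarity scoring.

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
struct Record {
    title: String,
    year: Option<i32>,
}

/// Lowercases the title and drops everything except letters and digits.
fn normalize_title(title: &str) -> String {
    title
        .chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(char::to_lowercase)
        .collect()
}

/// Groups records that share a (year, normalized title) key and returns
/// only the groups with more than one member, in key order.
fn find_duplicate_groups(records: &[Record]) -> Vec<Vec<&Record>> {
    let mut buckets: BTreeMap<(Option<i32>, String), Vec<&Record>> = BTreeMap::new();
    for record in records {
        buckets
            .entry((record.year, normalize_title(&record.title)))
            .or_default()
            .push(record);
    }
    buckets.into_values().filter(|group| group.len() > 1).collect()
}

fn main() {
    let records = vec![
        Record { title: "Machine Learning in Healthcare".to_string(), year: Some(2023) },
        Record { title: "machine learning in healthcare.".to_string(), year: Some(2023) },
        Record { title: "A Different Paper".to_string(), year: Some(2023) },
    ];
    for group in find_duplicate_groups(&records) {
        println!("{} records share a key:", group.len());
        for record in group {
            println!("  {} ({:?})", record.title, record.year);
        }
    }
}
```

Including the year in the key mirrors the idea of comparing only citations from the same year, which keeps the number of candidate pairs small.
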
CSV with Custom Headers
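
    // Assumes the CSV types live in a `csv` module.
    use biblib::csv::{CsvConfig, CsvParser};
    use biblib::CitationParser;

    // Illustrative values: a semicolon delimiter and custom column names.
    let csv_content = "Article Name;Writers\nExample Paper;Smith J";

    let mut config = CsvConfig::new();
    config
        .set_delimiter(b';')
        .set_header_mapping("title", vec!["Article Name".to_string()])
        .set_header_mapping("authors", vec!["Writers".to_string()]);

    let parser = CsvParser::with_config(config);
    let citations = parser.parse(csv_content).unwrap();
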
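
The idea behind header mapping can be sketched without the library: each canonical field gets a list of accepted column names, and the header row is scanned once to find the matching column index. `resolve_columns` and the alias table below are hypothetical, and the naive `split` does not handle quoted fields.

```rust
use std::collections::HashMap;

/// Maps each canonical field name to the index of the first column whose
/// header matches one of its aliases (case-insensitive, trimmed).
fn resolve_columns(
    header_line: &str,
    delimiter: char,
    aliases: &HashMap<&str, Vec<&str>>,
) -> HashMap<String, usize> {
    let headers: Vec<String> = header_line
        .split(delimiter)
        .map(|h| h.trim().to_lowercase())
        .collect();

    let mut columns = HashMap::new();
    for (field, names) in aliases {
        let hit = headers
            .iter()
            .position(|h| names.iter().any(|n| n.to_lowercase() == *h));
        if let Some(index) = hit {
            columns.insert(field.to_string(), index);
        }
    }
    columns
}

fn main() {
    let mut aliases = HashMap::new();
    aliases.insert("title", vec!["Article Name"]);
    aliases.insert("authors", vec!["Writers"]);

    let csv = "Article Name;Writers\nExample Paper;Smith J";
    let mut lines = csv.lines();
    let columns = resolve_columns(lines.next().unwrap_or(""), ';', &aliases);

    for row in lines {
        let cells: Vec<&str> = row.split(';').collect();
        if let Some(&i) = columns.get("title") {
            println!("title: {}", cells.get(i).copied().unwrap_or(""));
        }
    }
}
```

Resolving indices once from the header row means each data row is read by position, so unmapped columns are simply ignored.
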
Citation Fields
Each parsed citation contains:
| Field | Type | Description |
|---|---|---|
| `title` | `String` | Work title |
| `authors` | `Vec<Author>` | Authors with name, given name, affiliations |
| `journal` | `Option<String>` | Full journal name |
| `journal_abbr` | `Option<String>` | Journal abbreviation |
| `date` | `Option<Date>` | Year, month, day |
| `volume` | `Option<String>` | Volume number |
| `issue` | `Option<String>` | Issue number |
| `pages` | `Option<String>` | Page range |
| `doi` | `Option<String>` | Digital Object Identifier |
| `pmid` | `Option<String>` | PubMed ID |
| `pmc_id` | `Option<String>` | PubMed Central ID |
| `issn` | `Vec<String>` | ISSNs |
| `abstract_text` | `Option<String>` | Abstract |
| `keywords` | `Vec<String>` | Keywords |
| `urls` | `Vec<String>` | Related URLs |
| `mesh_terms` | `Vec<String>` | MeSH terms (PubMed) |
| `extra_fields` | `HashMap` | Additional format-specific fields |
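
The table above translates naturally into plain Rust structs. The sketch below mirrors a subset of the listed fields to show how the `Option` and `Vec` types are typically consumed. It is a stand-in rather than the crate's definitions: the inner layout of `Author` and `Date`, the `HashMap` type parameters, and the `short_reference` helper are all assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Default)]
struct Author {
    name: String,
    given_name: Option<String>,
    affiliations: Vec<String>,
}

#[derive(Debug, Clone, Default)]
struct Date {
    year: i32,
    month: Option<u8>,
    day: Option<u8>,
}

#[derive(Debug, Clone, Default)]
struct Citation {
    title: String,
    authors: Vec<Author>,
    journal: Option<String>,
    date: Option<Date>,
    doi: Option<String>,
    keywords: Vec<String>,
    extra_fields: HashMap<String, Vec<String>>,
}

/// Builds a short "Author (Year). Title" label, with fallbacks for
/// citations that lack authors or a date.
fn short_reference(c: &Citation) -> String {
    let author = c.authors.first().map_or("Unknown", |a| a.name.as_str());
    let year = c
        .date
        .as_ref()
        .map_or_else(|| "n.d.".to_string(), |d| d.year.to_string());
    format!("{} ({}). {}", author, year, c.title)
}

fn main() {
    let citation = Citation {
        title: "Machine Learning in Healthcare".to_string(),
        authors: vec![Author {
            name: "Smith".to_string(),
            given_name: Some("John".to_string()),
            ..Default::default()
        }],
        date: Some(Date { year: 2023, ..Default::default() }),
        ..Default::default()
    };
    println!("{}", short_reference(&citation));
}
```

Only `title` is a plain `String`; every other scalar field is optional, so code reading citations should expect missing values rather than unwrap them.
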
Features
| Feature | Dependencies | Description |
|---|---|---|
| `ris` | - | RIS format parser |
| `pubmed` | - | PubMed/MEDLINE parser |
| `xml` | quick-xml | EndNote XML parser |
| `csv` | csv | CSV parser |
| `dedupe` | rayon, strsim | Deduplication engine |
| `regex` | regex | Full regex support |
| `lite` | regex-lite | Lightweight regex (smaller binary) |
Default: all features enabled except lite.
Documentation
- Parsing Guide — Format-specific tag mappings, date formats, and author handling
- Deduplication Guide — Matching algorithm, similarity thresholds, and configuration
- API Docs — Complete API reference
License
Licensed under either of Apache License 2.0 or MIT at your option.