bibtex-parser
Fast BibTeX parsing for Rust applications that need both throughput and a real user-facing API.
bibtex-parser parses strict BibTeX at high speed, expands string variables and month constants, exposes top-level block order, provides ergonomic query/edit helpers, and writes BibTeX back out with configurable formatting.
Features
- High-throughput single-file parsing with borrowed values where possible.
- Strict parsing by default, with explicit tolerant recovery for malformed corpora.
- String definitions, concatenation, month constants, preambles, comments, and ordered blocks.
- Source-span capture for entries and recovered failures when requested.
- Case-insensitive lookup, duplicate detection, DOI normalization, field normalization, sorting, and validation.
- Structured author/editor name parsing and typed entry helpers.
- Configurable writer for serializing, formatting, sorting, and writing files.
- Optional `parallel` feature for parsing multiple files concurrently.
- Optional `latex_to_unicode` feature for LaTeX accent conversion helpers.
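To make the DOI normalization item concrete, here is a minimal standalone sketch of what such a step might do. It does not use bibtex-parser and is not its implementation: `normalize_doi` is a hypothetical helper, and the prefix list and the lowercase rule are assumptions chosen for illustration.

```rust
/// Hypothetical helper (not part of bibtex-parser): reduce a DOI written as a
/// URL or with a `doi:` prefix to a bare, lowercase identifier.
/// Returns `None` when the remainder does not look like a DOI.
fn normalize_doi(raw: &str) -> Option<String> {
    // DOIs are compared case-insensitively, so lowercase everything up front.
    let lower = raw.trim().to_ascii_lowercase();

    // Assumed set of common prefixes; a real normalizer may accept more.
    const PREFIXES: [&str; 5] = [
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    ];

    let mut rest: &str = &lower;
    for prefix in PREFIXES.iter() {
        if let Some(stripped) = rest.strip_prefix(*prefix) {
            rest = stripped.trim_start();
            break;
        }
    }

    // Minimal shape check: a "10." directory indicator and a "/" before the suffix.
    if rest.starts_with("10.") && rest.contains('/') {
        Some(rest.to_string())
    } else {
        None
    }
}
```

Two entries whose DOI fields normalize to the same string can then be treated as duplicate candidates, regardless of how each file spelled the DOI.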
Install
[dependencies]
bibtex-parser = "0.1"
Parse
use ;
Query And Edit
use ;
Tolerant Parsing
Strict parsing is the default. Tolerant parsing is opt-in and keeps malformed blocks separate from valid entries.
use ;
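The idea of keeping malformed blocks separate from valid ones can be illustrated with a small standalone sketch. It does not call bibtex-parser and is not how the crate recovers: `Block` and `split_tolerant` are hypothetical names, and the recovery rule (resume at the next `@` that begins a line) is an assumed heuristic.

```rust
/// Outcome of scanning one top-level `@...{...}` block (hypothetical type).
#[derive(Debug, PartialEq)]
enum Block<'a> {
    /// Braces balanced; the slice covers `@` through the closing `}`.
    Valid(&'a str),
    /// Braces never balanced before the next block start or end of input.
    Failed(&'a str),
}

/// Split `src` into blocks that each start at an `@`.
/// Assumed heuristic: an `@` at the start of a line inside an unfinished
/// block means the previous block is broken and a new one begins.
fn split_tolerant(src: &str) -> Vec<Block<'_>> {
    let bytes = src.as_bytes();
    let mut blocks = Vec::new();
    let mut i = 0;

    while i < bytes.len() {
        if bytes[i] != b'@' {
            i += 1;
            continue;
        }
        let start = i;
        let mut depth = 0usize;
        let mut end = None;
        i += 1;

        while i < bytes.len() {
            match bytes[i] {
                b'{' => depth += 1,
                b'}' if depth > 0 => {
                    depth -= 1;
                    if depth == 0 {
                        end = Some(i + 1);
                        break;
                    }
                }
                // Recovery point: leave `i` on this `@` so the outer loop restarts here.
                b'@' if bytes[i - 1] == b'\n' => break,
                _ => {}
            }
            i += 1;
        }

        match end {
            Some(e) => {
                blocks.push(Block::Valid(&src[start..e]));
                i = e;
            }
            None => blocks.push(Block::Failed(src[start..i].trim_end())),
        }
    }
    blocks
}
```

In this sketch a broken block costs only itself: scanning resumes at the next block start, so later valid entries are still returned. A real parser also has to handle quoted values and `@` characters inside field text, which this heuristic ignores.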
Write
use ;
For simple cases:
let bibtex = library.to_bibtex?;
library.write_file?;
Feature Flags
[dependencies]
bibtex-parser = { version = "0.1", features = ["parallel", "latex_to_unicode"] }
- `parallel`: enables Rayon-backed `Parser::parse_files` for multi-file workloads. Single-file parsing remains sequential.
- `latex_to_unicode`: enables LaTeX accent-to-Unicode conversion helpers.
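The multi-file model, where each input is handled on its own and results come back in input order, can be sketched with the standard library alone. This is not the crate's `Parser::parse_files` and does not use Rayon: scoped threads are swapped in to keep the example dependency-free, and `count_blocks` is a hypothetical stand-in for a real single-file parse.

```rust
use std::thread;

/// Hypothetical stand-in for parsing one file: counts lines that begin a block.
fn count_blocks(src: &str) -> usize {
    src.lines()
        .filter(|line| line.trim_start().starts_with('@'))
        .count()
}

/// Process every input on its own scoped thread.
/// Each input is still scanned sequentially; only the inputs run concurrently.
fn count_blocks_parallel(inputs: &[&str]) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn all workers first so they can run at the same time.
        let handles: Vec<_> = inputs
            .iter()
            .map(|src| scope.spawn(move || count_blocks(src)))
            .collect();

        // Join in spawn order, so results line up with `inputs`.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

One thread per input is fine for a handful of files. A work-stealing pool such as Rayon suits large batches better because it caps the number of threads.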
Semantics
- `Library::parse` and `Parser::parse` are strict by default and return an error on malformed BibTeX.
- `Parser::tolerant()` recovers valid blocks after malformed input and records failures in `Library::failed_blocks()`.
- String definitions and concatenations are expanded for the `Library` API. Use `parse_bibtex` when you need raw parsed items.
- Comments, preambles, strings, entries, and tolerant failures are available through `Library::blocks()` in source order.
- Writer defaults preserve library block order. Sorting and alignment are explicit formatting choices.
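To show what expanding string definitions and concatenations means, here is a minimal standalone sketch. It does not use bibtex-parser: `Part`, `expand`, and `month_constant` are hypothetical names, and mapping `jan` to `January` follows the convention of the standard BibTeX styles, which is an assumption about the output rather than a statement of what this crate produces.

```rust
use std::collections::HashMap;

/// One piece of a field value such as `acm # " Press"` (hypothetical type).
enum Part<'a> {
    /// A quoted or braced literal, already stripped of its delimiters.
    Literal(&'a str),
    /// A bare name that refers to an `@string` definition or a month constant.
    Variable(&'a str),
}

/// Assumed month table: three-letter constants map to English month names.
fn month_constant(name: &str) -> Option<&'static str> {
    Some(match name {
        "jan" => "January",
        "feb" => "February",
        "mar" => "March",
        "apr" => "April",
        "may" => "May",
        "jun" => "June",
        "jul" => "July",
        "aug" => "August",
        "sep" => "September",
        "oct" => "October",
        "nov" => "November",
        "dec" => "December",
        _ => return None,
    })
}

/// Concatenate `parts`, replacing each variable with its definition.
/// `strings` is expected to hold lowercase keys; user definitions win over
/// month constants. Names are matched case-insensitively.
fn expand(parts: &[Part<'_>], strings: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::new();
    for part in parts {
        match part {
            Part::Literal(text) => out.push_str(text),
            Part::Variable(name) => {
                let key = name.to_ascii_lowercase();
                let value = strings
                    .get(&key)
                    .map(String::as_str)
                    .or_else(|| month_constant(&key));
                match value {
                    Some(v) => out.push_str(v),
                    None => return Err(format!("undefined string: {name}")),
                }
            }
        }
    }
    Ok(out)
}
```

With `acm` defined as `ACM`, the value `acm # " Press"` expands to `ACM Press` in this sketch. An undefined name is reported as an error rather than silently dropped, which mirrors a strict-by-default stance.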
Performance
The repository includes Criterion benchmarks for parser throughput, common library operations, and memory-oriented workloads. Exact numbers depend on CPU, compiler, governor, and thermal state, so measure on the machine that matters for your workload.
License
Licensed under either of Apache-2.0 or MIT, at your option.