Module mbox_parser


Parser for mbox-format git patch content.

This module is a fallback path, used only when a nostr patch event is missing one or more of its optional tags (author, committer, description, parent-commit). When those tags are present they always take precedence; see crate::git::RepoActions::apply_patch_chain.

§Why hand-rolled rather than a library?

Neither libgit2 (via the git2 crate) nor gitoxide (gix) exposes a mailinfo-style parser. libgit2’s email API is output-only (git_email_create_from_commit); there is no git_mailinfo equivalent. The gitoxide monorepo has no gix-patch crate, not even as a placeholder. No production-quality standalone Rust mbox/git-patch parser crate exists.

The genuinely hard parts of RFC 2822 parsing (header folding, RFC 2047 MIME encoded-words for non-ASCII author names and subjects) are delegated to the mailparse crate. The git-specific overlay (mbox envelope line, [PATCH] prefix stripping, commit-message body extraction up to the --- diffstat separator) is implemented here, matching the behaviour of git am’s patchbreak() function in mailinfo.c.
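As an illustration of the git-specific overlay, the [PATCH] prefix stripping can be sketched as below. This is a minimal sketch, not the module's actual code: `strip_patch_prefix` is a hypothetical helper name, it assumes the Subject header has already been unfolded and decoded, and it only drops leading bracketed tags that contain the word PATCH.

```rust
/// Hypothetical helper (not part of this module's API): remove leading
/// bracketed tags such as "[PATCH]" or "[PATCH v2 1/3]" from a Subject
/// value that has already been unfolded and MIME-decoded.
fn strip_patch_prefix(subject: &str) -> &str {
    let mut rest = subject.trim();
    // Keep peeling off "[...]" groups for as long as they mention PATCH.
    while rest.starts_with('[') {
        match rest.find(']') {
            Some(end) if rest[1..end].contains("PATCH") => {
                rest = rest[end + 1..].trim_start();
            }
            // Unclosed bracket, or a tag unrelated to PATCH: leave it alone.
            _ => break,
        }
    }
    rest
}
```

Brackets that do not mention PATCH are left in place here, on the assumption that they may be a meaningful part of the commit summary.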

§If edge cases are reported

If real-world patches produce incorrect metadata through this parser, the escape hatch is to shell out to git mailinfo directly:

git mailinfo /tmp/msg /tmp/patch < input.patch

This prints Author:, Email:, Subject:, Date: to stdout, writes the commit body to /tmp/msg and writes the diff to /tmp/patch. Since ngit already requires git in PATH (it is a git plugin), this is always available. It is not the primary approach because it requires two temp files and a process spawn per patch. That cost is acceptable but unnecessary, given that most patches in the ngit pr/ flow will have the optional nostr tags and never reach this code.
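If the escape hatch is ever needed, it could look roughly like the sketch below. Both function names are hypothetical and nothing here exists in the module today; the caller is assumed to supply the two temp-file paths. It uses only std::process::Command and the git mailinfo invocation shown above.

```rust
use std::collections::HashMap;
use std::io::{self, Write};
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical wrapper: feed one patch to `git mailinfo <msg> <patch>`
/// on stdin and return whatever it printed to stdout.
fn run_git_mailinfo(input: &[u8], msg_path: &Path, patch_path: &Path) -> io::Result<String> {
    let mut child = Command::new("git")
        .arg("mailinfo")
        .arg(msg_path)
        .arg(patch_path)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // The temporary stdin handle is dropped at the end of this statement,
    // which closes the pipe so git sees end-of-input.
    child.stdin.take().expect("stdin was piped").write_all(input)?;
    let output = child.wait_with_output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git mailinfo failed"));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Hypothetical parser for the "Key: value" lines printed on stdout.
fn parse_mailinfo_stdout(stdout: &str) -> HashMap<String, String> {
    stdout
        .lines()
        .filter_map(|line| line.split_once(": "))
        .map(|(key, value)| (key.to_string(), value.trim().to_string()))
        .collect()
}
```

The commit message would then be read back from the file at `msg_path`. Keeping the stdout parsing separate from the process spawn lets the parsing be unit-tested without git installed.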

§Known limitation: --- in commit message body

The --- line that separates the commit message from the diffstat is ambiguous when the commit message itself contains --- (e.g. Markdown horizontal rules). This parser stops at the first line consisting solely of ---, matching git am's own behaviour: git am has the same limitation and documents it. This is not a bug we can fix without lookahead into the diff structure.
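The limitation is easiest to see in a simplified sketch of the split. `split_at_separator` is a hypothetical name, and the check is deliberately reduced to "a line that is exactly ---, ignoring trailing whitespace"; git's own patchbreak() recognises more patterns than this.

```rust
/// Hypothetical helper: split an email body into the commit message and
/// everything after the first line that consists solely of "---".
fn split_at_separator(body: &str) -> (String, Option<String>) {
    let mut message: Vec<&str> = Vec::new();
    let mut lines = body.lines();
    while let Some(line) = lines.next() {
        if line.trim_end() == "---" {
            // First separator wins; later "---" lines stay in the remainder.
            let rest: Vec<&str> = lines.by_ref().collect();
            return (message.join("\n"), Some(rest.join("\n")));
        }
        message.push(line);
    }
    // No separator found: the whole body is treated as the message.
    (message.join("\n"), None)
}
```

With a body of "Fix thing", a blank line, "Before", "---", "After rule", "---" and a diffstat line, the returned message ends at "Before", and "After rule" lands in the remainder even though the author meant it as part of the commit message.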

§Commit ID from mbox envelope

The SHA1 in the mbox From <sha1> <date> envelope line is extracted but must not be assumed correct. libgit2 generates this ID from the commit object, but if the original commit was GPG-signed, or if the patch was generated by a different tool, the reconstructed commit (applied via apply_to_tree + commit_create_buffer) will have a different OID. The commit nostr tag is the authoritative source for commit identity when present.
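A minimal sketch of the extraction and precedence rule described above. Both helper names are hypothetical, and the envelope is assumed to carry a 40-character hex SHA1 as its first token after "From ".

```rust
/// Hypothetical helper: pull the commit ID out of an mbox envelope line
/// of the form "From <sha1> <date>". Returns None for anything else,
/// including the "From:" header line.
fn parse_envelope_commit_id(first_line: &str) -> Option<&str> {
    let rest = first_line.strip_prefix("From ")?;
    let candidate = rest.split_whitespace().next()?;
    let is_sha1 = candidate.len() == 40 && candidate.bytes().all(|b| b.is_ascii_hexdigit());
    if is_sha1 {
        Some(candidate)
    } else {
        None
    }
}

/// Hypothetical helper: the nostr `commit` tag is authoritative when present;
/// the envelope ID is only a hint to fall back on.
fn resolve_commit_id<'a>(tag: Option<&'a str>, envelope: Option<&'a str>) -> Option<&'a str> {
    tag.or(envelope)
}
```

Even when the envelope ID is the only one available, it should be compared against the OID of the reconstructed commit rather than trusted.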

Structs§

PatchMetadata

Functions§

extract_description_from_patch
parse_mbox_patch