pub fn nullable_msg_list(input: &[u8]) -> IResult<&[u8], MessageIDList<'_>>
A very lenient parser for lists of msg-id, as used by the In-Reply-To and References headers.
The RFC definition is:
in-reply-to = 1*msg-id
obs-in-reply-to = *(phrase / msg-id)
In the obs- syntax, the phrase tokens must be ignored.
However, historical emails often contain a lot of nonsense between msg-ids, and much of it is not valid “phrase” syntax. We therefore implement a more lenient parser that skips everything between msg-ids: quoted strings and encoded words (both part of the phrase syntax) and, as a last resort, any bytes up to something that could be the start of one of the more “structured” tokens (msg-id, encoded word, quoted string).
Additionally, we try to recover from broken msg-ids: after reading a ‘<’, if we can’t parse a valid msg-id, we skip to the next ‘>’ and continue parsing.
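The skipping and recovery strategy described above can be illustrated with a small, dependency-free sketch. This is not the implementation behind `nullable_msg_list`. The names `try_msg_id`, `skip_quoted`, and `lenient_msg_ids` are hypothetical, the msg-id grammar is reduced to `<left@right>` with no whitespace inside, and encoded-word handling is left out for brevity.

```rust
/// Hypothetical, simplified msg-id: `<left@right>` with non-empty sides and
/// no whitespace or nested '<' inside. The real grammar is richer.
fn try_msg_id(input: &[u8]) -> Option<(&[u8], (&[u8], &[u8]))> {
    let (&first, rest) = input.split_first()?;
    if first != b'<' {
        return None;
    }
    let end = rest.iter().position(|&b| b == b'>')?;
    let body = &rest[..end];
    if body.iter().any(|&b| b.is_ascii_whitespace() || b == b'<') {
        return None;
    }
    let at = body.iter().position(|&b| b == b'@')?;
    let (left, right) = (&body[..at], &body[at + 1..]);
    if left.is_empty() || right.is_empty() || right.contains(&b'@') {
        return None;
    }
    Some((&rest[end + 1..], (left, right)))
}

/// Skip a quoted string starting at `"`, honoring backslash escapes.
/// An unterminated quoted string consumes the rest of the input.
fn skip_quoted(input: &[u8]) -> &[u8] {
    let mut i = 1;
    while i < input.len() {
        match input[i] {
            b'\\' => i += 2,
            b'"' => return &input[i + 1..],
            _ => i += 1,
        }
    }
    &input[input.len()..]
}

/// Collect every msg-id that can be parsed, skipping everything else.
fn lenient_msg_ids(mut input: &[u8]) -> Vec<(&[u8], &[u8])> {
    let mut ids = Vec::new();
    while let Some(&b) = input.first() {
        match b {
            b'<' => match try_msg_id(input) {
                Some((rest, id)) => {
                    ids.push(id);
                    input = rest;
                }
                None => {
                    // Broken msg-id: resume just after the next '>' (or stop).
                    let skip = input
                        .iter()
                        .position(|&c| c == b'>')
                        .map_or(input.len(), |i| i + 1);
                    input = &input[skip..];
                }
            },
            b'"' => input = skip_quoted(input),
            _ => {
                // Last resort: drop at least one byte, then everything up to
                // the next byte that could start a structured token.
                let skip = input[1..]
                    .iter()
                    .position(|&c| c == b'<' || c == b'"')
                    .map_or(input.len(), |i| i + 1);
                input = &input[skip..];
            }
        }
    }
    ids
}
```

In this sketch, a `<` inside a quoted string is never treated as the start of a msg-id, and a malformed `<...>` group is dropped as a whole, so one broken entry does not prevent later msg-ids from being collected.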