Git commit encoding labels (encoding header, i18n.commitEncoding) mapped to codecs.
Git’s ISO-8859-1 is strict Latin-1; encoding_rs maps that label to Windows-1252, so we
handle Latin-1 separately.
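To illustrate why strict Latin-1 needs its own path, here is a minimal sketch that assumes nothing about this module's internals. The helper names `latin1_to_string` and `string_to_latin1` are hypothetical and are not part of this module's API. The sketch relies only on the fact that ISO-8859-1 maps each byte to the Unicode code point with the same value.

```rust
/// Decode strict ISO-8859-1: each byte becomes the code point of the same value.
/// Hypothetical helper for illustration, not this module's actual function.
fn latin1_to_string(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Encode to strict ISO-8859-1, returning None if any character is above U+00FF.
/// Hypothetical helper for illustration, not this module's actual function.
fn string_to_latin1(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| {
            let cp = c as u32;
            // Only U+0000..=U+00FF are representable in Latin-1.
            if cp <= 0xFF { Some(cp as u8) } else { None }
        })
        .collect()
}
```

With these helpers, `latin1_to_string(&[0x80])` yields the C1 control character U+0080. A Windows-1252 decoder would instead produce U+20AC (the euro sign) for the same byte, which is the difference a separate Latin-1 path preserves.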
Functions
- commit_message_unicode_for_display - Unicode commit message body for display (for example, format-patch).
- decode_bytes - Decode `bytes` using Git’s encoding name, or lossy UTF-8 if unknown.
- decode_rfc2047_mailbox_from_line - Decode `=?charset?q?...?=` encoded-words in an email display name (before `<`).
- encode_header_text - Encode a single header field (author/committer line) without adding a trailing newline.
- encode_unicode - Encode `unicode` for storage in a commit message body using Git’s encoding name.
- ensure_body_trailing_newline - Git stores the commit message body with a trailing newline when non-empty.
- finalize_stored_commit_message - Prepare a commit message for storage per `i18n.commitEncoding` (or equivalent).
- find_invalid_utf8 - Find the offset of the first byte that is not part of a strictly valid UTF-8 sequence, mirroring Git’s `find_invalid_utf8` (commit.c).
- identity_raw_for_serialized_commit - Raw `author`/`committer` header payloads for a new commit object.
- is_known_encoding - Whether `label` names an encoding Git can decode (ISO-8859-1 or any encoding resolvable via `resolve`). Unknown names (e.g. the test’s `non-utf-8`) return false, matching Git’s `logmsg_reencode` no-op fallback.
- is_strict_utf8 - Whether `buf` is strictly valid UTF-8 per Git’s rules (see `find_invalid_utf8`).
- reencode_utf8_to_label - Re-encode `unicode` from UTF-8 into `output_label`, or `None` if unsupported.
- resolve - Resolve an encoding label the way Git uses it in config and commit objects.
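
The idea behind find_invalid_utf8 and is_strict_utf8 (report the offset of the first invalid byte, and treat "no such offset" as valid) can be sketched with the standard library alone. The names `first_invalid_utf8_offset` and `is_valid_utf8` below are hypothetical, and the return type is an assumption; this page does not show the real signatures. The sketch also uses only `std::str::from_utf8`, so it does not reproduce any extra rules Git applies beyond standard UTF-8 well-formedness.

```rust
/// Offset of the first byte that is not part of a valid UTF-8 sequence,
/// or None if the whole buffer is valid.
/// Hypothetical helper: relies on std validation, not on Git's exact rules.
fn first_invalid_utf8_offset(buf: &[u8]) -> Option<usize> {
    match std::str::from_utf8(buf) {
        Ok(_) => None,
        // valid_up_to() is the length of the longest valid prefix,
        // which is also the index of the first offending byte.
        Err(err) => Some(err.valid_up_to()),
    }
}

/// Convenience wrapper: valid exactly when no invalid offset exists.
fn is_valid_utf8(buf: &[u8]) -> bool {
    first_invalid_utf8_offset(buf).is_none()
}
```

A caller might use the offset to decide whether a commit message can be stored as UTF-8 unchanged or needs to go through one of the decoding paths first.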