Rust Line Endings
| OS | Status |
|---|---|
| Ubuntu-latest | |
| macOS-latest | |
| Windows-latest | |
A Rust crate to detect, normalize, and convert line endings across platforms, including support for character streams. Ensures consistent handling of LF, CRLF, and CR line endings in text processing.
Install
Usage
Split into Multiple Strings
Split a string into a vector of strings using the auto-detected line ending parsed from the string.
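use line_ending::LineEnding;
let crlf = LineEnding::split("first\r\nsecond\r\nthird");
let cr = LineEnding::split("first\rsecond\rthird");
let lf = LineEnding::split("first\nsecond\nthird");
let expected = vec!["first", "second", "third"];
assert_eq!(crlf, expected);
assert_eq!(cr, expected);
assert_eq!(lf, expected);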
Join Multiple Strings into a Single String
Join a vector of strings using the specified line ending.
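use line_ending::LineEnding;
let lines = vec![
    "first".to_string(),
    "second".to_string(),
    "third".to_string(),
];
assert_eq!(LineEnding::CRLF.join(lines.clone()), "first\r\nsecond\r\nthird");
assert_eq!(LineEnding::CR.join(lines.clone()), "first\rsecond\rthird");
assert_eq!(LineEnding::LF.join(lines.clone()), "first\nsecond\nthird");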
Change Line Ending Type
Apply a specific line ending type to an existing string.
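use line_ending::LineEnding;
let mixed_text = "first line\r\nsecond line\rthird line\nfourth line\n";
assert_eq!(
    LineEnding::CRLF.apply(mixed_text),
    "first line\r\nsecond line\r\nthird line\r\nfourth line\r\n"
);
assert_eq!(
    LineEnding::CR.apply(mixed_text),
    "first line\rsecond line\rthird line\rfourth line\r"
);
assert_eq!(
    LineEnding::LF.apply(mixed_text),
    "first line\nsecond line\nthird line\nfourth line\n"
);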
Auto-identify Line Ending Type
Detect the predominant line ending style used in the input string.
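use line_ending::LineEnding;
let crlf = "first line\r\nsecond line\r\nthird line";
let cr = "first line\rsecond line\rthird line";
let lf = "first line\nsecond line\nthird line";
assert_eq!(LineEnding::from(crlf), LineEnding::CRLF);
assert_eq!(LineEnding::from(cr), LineEnding::CR);
assert_eq!(LineEnding::from(lf), LineEnding::LF);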
Normalize
Convert all line endings in a string to LF (\n) for consistent processing.
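use line_ending::LineEnding;
let crlf = "first\r\nsecond\r\nthird";
let cr = "first\rsecond\rthird";
let lf = "first\nsecond\nthird";
assert_eq!(LineEnding::normalize(crlf), lf);
assert_eq!(LineEnding::normalize(cr), lf);
assert_eq!(LineEnding::normalize(lf), lf);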
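For intuition, normalization to LF can be sketched with the standard library alone. This is not the crate's implementation; `normalize_to_lf` is a hypothetical helper name used only for illustration.

```rust
/// Hypothetical helper (not part of the crate): converts CRLF and lone CR to LF.
fn normalize_to_lf(s: &str) -> String {
    // Replace CRLF first so that its '\r' is not converted on its own,
    // which would turn one line break into two.
    s.replace("\r\n", "\n").replace('\r', "\n")
}
```

The order of the two replacements matters: handling a lone `\r` first would turn each `\r\n` into `\n\n`.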
Denormalize
Restore line endings in a string to the specified type.
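use line_ending::LineEnding;
let lf = "first\nsecond\nthird";
let crlf_restored = LineEnding::CRLF.denormalize(lf);
let cr_restored = LineEnding::CR.denormalize(lf);
let lf_restored = LineEnding::LF.denormalize(lf);
assert_eq!(crlf_restored, "first\r\nsecond\r\nthird");
assert_eq!(cr_restored, "first\rsecond\rthird");
assert_eq!(lf_restored, "first\nsecond\nthird");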
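The reverse direction can be sketched the same way with the standard library. The helper below is hypothetical (its name and signature are not the crate's), and it assumes the input already uses LF only.

```rust
/// Hypothetical helper (not part of the crate): rewrites every '\n' as `ending`.
/// Assumes `s` is already LF-normalized. A "\r\n" in the input would keep its
/// '\r' and come out as '\r' followed by `ending`.
fn denormalize_with(s: &str, ending: &str) -> String {
    s.replace('\n', ending)
}
```

Because of that assumption, a safe pattern is to normalize first and only then convert to the target line ending.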
Handling Mixed-Type Line Endings
When a string contains multiple types of line endings (LF, CRLF, and CR), the LineEnding::from method will detect the most frequent line ending type and return it as the dominant one. This ensures a consistent approach to mixed-line-ending detection.
use line_ending::LineEnding;
let mixed_type = "line1\nline2\r\nline3\nline4\nline5\r\n";
assert_eq!(LineEnding::from(mixed_type), LineEnding::LF); // `LF` is the most common
The detection algorithm works as follows:
- Counts occurrences of each line ending type (LF, CRLF, CR).
- Selects the most frequent one as the detected line ending.
- Defaults to CRLF if all are equally present or if the input is empty.
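The steps above can be sketched with the standard library alone. This is an illustration, not the crate's source: `Ending`, `count_endings`, and `detect` are hypothetical names, and the handling of two-way ties (CR before LF) is an assumption extrapolated from the CRLF > CR > LF ordering mentioned in the edge cases.

```rust
/// Hypothetical stand-in for the crate's line ending type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ending {
    Lf,
    Crlf,
    Cr,
}

/// Counts line endings, treating "\r\n" as one CRLF rather than a CR plus an LF.
/// Returns the counts as (lf, crlf, cr).
fn count_endings(s: &str) -> (usize, usize, usize) {
    let (mut lf, mut crlf, mut cr) = (0, 0, 0);
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\r' => {
                if chars.peek() == Some(&'\n') {
                    chars.next(); // consume the '\n' half of the CRLF pair
                    crlf += 1;
                } else {
                    cr += 1;
                }
            }
            '\n' => lf += 1,
            _ => {}
        }
    }
    (lf, crlf, cr)
}

/// Picks the most frequent ending. Ties resolve as CRLF, then CR, then LF
/// (assumed ordering), so empty input yields CRLF.
fn detect(s: &str) -> Ending {
    let (lf, crlf, cr) = count_endings(s);
    if crlf >= cr && crlf >= lf {
        Ending::Crlf
    } else if cr >= lf {
        Ending::Cr
    } else {
        Ending::Lf
    }
}
```

Counting `\r\n` as a single unit is the important design choice here: counting characters independently would inflate both the CR and LF totals for every Windows-style line break.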
Handling Character Streams
When processing text from a stream (for example, when reading from a file), you often work with a Peekable iterator over characters. Manually checking for a newline (such as '\n') isn’t enough to handle all platforms, because Windows uses a two‑character sequence (\r\n) and some older systems use just \r.
This crate provides a trait extension (via the PeekableLineEndingExt trait) that adds a consume_line_ending() method to a Peekable<Chars> iterator. This method automatically detects and consumes the full line break sequence (whether it’s LF, CR, or CRLF) from the stream.
The following example demonstrates how to split a character stream into lines without having to manually handle each line-ending case:
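use line_ending::{LineEnding, PeekableLineEndingExt};
let text = "line1\r\nline2\nline3\rline4";
let mut it = text.chars().peekable();
let mut lines = Vec::new();
let mut current_line = String::new();
while it.peek().is_some() {
    // consume_line_ending() consumes a full line break (\r, \n, or \r\n) if one is next.
    if it.consume_line_ending().is_some() {
        lines.push(current_line);
        current_line = String::new();
    } else {
        current_line.push(it.next().unwrap());
    }
}
lines.push(current_line);
assert_eq!(lines, vec!["line1", "line2", "line3", "line4"]);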
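To make the idea concrete, here is a std-only sketch of what consuming a line break from a `Peekable<Chars>` involves. It is not the crate's trait implementation: `consume_break` and `split_stream` are hypothetical free functions written for illustration.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical free function (not the crate's trait method): if the iterator
/// is positioned at a line break, consumes it and returns the text consumed.
fn consume_break(it: &mut Peekable<Chars<'_>>) -> Option<&'static str> {
    match it.peek().copied() {
        Some('\n') => {
            it.next();
            Some("\n")
        }
        Some('\r') => {
            it.next();
            // A '\n' directly after '\r' belongs to the same CRLF break.
            if it.peek() == Some(&'\n') {
                it.next();
                Some("\r\n")
            } else {
                Some("\r")
            }
        }
        // Not at a line break: leave the iterator untouched.
        _ => None,
    }
}

/// Splits a character stream into lines using `consume_break`.
fn split_stream(text: &str) -> Vec<String> {
    let mut it = text.chars().peekable();
    let mut lines = Vec::new();
    let mut current = String::new();
    while it.peek().is_some() {
        if consume_break(&mut it).is_some() {
            // Hand the finished line over and start a fresh one.
            lines.push(std::mem::take(&mut current));
        } else if let Some(c) = it.next() {
            current.push(c);
        }
    }
    lines.push(current);
    lines
}
```

The one-character lookahead is what makes `Peekable` a good fit: after seeing `\r`, the code needs to check the next character without committing to consuming it.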
Edge Cases & Examples
Case 1: One Line Ending Type is Clearly Dominant
use line_ending::LineEnding;
let mostly_crlf = "line1\r\nline2\r\nline3\nline4\r\nline5\r\n";
assert_eq!(LineEnding::from(mostly_crlf), LineEnding::CRLF); // `CRLF` is the most common
let mostly_cr = "line1\rline2\rline3\nline4\rline5\r";
assert_eq!(LineEnding::from(mostly_cr), LineEnding::CR); // `CR` is the most common
Case 2: All Line Endings Appear Equally
If LF, CRLF, and CR all appear the same number of times, the function will return CRLF as a tie-breaker.
use line_ending::LineEnding;
let equal_mixed = "line1\r\nline2\nline3\rline4\r\nline5\nline6\r";
assert_eq!(LineEnding::from(equal_mixed), LineEnding::CRLF); // `CRLF` > `CR` > `LF`
CRLF is chosen as a tie-breaker because it represents both CR and LF, making it the most inclusive option.
Case 3: Single Line Containing Multiple Line Endings
If a single line contains different line endings, the function still chooses the most frequent across the entire string.
use line_ending::LineEnding;
let mixed_on_one_line = "line1\r\nline2\rline3\r\nline4\r\nline5\r";
assert_eq!(LineEnding::from(mixed_on_one_line), LineEnding::CRLF); // `CRLF` appears the most overall
Case 4: Empty Input Defaults to CRLF
use line_ending::LineEnding;
let empty_text = "";
assert_eq!(LineEnding::from(empty_text), LineEnding::CRLF); // Defaults to `CRLF`
Additional Mixed-Type Code Examples
Counting Mixed Types
Count occurrences of each line ending type in the given string.
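use line_ending::{LineEnding, LineEndingScores};
// `LineEndingScores` is a hash map that associates each line ending type with
// its occurrence count.
let mostly_lf = "line1\nline2\r\nline3\rline4\nline5\nline6\n";
assert_eq!(LineEnding::from(mostly_lf), LineEnding::LF);
assert_eq!(
    LineEnding::score_mixed_types(mostly_lf),
    [
        (LineEnding::CRLF, 1),
        (LineEnding::CR, 1),
        (LineEnding::LF, 4),
    ]
    .into_iter()
    .collect::<LineEndingScores>()
);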
Split as a Specific Type
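To split on one specific line ending type, regardless of which type is dominant in the string, call split_with on that type.
use line_ending::LineEnding;
let mostly_lf = "line1\nline2\r\nline3\rline4\nline5\nline6\n";
let split_crlf = LineEnding::CRLF.split_with(mostly_lf);
assert_eq!(split_crlf, vec!["line1\nline2", "line3\rline4\nline5\nline6\n"]);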
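The same idea can be approximated with the standard library's `str::split`, which makes the behavior easy to see. The helper name `split_on` is hypothetical and is not part of the crate.

```rust
/// Hypothetical helper (not part of the crate): splits only on the exact
/// `ending` sequence, leaving every other kind of line break in place.
fn split_on<'a>(text: &'a str, ending: &str) -> Vec<&'a str> {
    text.split(ending).collect()
}
```

Splitting on `"\r\n"` leaves lone `\n` and `\r` characters inside the resulting pieces, while splitting a CRLF string on `"\n"` leaves a trailing `\r` on each piece, which is why forcing a type is best reserved for input whose line endings are already known.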
Escaped vs. Actual Line Endings
In a Rust string literal, \\n is an escaped backslash followed by the letter n, not a newline character. The string therefore contains no line break at that position, and escaped sequences are never mistaken for real line endings.
For example:
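use line_ending::LineEnding;
let lf_with_escaped = "First\\nSecond\nThird";
let result = LineEnding::split(lf_with_escaped);
assert_eq!(result, vec!["First\\nSecond", "Third"]); // Escaped `\\n` remains intact
let lf = "First\nSecond\nThird";
let result_actual = LineEnding::split(lf);
assert_eq!(result_actual, vec!["First", "Second", "Third"]); // Actual `\n` splits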
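The distinction is a property of Rust string literals rather than of any particular library, which a std-only sketch can confirm. `split_on_lf` is a hypothetical helper used only for this illustration.

```rust
/// Hypothetical helper (not part of the crate): splits on real line feed
/// characters using only the standard library.
fn split_on_lf(s: &str) -> Vec<&str> {
    s.split('\n').collect()
}
```

Calling `split_on_lf("First\\nSecond\nThird")` yields two pieces, because only the second `\n` in the literal is an actual line feed. The raw string `r"First\nSecond"` holds the same characters as `"First\\nSecond"`: a backslash and an `n`.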
License
Licensed under MIT. See LICENSE for details.