Crate ropey
Ropey is a utf8 text rope for Rust. It is fast, robust, and can handle huge texts and memory-incoherent edits with ease.
Ropey’s atomic unit of text is Unicode scalar values (or chars in Rust) encoded as utf8. All of Ropey’s editing and slicing operations are done in terms of char indices, which prevents accidental creation of invalid utf8 data.
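To make the char-index point concrete, here is a minimal sketch that uses only the standard library. It is not Ropey's code, and the helper names `char_idx_to_byte_offset` and `insert_at_char` are hypothetical, chosen for this illustration. It shows that an edit addressed by char index always lands on a code point boundary, while the same number used as a byte index may not.

```rust
/// Hypothetical helper: converts a char index into a byte offset in `text`.
/// A `char_idx` equal to the number of chars maps to `text.len()`;
/// anything past that yields `None`.
fn char_idx_to_byte_offset(text: &str, char_idx: usize) -> Option<usize> {
    text.char_indices()
        .map(|(byte_offset, _)| byte_offset)
        .chain(std::iter::once(text.len()))
        .nth(char_idx)
}

/// Hypothetical helper: inserts `insertion` before the char at `char_idx`.
/// Because the position is resolved through char boundaries, the result
/// is always valid utf8. Returns `false` if `char_idx` is out of range.
fn insert_at_char(text: &mut String, char_idx: usize, insertion: &str) -> bool {
    match char_idx_to_byte_offset(text, char_idx) {
        Some(offset) => {
            text.insert_str(offset, insertion);
            true
        }
        None => false,
    }
}
```

For the text "héllo", char index 2 resolves to byte offset 3, whereas byte offset 2 falls inside the two-byte encoding of 'é' and is not a valid place to split the string.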
The library is made up of four main components:
- Rope: the main rope type.
- RopeSlice: an immutable view into part of a Rope.
- iter: iterators over Rope/RopeSlice data.
- RopeBuilder: an efficient incremental Rope builder.
A Basic Example
Let’s say we want to open up a text file, replace the 516th line (the writing was terrible!), and save it back to disk. It’s contrived, but will give a good sampling of the APIs and how they work together.
    use std::fs::File;
    use std::io::{BufReader, BufWriter};
    use ropey::Rope;

    // Load a text file.
    let mut text = Rope::from_reader(
        BufReader::new(File::open("my_great_book.txt")?)
    )?;

    // Print the 516th line (zero-indexed) to see the terrible
    // writing.
    println!("{}", text.line(515));

    // Get the start/end char indices of the line.
    let start_idx = text.line_to_char(515);
    let end_idx = text.line_to_char(516);

    // Remove the line...
    text.remove(start_idx..end_idx);

    // ...and replace it with something better.
    text.insert(start_idx, "The flowers are... so... dunno.\n");

    // Print the changes, along with the previous few lines for context.
    let start_idx = text.line_to_char(511);
    let end_idx = text.line_to_char(516);
    println!("{}", text.slice(start_idx..end_idx));

    // Write the file back out to disk.
    text.write_to(
        BufWriter::new(File::create("my_great_book.txt")?)
    )?;
More examples can be found in the examples
directory of the git
repository. Many of those examples demonstrate doing non-trivial things
with Ropey such as grapheme handling, search-and-replace, and streaming
loading of non-utf8 text files.
Low-level APIs
Ropey also provides access to some of its low-level APIs, enabling client code to efficiently work with a Rope’s data and implement new functionality. The most important of those APIs are:
- The chunk_at_*() chunk-fetching methods of Rope and RopeSlice.
- The Chunks iterator.
- The functions in str_utils for operating on &str slices.
Internally, each Rope stores text as a segmented collection of utf8 strings. The chunk-fetching methods and Chunks iterator provide direct access to those strings (or “chunks”) as &str slices, allowing client code to work directly with the underlying utf8 data.
The chunk-fetching methods and str_utils
functions are the basic
building blocks that Ropey itself uses to build much of its functionality.
For example, the Rope::byte_to_char()
method can be reimplemented as a free function like this:
    use ropey::{
        Rope,
        str_utils::byte_to_char_idx
    };

    fn byte_to_char(rope: &Rope, byte_idx: usize) -> usize {
        let (chunk, b, c, _) = rope.chunk_at_byte(byte_idx);
        c + byte_to_char_idx(chunk, byte_idx - b)
    }
And this will be just as efficient as Ropey’s implementation.
The chunk-fetching methods in particular are among the fastest functions that Ropey provides, generally operating in the sub-hundred nanosecond range for medium-sized (~200kB) documents on recent-ish computer systems.
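The chunk-based pattern can also be sketched without Ropey at all. The toy model below is an assumption made for illustration, not Ropey's internals: `ToyRope` is a hypothetical type that keeps its chunks in a flat `Vec` and finds the right chunk with a linear scan. Its `chunk_at_byte` is a simplified stand-in that returns only the chunk and the byte and char index where that chunk starts. It assumes `byte_idx` lies on a char boundary and is no larger than the total byte length.

```rust
/// Toy stand-in for a rope: text stored as a sequence of utf8 chunks.
/// Hypothetical type for illustration only.
struct ToyRope<'a> {
    chunks: Vec<&'a str>,
}

impl<'a> ToyRope<'a> {
    /// Returns the chunk containing `byte_idx`, together with the byte
    /// index and char index at which that chunk starts. An index at the
    /// very end of the text resolves to the last chunk.
    fn chunk_at_byte(&self, byte_idx: usize) -> (&'a str, usize, usize) {
        let mut byte_start = 0;
        let mut char_start = 0;
        for (i, chunk) in self.chunks.iter().enumerate() {
            let is_last = i + 1 == self.chunks.len();
            if byte_idx < byte_start + chunk.len() || is_last {
                return (*chunk, byte_start, char_start);
            }
            byte_start += chunk.len();
            char_start += chunk.chars().count();
        }
        // No chunks at all: behave like an empty text.
        ("", 0, 0)
    }

    /// Converts a byte index to a char index: locate the chunk, then
    /// count chars only within that chunk and add the chunk's char offset.
    fn byte_to_char(&self, byte_idx: usize) -> usize {
        let (chunk, b, c) = self.chunk_at_byte(byte_idx);
        c + chunk[..byte_idx - b].chars().count()
    }
}
```

The shape matches the free function above: find the chunk, do the local work on a plain &str, then add the chunk's starting offset. The toy's linear scan is only there to keep the sketch short.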
A Note About Line Endings
Some of Ropey’s APIs use the concept of line breaks or lines of text. In all such APIs, Ropey treats the following Unicode sequences as line breaks:
- U+000A: LF (Line Feed)
- U+000B: VT (Vertical Tab)
- U+000C: FF (Form Feed)
- U+000D: CR (Carriage Return)
- U+0085: NEL (Next Line)
- U+2028: Line Separator
- U+2029: Paragraph Separator
- U+000D U+000A: CRLF (Carriage Return + Line Feed)
Additionally, Ropey treats line breaks as being a part of the line that they mark the end of. That is to say, lines begin immediately after a line break. For example, the text "Hello\nworld" has two lines: "Hello\n" and "world". And the text "Hello\nworld\n" has three lines: "Hello\n", "world\n", and "".
CRLF pairs are always treated as a single line break, and are never split across chunks. Note, however, that slicing can still split them.
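The line rules described in this section can be sketched with the standard library alone. This is an illustration under stated assumptions, not Ropey's implementation: `split_lines` is a hypothetical function that applies the line break set listed in this section, keeps each break attached to the line it ends, and treats CRLF as a single break.

```rust
/// Hypothetical helper: splits `text` into lines, where each line keeps
/// its trailing line break and a CR immediately followed by LF counts
/// as one break.
fn split_lines(text: &str) -> Vec<&str> {
    let mut lines = Vec::new();
    let mut line_start = 0;
    let mut iter = text.char_indices().peekable();

    while let Some((idx, c)) = iter.next() {
        // Byte offset just past the current char.
        let mut break_end = idx + c.len_utf8();
        let is_break = match c {
            '\r' => {
                // Consume a directly following LF so CRLF stays together.
                if let Some(&(lf_idx, '\n')) = iter.peek() {
                    iter.next();
                    break_end = lf_idx + 1;
                }
                true
            }
            '\n' | '\u{000B}' | '\u{000C}' | '\u{0085}' | '\u{2028}' | '\u{2029}' => true,
            _ => false,
        };
        if is_break {
            lines.push(&text[line_start..break_end]);
            line_start = break_end;
        }
    }

    // Whatever follows the final break is the last line, possibly empty.
    lines.push(&text[line_start..]);
    lines
}
```

With this sketch, "Hello\nworld\n" splits into "Hello\n", "world\n", and "", matching the three-line example above.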
Modules
iter | Iterators over a Rope’s data. |
str_utils | Utility functions for utf8 string slices. |
Structs
Rope | A utf8 text rope. |
RopeBuilder | An efficient incremental Rope builder. |
RopeSlice | An immutable view into part of a Rope. |
Enums
Error | Ropey’s error type. |
Type Definitions
Result | Ropey’s result type. |