substring-replace: Extract, insert and replace substrings
This crate adds a set of convenient methods to extract, insert and replace string slices in Rust, using character indices that are compatible with multibyte characters.
Do not add this library to your project if it already depends on the substring crate. Its core substring method will conflict with the same method in the SubstringReplace trait. The like-named methods share the same signature and functionality, although this crate avoids unsafe blocks and will not panic if the start and end indices are out of range.
Standard Rust uses str slices to extract parts of a string by index range. However, slicing panics when indices are out of range and works with byte indices rather than the more intuitive character indices as used with the Regex crate.
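To illustrate the byte-index behaviour described above, here is a minimal sketch that uses only the standard library, not this crate. The helper name `checked_byte_slice` is hypothetical; it wraps `str::get`, which returns `None` where the indexing form `&text[start..end]` would panic.

```rust
/// Checked byte-range slicing with the standard library: returns `None`
/// instead of panicking when the range is out of bounds, inverted,
/// or cuts through the middle of a multibyte character.
fn checked_byte_slice(text: &str, start: usize, end: usize) -> Option<&str> {
    // `&text[start..end]` would panic in every case where this yields None.
    text.get(start..end)
}
```

With the word "élan" (where 'é' takes 2 bytes), the byte range 0..1 splits the first character and yields `None`, while 0..2 yields "é". The caller has to know the byte width of each character, which is the inconvenience that character indices avoid.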
substring
Returns a substring between start and end character indices. These indices differ from byte indices whenever the string contains multibyte characters, as found in extended Latin scripts, most non-Latin alphabets, many special symbols and emojis.
let sample_str = "/long/file/path";
let result = sample_str.substring(6, 10);
// the result is "file"
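For readers curious how character-indexed extraction can work without panicking, the following is a rough standard-library sketch. It is not the crate's implementation: the function name `char_substring` is hypothetical, it assumes an exclusive end index, and it returns an owned `String` for simplicity.

```rust
/// Hypothetical stand-in for a character-indexed substring (std only).
/// `start` is inclusive and `end` is exclusive. Out-of-range indices
/// never panic: the iterator simply runs out of characters.
fn char_substring(text: &str, start: usize, end: usize) -> String {
    text.chars()
        .skip(start)
        // An inverted range (end < start) yields an empty string.
        .take(end.saturating_sub(start))
        .collect()
}
```

Under these assumptions, `char_substring("/long/file/path", 6, 10)` gives "file", and an end index far beyond the string length just returns everything from the start index onwards.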
substring_start
This method returns the start of a string (&str or String) up to the specified end character index.
let sample_str = "/long/file/path";
let result = sample_str.substring_start(5);
// the result is "/long"
substring_end
This method returns the end of a string (&str or String) from the specified start character index.
let sample_str = "/long/file/path";
let result = sample_str.substring_end(5);
// the result is "/file/path"
substring_replace
This method removes the characters between the specified start and end indices and inserts a replacement string in their place.
let new_string = "azdefgh".substring_replace("bc", 1, 2);
println!("{}", new_string);
// will print "abcdefgh"
substring_replace_start
This method replaces the start of a string up to a specified end character index.
// remove the first 2 characters and prepend the string "xyz"
let new_string = "abcdefgh".substring_replace_start("xyz", 2);
println!("{}", new_string);
// will print "xyzcdefgh"
substring_replace_end
This method replaces the remainder of a string from a specified start character index.
// remove all characters from index 3 onwards and append the string "xyz"
let new_string = "abcdefgh".substring_replace_end("xyz", 3);
println!("{}", new_string);
// will print "abcxyz"
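The three replace methods can be pictured as one operation: keep the characters before the range, add the replacement, then keep the characters after the range. Below is a minimal standard-library sketch of that idea. The function name `replace_char_range` and its parameter order are illustrative assumptions, not the crate's actual code.

```rust
/// Hypothetical sketch (std only): replace the characters in `start..end`
/// (character indices, end exclusive) with `replacement`.
fn replace_char_range(text: &str, replacement: &str, start: usize, end: usize) -> String {
    // Treat an inverted range as empty so nothing is removed.
    let end = end.max(start);
    // Characters before the range.
    let head: String = text.chars().take(start).collect();
    // Characters from the end of the range onwards.
    let tail: String = text.chars().skip(end).collect();
    format!("{}{}{}", head, replacement, tail)
}
```

In this sketch, replacing the start corresponds to a range beginning at 0, and replacing the end corresponds to a range whose end index lies at or beyond the last character.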
substring_remove
This method returns the remainder after removing a substring delimited by start and end character indices. It's the opposite of substring(start, end).
let sample_str = "abcdefghij";
let result = sample_str.substring_remove(3, 5);
// result will be "abcfghij"
substring_offset
This method extracts a substring from a start index for n characters to the right or left. A negative length in the second parameter will end at the reference index.
let sample_str = "indian-elephant";
let result = sample_str.substring_offset(7, 3);
// result will be "ele"
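The signed length can be understood as a way of choosing which side of the reference index the range falls on. The sketch below uses only the standard library and hypothetical names (`offset_to_range`, `char_offset_substring`). It assumes that, with a negative length, the reference index acts as an exclusive end.

```rust
/// Hypothetical sketch: turn a position and a signed length into a
/// `(start, end)` pair of character indices, end exclusive.
fn offset_to_range(position: usize, length: i32) -> (usize, usize) {
    let span = length.unsigned_abs() as usize;
    if length >= 0 {
        // Positive length: read `span` characters to the right.
        (position, position.saturating_add(span))
    } else {
        // Negative length: the range ends at `position` and starts
        // `span` characters to the left, clamped at 0.
        (position.saturating_sub(span), position)
    }
}

/// Extract the characters covered by the computed range (std only).
fn char_offset_substring(text: &str, position: usize, length: i32) -> String {
    let (start, end) = offset_to_range(position, length);
    text.chars().skip(start).take(end - start).collect()
}
```

Under that assumption, position 7 with length 3 on "indian-elephant" gives "ele", while position 6 with length -3 gives "ian".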
substring_pull
This method returns the remainder after removing a substring from a start index for n characters to the right or left. It's the opposite of substring_offset(position, length). As with substring_offset, a negative length in the second parameter will end at the reference index.
let sample_str = "indian-elephant";
let result = sample_str.substring_pull(7, 3);
// result will be "indian-phant"
let result = sample_str.substring_pull(6, -3);
// result will be "ind-elephant"
substring_insert
This method inserts a string at a given character index. It differs from the standard String::insert method by using character rather than byte indices, which works better with multibyte characters. It also works directly with &str, but returns a new owned string.
let sample_str = "a/c";
let result = sample_str.substring_insert("/b", 1);
// result will be "a/b/c"
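As a rough picture of character-indexed insertion, the following standard-library sketch converts the character index to a byte offset and then splices the text. The name `insert_at_char` is hypothetical, and appending when the index is past the end is an assumption of this sketch rather than documented crate behaviour.

```rust
/// Hypothetical sketch (std only): insert `addition` before the character
/// at `char_index`, returning a new owned string.
fn insert_at_char(text: &str, addition: &str, char_index: usize) -> String {
    // Translate the character index into a byte offset. Assumption: if the
    // string has too few characters, fall back to its end (append).
    let byte_index = text
        .char_indices()
        .nth(char_index)
        .map(|(offset, _)| offset)
        .unwrap_or(text.len());
    let mut result = String::with_capacity(text.len() + addition.len());
    // `byte_index` is always a character boundary, so these slices cannot panic.
    result.push_str(&text[..byte_index]);
    result.push_str(addition);
    result.push_str(&text[byte_index..]);
    result
}
```

By contrast, the standard `String::insert_str` takes a byte index and panics if that index does not fall on a character boundary.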
to_start_byte_index
This method converts a start character index to a start byte index. It's mainly used internally. It differs from to_end_byte_index only in its default value of 0 if it overflows.
let byte_index = "नमस्ते".to_start_byte_index(2);
// yields the byte index at the start of the third multibyte character (character index 2). It should be 6
to_end_byte_index
This method converts an end character index to an end byte index. It's mainly used internally.
It differs from to_start_byte_index only in its default value, which is the end of the string if it overflows.
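The two conversions can be sketched with the standard library as below. The function names `start_byte_index` and `end_byte_index` are hypothetical stand-ins, and the fallback values follow the descriptions above (0 for the start conversion, the byte length for the end conversion). The crate's internals may differ.

```rust
/// Hypothetical sketch: byte offset of the character at `char_index`,
/// or 0 when the index is past the last character.
fn start_byte_index(text: &str, char_index: usize) -> usize {
    text.char_indices()
        .nth(char_index)
        .map(|(offset, _)| offset)
        .unwrap_or(0)
}

/// Hypothetical sketch: the same lookup, but falling back to the byte
/// length of the string when the index is past the last character.
fn end_byte_index(text: &str, char_index: usize) -> usize {
    text.char_indices()
        .nth(char_index)
        .map(|(offset, _)| offset)
        .unwrap_or(text.len())
}
```

For a string made of six 3-byte characters, character index 2 maps to byte index 6 in both functions, while an index of 99 maps to 0 in the first and 18 in the second.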
char_len
This returns the character length in terms of individual Unicode symbols, as opposed to the byte length returned by str::len(). This is shorthand for &str::char_indices().count().
let emoji = "😎";
println!("Emoji length: {}, emoji byte length: {}", emoji.char_len(), emoji.len());
// prints: Emoji length: 1, emoji byte length: 4
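The equivalence mentioned above can be written out with the standard library alone. The helper name `char_count` is hypothetical and simply spells out the longhand form.

```rust
/// Hypothetical longhand for a character count (std only): counts Unicode
/// scalar values, not bytes and not user-perceived grapheme clusters.
fn char_count(text: &str) -> usize {
    text.char_indices().count()
}
```

Note that a count of this kind treats a base letter followed by a combining accent (for example 'e' plus U+0301) as two characters, even though it is displayed as one.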
NB: This is an alpha release, but the crate is feature-complete and supplements string-patterns and simple-string-patterns.
Version history
1.3: Added new methods .substring_remove(start: usize, end: usize) and .substring_pull(position: usize, length: i32).