//! # UTF8 Slice
//! A lightweight, heapless way to slice Unicode strings in Rust.
//!
//! # What does the library provide
//! This library provides 4 utility functions for working with Unicode slices.
//! All indices count characters (Rust `char`s), not bytes.
//!
//! ## `utf8_slice::slice(s: &str, begin: usize, end: usize) -> &str`
//! Does the same as `&s[begin..end]`, but indexes by character instead of by byte.
//!
//! ## `utf8_slice::from(s: &str, begin: usize) -> &str`
//! Does the same as `&s[begin..]`, but indexes by character instead of by byte.
//!
//! ## `utf8_slice::till(s: &str, end: usize) -> &str`
//! Does the same as `&s[..end]`, but indexes by character instead of by byte.
//!
//! ## `utf8_slice::len(s: &str) -> usize`
//! Does the same as `s.len()`, but counts characters instead of bytes.
//!
//! # License
//! MIT
//!
//! # Examples
//!
//! ```
//! let s = "The 🚀 goes to the 🌑!";
//!
//! let rocket = utf8_slice::slice(s, 4, 5);
//! # assert_eq!(utf8_slice::slice(s, 4, 5), "🚀");
//! // Will equal "🚀"
//! ```

/// Fetches a slice of a string from a begin index to an end index,
/// where both indices count characters rather than bytes.
///
/// # Arguments
///
/// * `s` - An input string to take the slice from
/// * `begin` - Character index where the slice begins (inclusive)
/// * `end` - Character index where the slice ends (exclusive)
///
/// # Examples
///
/// ```
/// let s = "The 🚀 goes to the 🌑!";
///
/// let rocket = utf8_slice::slice(s, 4, 5);
/// # assert_eq!(utf8_slice::slice(s, 4, 5), "🚀");
/// // Will equal "🚀"
/// ```
///
/// # Note
/// * Returns an empty string if `end < begin` or if `begin` is at or past
///   the end of `s`. An `end` past the end of `s` is clamped to the end.
pub fn slice(s: &str, begin: usize, end: usize) -> &str {
    if end < begin {
        return "";
    }

    // Find the byte offset of the `begin`-th character.
    s.char_indices()
        .nth(begin)
        .and_then(|(start_pos, _)| {
            // `end` reaches or passes the last character: take the rest.
            if end >= len(s) {
                return Some(&s[start_pos..]);
            }

            // Otherwise find the byte offset of the `end`-th character,
            // measured relative to `start_pos`.
            s[start_pos..]
                .char_indices()
                .nth(end - begin)
                .map(|(end_pos, _)| &s[start_pos..start_pos + end_pos])
        })
        .unwrap_or("")
}

/// Fetches a slice of a string from a starting index to the end of the string,
/// where the index counts characters rather than bytes.
///
/// # Arguments
///
/// * `s` - An input string to take the slice from
/// * `begin` - Character index where the slice begins (inclusive)
///
/// # Examples
///
/// ```
/// let s = "The 🚀 goes to the 🌑!";
///
/// let rocket_goes_to_the_moon = utf8_slice::from(s, 4);
/// # assert_eq!(utf8_slice::from(s, 4), "🚀 goes to the 🌑!");
/// // Will equal "🚀 goes to the 🌑!"
/// ```
///
/// # Note
/// * Returns an empty string if `begin` is at or past the end of `s`.
pub fn from(s: &str, begin: usize) -> &str {
    slice(s, begin, len(s))
}

/// Fetches a slice of a string from its start until an ending index,
/// where the index counts characters rather than bytes.
///
/// # Arguments
///
/// * `s` - An input string to take the slice from
/// * `end` - Character index where the slice ends (exclusive)
///
/// # Examples
///
/// ```
/// let s = "The 🚀 goes to the 🌑!";
///
/// let the_rocket = utf8_slice::till(s, 5);
/// # assert_eq!(utf8_slice::till(s, 5), "The 🚀");
/// // Will equal "The 🚀"
/// ```
///
/// # Note
/// * An `end` past the end of `s` returns the whole string.
pub fn till(s: &str, end: usize) -> &str {
    slice(s, 0, end)
}

/// Fetches the length of a string in characters rather than bytes.
///
/// # Arguments
///
/// * `s` - The string of which to fetch the length
pub fn len(s: &str) -> usize {
    s.chars().count()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_same_as_std_slice() {
        // For ASCII input, character indices and byte indices coincide.
        let s = "xjfdlskfaj sdfjlkj";
        for i in 0..s.len() {
            for j in i..s.len() + 1 {
                assert_eq!(&s[i..j], slice(s, i, j));
            }
        }
    }

    #[test]
    fn test_slice() {
        assert_eq!(slice("\u{345}ab\u{898}xyz", 1, 4), "ab\u{898}");
        assert_eq!(slice("\u{345}ab\u{898}xyz", 0, 4), "\u{345}ab\u{898}");
        assert_eq!(slice("\u{345}ab\u{898}xyz", 5, 4), "");
        assert_eq!(slice("\u{345}ab \u{898}xyz", 0, 1), "\u{345}");
        assert_eq!(slice("abcdef", 0, 6), "abcdef");
        assert_eq!(slice("\u{345}ab\u{898}xyz", 1, 7), "ab\u{898}xyz");
    }

    #[test]
    fn test_from() {
        assert_eq!(from("\u{345}ab\u{898}xyz", 1), "ab\u{898}xyz");
        assert_eq!(from("\u{345}ab\u{898}xyz", 3), "\u{898}xyz");
        assert_eq!(from("\u{345}ab\u{898}xyz", 10), "");
        assert_eq!(from("\u{345}ab \u{898}xyz", 0), "\u{345}ab \u{898}xyz");
    }

    #[test]
    fn test_till() {
        assert_eq!(till("\u{345}ab\u{898}xyz", 1), "\u{345}");
        assert_eq!(till("\u{345}ab\u{898}xyz", 3), "\u{345}ab");
        assert_eq!(till("\u{345}ab\u{898}xyz", 0), "");
    }

    #[test]
    fn test_len() {
        assert_eq!(len(""), 0);
        // The astronaut emoji is three `char`s: man, zero-width joiner, rocket.
        // The joiner is written as an escape so it cannot be lost in transit.
        assert_eq!(len("👨\u{200D}🚀"), 3);
        assert_eq!(len("abc"), 3);
        assert_eq!(len("abd👨\u{200D}🚀"), 6);
    }
}
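/*
Sketch, not part of the crate: the functions above all rest on one idea,
translating a character index into a byte offset with `char_indices` and
then slicing by bytes. The standalone version below isolates that step.
The names `byte_offset` and `char_slice` are hypothetical and chosen only
for this illustration; unlike `slice`, this variant clamps both indices
instead of searching relative to the start position.

```rust
/// Hypothetical helper: byte offset of the `n`-th character of `s`,
/// or `s.len()` when `n` is at or past the end of the string.
fn byte_offset(s: &str, n: usize) -> usize {
    s.char_indices().nth(n).map_or(s.len(), |(pos, _)| pos)
}

/// Hypothetical character-indexed slice built on `byte_offset`.
/// A reversed range yields ""; out-of-range indices are clamped.
fn char_slice(s: &str, begin: usize, end: usize) -> &str {
    if end < begin {
        return "";
    }
    // `begin <= end` implies `start <= stop`, so the range is valid,
    // and both offsets lie on character boundaries.
    let start = byte_offset(s, begin);
    let stop = byte_offset(s, end);
    &s[start..stop]
}
```

With `s = "The 🚀 goes to the 🌑!"`, `char_slice(s, 4, 5)` selects the
rocket, whereas the byte-indexed `s.get(4..5)` is `None` because byte 5
falls inside the rocket's four-byte encoding.
*/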