pub struct Span {
    pub start: usize,
    pub end: usize,
}
A span in the source string.
Keeps track of the offset between code units from the start of the source string, e.g. if=abc would be
stored as 0-2, 2-3, 3-6:
   i   f   =   a   b   c
  ├─────┤ ├─┤ ├─────────┤
 ^   ^   ^   ^   ^   ^   ^
 0   1   2   3   4   5   6
A code unit is a basic building block used by the Unicode encoding system. It is effectively the primitive type
that is used to represent entire Unicode code points. utf-8 uses u8 as the code unit, utf-16 uses
u16 as the code unit, and utf-32 uses char (u32).
The offsets depend on which code unit is being counted, which in turn depends on which parsing function is called.
By default, the spans count utf-32 code units, i.e. Rust chars. However, there are alternative functions
that count utf-16 or utf-8 code units:
lexer::parse_with_utf_16_offsets()
lexer::parse_with_utf_16_offsets_and_version()
lexer::parse_with_utf_8_offsets()
lexer::parse_with_utf_8_offsets_and_version()
The difference is only noticeable with non-ASCII characters. Take the following string: a𐐀c. It has the
following representations in the different encodings (code unit values in decimal):
          a (U+0061)   𐐀 (U+10400)           c (U+0063)
 utf-8:   [97]         [240, 144, 144, 128]   [99]
 utf-16:  [97]         [55297, 56320]         [99]
 utf-32:  [97]         [66560]                [99]
As you can see, both a and c use a single code unit in all encodings (a u32 or u16 takes up more
memory than a u8, but we aren’t concerned with that). However, 𐐀 takes four code units in utf-8,
two code units in utf-16, and only one code unit in utf-32. This is why the spans would differ:
          a   𐐀   c
        ^   ^   ^   ^
utf-8   0   1   5   6
utf-16  0   1   3   4
utf-32  0   1   2   3
Why is this nuance necessary? Because the Language Server Protocol operates on utf-16 offsets by default, so
support for that encoding was mandatory when authoring this crate.
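The offset table for a𐐀c can be reproduced with the standard library alone. The following is a minimal sketch, not part of this crate: boundary_offsets and the three wrapper functions are hypothetical helper names chosen for illustration. Only char::len_utf8, char::len_utf16 and str::chars are real standard-library calls.

```rust
/// Hypothetical helper (not from this crate): returns the offset at every
/// character boundary of `s`, where `width` reports how many code units a
/// single `char` occupies in the chosen encoding.
fn boundary_offsets(s: &str, width: fn(char) -> usize) -> Vec<usize> {
    let mut offsets = vec![0];
    let mut total = 0;
    for c in s.chars() {
        total += width(c);
        offsets.push(total);
    }
    offsets
}

/// Offsets counted in utf-8 code units (u8).
fn utf8_offsets(s: &str) -> Vec<usize> {
    boundary_offsets(s, char::len_utf8)
}

/// Offsets counted in utf-16 code units (u16).
fn utf16_offsets(s: &str) -> Vec<usize> {
    boundary_offsets(s, char::len_utf16)
}

/// Offsets counted in utf-32 code units, i.e. one per Rust `char`.
fn utf32_offsets(s: &str) -> Vec<usize> {
    boundary_offsets(s, |_| 1)
}

fn main() {
    let s = "a𐐀c";
    println!("utf-8:  {:?}", utf8_offsets(s)); // [0, 1, 5, 6]
    println!("utf-16: {:?}", utf16_offsets(s)); // [0, 1, 3, 4]
    println!("utf-32: {:?}", utf32_offsets(s)); // [0, 1, 2, 3]
}
```

A span produced by one of the parsing functions is only meaningful against offsets counted in the same code unit, so a consumer that receives utf-16 spans should index with utf-16 offsets as well.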
Invariants
If this type is manually constructed or modified, the end position must be greater than or equal to the
start position. If this invariant is not upheld, interacting with this span (for example, to resolve the
cursor position) will result in logic bugs and incorrect behaviour, but it will never cause a panic or memory
unsafety.
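Because the fields are public, nothing stops a caller from building a span whose end precedes its start. One way to guard against that in calling code is sketched below. The Span here is a local stand-in that only mirrors the two documented fields, and checked_span is a hypothetical helper, not a function provided by this crate.

```rust
/// Local stand-in mirroring the documented fields; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical helper: builds a span only when the documented invariant
/// `start <= end` holds, and returns `None` otherwise.
fn checked_span(start: usize, end: usize) -> Option<Span> {
    if start <= end {
        Some(Span { start, end })
    } else {
        None
    }
}

fn main() {
    // A regular span and a zero-width span both satisfy the invariant.
    assert_eq!(checked_span(2, 3), Some(Span { start: 2, end: 3 }));
    assert_eq!(checked_span(4, 4), Some(Span { start: 4, end: 4 }));
    // An inverted span is rejected instead of silently causing logic bugs.
    assert_eq!(checked_span(5, 1), None);
}
```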
Fields
start: usize
end: usize
Implementations
impl Span
pub fn new_between(a: Span, b: Span) -> Self
Constructs a new span between the end of the first span and the beginning of the second span.
pub fn new_zero_width(position: usize) -> Self
Constructs a zero-width span at the position.
pub fn is_zero_width(&self) -> bool
Returns whether this span is zero-width.
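The two constructors and the zero-width check can be pictured with the if=abc example from the type description. The bodies below are guesses inferred from the doc comments, written against a local stand-in Span; they are not the crate's source and the real implementation may differ.

```rust
/// Local stand-in mirroring the documented fields; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Assumed behaviour: spans the gap from `a.end` to `b.start`.
    fn new_between(a: Span, b: Span) -> Self {
        Span { start: a.end, end: b.start }
    }

    /// Assumed behaviour: start and end both sit at `position`.
    fn new_zero_width(position: usize) -> Self {
        Span { start: position, end: position }
    }

    /// Assumed behaviour: a span is zero-width when start equals end.
    fn is_zero_width(&self) -> bool {
        self.start == self.end
    }
}

fn main() {
    // In `if=abc`, `if` is stored as 0-2 and `abc` as 3-6.
    let keyword = Span { start: 0, end: 2 };
    let ident = Span { start: 3, end: 6 };

    // The span between them covers the `=` at 2-3.
    let gap = Span::new_between(keyword, ident);
    assert_eq!(gap, Span { start: 2, end: 3 });
    assert!(!gap.is_zero_width());

    // A zero-width span marks a single position, such as a cursor.
    assert!(Span::new_zero_width(4).is_zero_width());
}
```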
pub fn is_before(&self, span: &Self) -> bool
Returns whether this span is located before the other span.
pub fn is_after(&self, span: &Self) -> bool
Returns whether this span is located after the other span.
pub fn is_before_pos(&self, pos: usize) -> bool
Returns whether this span is located before the position.
pub fn is_after_pos(&self, pos: usize) -> bool
Returns whether this span is located after the position.
pub fn starts_at_or_after(&self, position: usize) -> bool
Returns whether the beginning of this span is located at or after the specified position.
pub fn contains_pos(&self, position: usize) -> bool
Returns whether a position lies within this span.
pub fn end_at_previous(self) -> Self
Returns a new span which starts at the same position but ends at a previous position, i.e. end: span.end - 1.
pub fn first_char(self) -> Self
Returns a new span over the first character of this span.
If this span is zero-width, the new span will be identical.
pub fn last_char(self) -> Self
Returns a new span over the last character of this span.
If this span is zero-width, the new span will be identical.
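A rough sketch of what first_char and last_char describe, again on a local stand-in Span. It assumes that "character" means one offset unit in whichever code unit the spans count, which the documentation does not state explicitly. The bodies are inferred from the doc comments rather than taken from the crate.

```rust
/// Local stand-in mirroring the documented fields; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Assumed behaviour: one unit wide, starting where this span starts.
    /// A zero-width span is returned unchanged.
    fn first_char(self) -> Self {
        if self.start == self.end {
            self
        } else {
            Span { start: self.start, end: self.start + 1 }
        }
    }

    /// Assumed behaviour: one unit wide, ending where this span ends.
    /// A zero-width span is returned unchanged. `saturating_sub` is this
    /// sketch's own choice, so an invalid span cannot underflow.
    fn last_char(self) -> Self {
        if self.start == self.end {
            self
        } else {
            Span { start: self.end.saturating_sub(1), end: self.end }
        }
    }
}

fn main() {
    // `abc` in `if=abc` is stored as 3-6.
    let ident = Span { start: 3, end: 6 };
    assert_eq!(ident.first_char(), Span { start: 3, end: 4 }); // `a`
    assert_eq!(ident.last_char(), Span { start: 5, end: 6 }); // `c`

    // Zero-width spans come back identical.
    let cursor = Span { start: 4, end: 4 };
    assert_eq!(cursor.first_char(), cursor);
    assert_eq!(cursor.last_char(), cursor);
}
```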
pub fn start_zero_width(self) -> Self
Returns a new zero-width span located at the start of this span.
pub fn end_zero_width(self) -> Self
Returns a new zero-width span located at the end of this span.
pub fn next_single_width(self) -> Self
Returns a new span one width long, beginning at the end of this span.
Examples of how VS Code will squiggle this:
// \ is the beginning of the newline char
return\
^^
return \
^^
return)
^^
pub fn previous_single_width(self) -> Self
Returns a new span one width long, ending at the beginning of this span.
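The four helpers that derive a span from the edges of an existing one (start_zero_width, end_zero_width, next_single_width and previous_single_width) can be sketched as follows. This uses a local stand-in Span with bodies inferred from the doc comments; the use of saturating_sub at position 0 is an assumption of this sketch, not documented behaviour.

```rust
/// Local stand-in mirroring the documented fields; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Assumed behaviour: zero-width span at this span's start.
    fn start_zero_width(self) -> Self {
        Span { start: self.start, end: self.start }
    }

    /// Assumed behaviour: zero-width span at this span's end.
    fn end_zero_width(self) -> Self {
        Span { start: self.end, end: self.end }
    }

    /// Assumed behaviour: one unit wide, beginning at this span's end.
    fn next_single_width(self) -> Self {
        Span { start: self.end, end: self.end + 1 }
    }

    /// Assumed behaviour: one unit wide, ending at this span's start.
    /// Clamping at 0 is this sketch's own choice.
    fn previous_single_width(self) -> Self {
        Span { start: self.start.saturating_sub(1), end: self.start }
    }
}

fn main() {
    // `return` occupies 0-6, so the next single-width span is 6-7,
    // which is where a "missing token" diagnostic could be attached.
    let keyword = Span { start: 0, end: 6 };
    assert_eq!(keyword.next_single_width(), Span { start: 6, end: 7 });
    assert_eq!(keyword.end_zero_width(), Span { start: 6, end: 6 });

    // `abc` in `if=abc` is 3-6; the unit just before it is the `=` at 2-3.
    let ident = Span { start: 3, end: 6 };
    assert_eq!(ident.previous_single_width(), Span { start: 2, end: 3 });
    assert_eq!(ident.start_zero_width(), Span { start: 3, end: 3 });
}
```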