pub struct SourceMap<'src> { /* private fields */ }
Provides utilities for mapping ByteSpans to SourceSpans,
byte offsets to SourcePositions, and extracting content from the
original source text.
The 'src lifetime ties the SourceMap to the source text it was built
from.
SourceMap is a key part of what makes libgraphql-parser fast.
The lexer and parser operate exclusively on compact u32 byte
offsets (stored in ByteSpans), deferring line/column
computation until it is actually needed — typically only for
error formatting or tooling & IDE features. SourceMap is the mechanism
that makes this deferred resolution possible, translating raw
byte offsets back to human-readable SourcePositions on
demand.
SourceMaps are typically constructed by a GraphQLTokenSource,
using one of two construction modes: SourceMap::new_with_source or
SourceMap::new_precomputed.
§Construction with source text (SourceMap::new_with_source)
A SIMD-accelerated pre-pass, linear in the source length, scans the source
string for line terminators (\n, \r, \r\n) and records the byte offset of each
line start. Later position lookups via SourceMap::resolve_offset are O(log n)
binary searches on the line-start table (n being the number of lines), followed
by a short char-counting walk from the line start to the target byte offset to
compute the UTF-8 and UTF-16 column values.
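The pre-pass and the line lookup described above can be sketched roughly as follows. This is a minimal scalar illustration, not the crate's SIMD implementation; `compute_line_starts` and `line_index_for` are hypothetical free functions written for this example, and the sketch performs no out-of-bounds check on the offset.

```rust
/// Hypothetical helper: byte offset at which each line starts.
/// Recognizes \n, \r, and \r\n (the pair counts as one terminator).
fn compute_line_starts(source: &str) -> Vec<u32> {
    let bytes = source.as_bytes();
    let mut starts = vec![0u32]; // line 0 always starts at offset 0
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\n' => starts.push((i + 1) as u32),
            b'\r' => {
                // Skip the \n of a \r\n pair so it is not counted twice.
                if bytes.get(i + 1) == Some(&b'\n') {
                    i += 1;
                }
                starts.push((i + 1) as u32);
            }
            _ => {}
        }
        i += 1;
    }
    starts
}

/// Hypothetical helper: 0-based index of the line containing `offset`.
/// Assumes `line_starts` is sorted and begins with 0.
fn line_index_for(line_starts: &[u32], offset: u32) -> usize {
    // partition_point performs the binary search: it returns how many
    // line starts are <= offset, so the containing line is one less.
    line_starts
        .partition_point(|&start| start <= offset)
        .saturating_sub(1)
}
```

For the input "type Query {\r\n  a: Int\r  b: Int\n}", this sketch yields the line starts 0, 14, 23, and 32, and byte offset 25 maps to line index 2.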
This mode is used by
StrGraphQLTokenSource,
which has direct access to the source string. It is memory-efficient
(only one u32 per line) and avoids as much per-token bookkeeping as
possible during lexing — the lexer only tracks a single curr_byte_offset
and defers all line/column computation.
§Pre-Computed Columns Mode (SourceMap::new_precomputed)
Some token sources do not have access to the underlying source text at
resolution time. For example,
libgraphql_macros::RustMacroGraphQLTokenSource
produces tokens from a proc_macro2::TokenStream. Each proc_macro2::Span
carries line/column information at the time the token is produced, but there
is no contiguous source &str to scan after the fact. In this mode, the
token source collects (byte_offset, SourcePosition) entries during lexing
and then constructs the SourceMap at the end by passing the entries to
new_precomputed(). Later lookups binary-search that table of entries.
This mode uses more memory (one full SourcePosition per inserted
offset, rather than one u32 per line), but lookups are O(log n) with
no char-counting walk — just a binary search and a direct return.
When a SourceMap is constructed in pre-computed mode, the 'src lifetime
is typically 'static since no source text is borrowed.
§UTF-16 Column Recovery
In source-text mode, UTF-16 columns are computed on demand by iterating
chars from the line start to the target byte offset and summing
char::len_utf16(). In pre-computed columns mode, UTF-16 columns are
whatever the token source provided (or None if the token source cannot
compute them).
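As an illustration of the source-text-mode walk, the sketch below counts columns from a known line start. `columns_at` is a hypothetical helper, not part of the crate, and it assumes the "UTF-8 column" is a count of Unicode scalar values (chars); the crate may define that column differently.

```rust
/// Hypothetical helper: columns of `offset` on the line beginning at
/// `line_start`. Returns (col_utf8, col_utf16), or None if the range is
/// out of bounds or either end falls inside a multi-byte character.
fn columns_at(source: &str, line_start: usize, offset: usize) -> Option<(u32, u32)> {
    // `str::get` returns None instead of panicking on an invalid range.
    let prefix = source.get(line_start..offset)?;
    let mut col_utf8 = 0u32;
    let mut col_utf16 = 0u32;
    for ch in prefix.chars() {
        col_utf8 += 1; // assumption: one column per Unicode scalar value
        col_utf16 += ch.len_utf16() as u32; // 1 or 2 UTF-16 code units
    }
    Some((col_utf8, col_utf16))
}
```

With the source "# \u{1F600} ok\nx", `columns_at(source, 0, 7)` gives `Some((4, 5))`: the emoji is one char but two UTF-16 code units. An offset of 3 lands inside the four-byte emoji and gives `None`.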
§Implementations
impl<'src> SourceMap<'src>
pub fn new_with_source(source: &'src str, file_path: Option<PathBuf>) -> Self
Builds a SourceMap in source-text mode by scanning source for
line terminators.
This is an O(n) pre-pass that identifies all line start byte offsets.
Line terminators recognized: \n, \r, \r\n (the pair counts as
one terminator).
pub fn new_precomputed(entries: Vec<(u32, SourcePosition)>, file_path: Option<PathBuf>) -> Self
Creates a SourceMap in pre-computed columns mode.
The entries parameter contains (byte_offset, SourcePosition)
pairs that were collected during lexing. Entries must be sorted
by byte offset in monotonically increasing order (which is
naturally the case when collected during a left-to-right lex
pass).
This mode is intended for token sources that know line/column information at lex time but do not have access to the underlying source text afterward.
See the type-level documentation for a detailed comparison of the two modes.
§Example
// During lexing, collect (byte_offset, SourcePosition) entries into a Vec.
// `byte_offset` (a u32) and `position` (a SourcePosition) come from the token source.
let mut entries = Vec::new();
entries.push((byte_offset, position));
// After lexing, build the SourceMap:
let source_map = SourceMap::new_precomputed(entries, None);
pub fn empty() -> Self
Creates an empty SourceMap that cannot resolve any offsets.
Useful for token sources that don’t have source text (e.g. proc-macro token sources or test mocks).
pub fn source(&self) -> Option<&'src str>
Returns the source text, if this is a source-text-mode SourceMap.
pub fn resolve_offset(&self, byte_offset: u32) -> Option<SourcePosition>
Resolves a byte offset to a full SourcePosition (line, col_utf8,
col_utf16, byte_offset).
Returns None if the offset cannot be resolved — for example, if
the byte offset is out of bounds (source-text mode) or if no
pre-computed entries cover the requested offset.
§Source-text mode
Uses binary search on line_starts to find the line, then counts
chars from the line start to compute columns.
§Pre-computed columns mode
Binary-searches the pre-computed entries for the largest byte offset
<= the requested offset (floor lookup). If the requested offset
falls between two entries, the earlier entry’s position is returned
(this handles lookups for byte offsets mid-token, returning the
position of the nearest preceding entry).
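The floor lookup can be sketched with the standard library's `partition_point`. `Pos` is a hypothetical stand-in for SourcePosition and `floor_lookup` is an illustrative free function; neither is the crate's actual code.

```rust
/// Hypothetical stand-in for the crate's SourcePosition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pos {
    line: u32,
    col: u32,
}

/// Returns the position stored for the largest byte offset <= `offset`.
/// `entries` must be sorted by byte offset in increasing order.
fn floor_lookup(entries: &[(u32, Pos)], offset: u32) -> Option<Pos> {
    // Binary search: number of entries whose byte offset is <= `offset`.
    let count = entries.partition_point(|&(start, _)| start <= offset);
    // count == 0 means the table is empty or `offset` precedes the
    // first entry, so nothing covers it.
    count.checked_sub(1).map(|i| entries[i].1)
}
```

With entries at byte offsets 4, 10, and 20, a lookup for offset 7 returns the entry stored at 4, while a lookup for offset 3 returns `None`.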
pub fn resolve_span(&self, span: ByteSpan) -> Option<SourceSpan>
Resolves a ByteSpan to a full SourceSpan with
line/column information and file path.
Returns None if either endpoint of the span cannot be
resolved. Common scenarios where resolution fails:
- Empty SourceMap (SourceMap::empty()): no entries exist, so no offset can be resolved.
- Out-of-bounds offset: the byte offset exceeds the source text length (source-text mode) or falls before the first pre-computed entry (pre-computed columns mode).
- Mid-UTF-8 offset: the byte offset lands in the middle of a multi-byte UTF-8 character (source-text mode only).
For error reporting, the parser’s internal resolve_span()
wrapper falls back to SourceSpan::zero() when this method
returns None. For AST tooling that needs accurate
positions, callers should handle the None case explicitly.
pub fn get_line(&self, line_index: usize) -> Option<&'src str>
Returns the content of the line at the given 0-based line index,
stripped of any trailing line terminator (\n, \r, \r\n).
Returns None if this is not a source-text-mode SourceMap, or if
line_index is out of bounds.
This uses the line_starts table built by
compute_line_starts(), which correctly recognizes bare
\r as a line terminator per the GraphQL spec. Code that needs to
extract line content should use this method rather than
str::lines(), which does not handle bare \r.
Note: graphql_parse_error::get_line() provides similar
functionality via a linear scan (no pre-computed table).
Both must use the same line-terminator semantics.
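To make the difference from str::lines() concrete, the sketch below splits text using the three terminators named above. `split_graphql_lines` is a hypothetical helper that does a linear scan for illustration; it is neither the table-based get_line() nor graphql_parse_error::get_line().

```rust
/// Hypothetical helper: splits `source` on \n, \r, and \r\n and
/// returns each line with its terminator stripped.
fn split_graphql_lines(source: &str) -> Vec<&str> {
    let bytes = source.as_bytes();
    let mut lines = Vec::new();
    let mut line_start = 0;
    let mut i = 0;
    while i < bytes.len() {
        let terminator_len = match bytes[i] {
            b'\r' if bytes.get(i + 1) == Some(&b'\n') => 2,
            b'\r' | b'\n' => 1,
            _ => 0,
        };
        if terminator_len == 0 {
            i += 1;
            continue;
        }
        // Terminators are ASCII, so `i` is always a char boundary here.
        lines.push(&source[line_start..i]);
        i += terminator_len;
        line_start = i;
    }
    // The final line has no terminator (and may be empty).
    lines.push(&source[line_start..]);
    lines
}
```

For "a\rb\r\nc\nd", this sketch returns the four lines a, b, c, d, whereas `str::lines()` yields only three items because it leaves the bare \r inside the first one ("a\rb").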
pub fn line_count(&self) -> usize
Returns the number of lines in the source text.
In source-text mode, this is the number of line-start entries computed during construction. In pre-computed columns mode, this is derived from the highest line number seen in the entries (plus one). Returns 0 if no entries have been inserted.