//! Types and traits for efficient string storage and deduplication.
//!
//! Because `cstree` is aimed at _concrete_ syntax trees that faithfully represent all of the original program input,
//! `cstree` asks for the text of each token when building a syntax tree. You'll notice this when looking at
//! [`GreenNodeBuilder::token`], which takes the kind of token and a reference to the text of the token in the source.
//!
//! Of course, there are tokens whose text will always be the same, such as punctuation (like a semicolon), keywords
//! (like `fn`), or operators (like `<=`). Use [`Syntax::static_text`] when implementing `Syntax` to make `cstree`
//! aware of such tokens.
//!
//! There is, however, another category of tokens whose text will appear repeatedly, but for which we cannot know the
//! text upfront. Any variable, type, or method that is user-defined will likely be named more than once, but there is
//! no way to know beforehand what names a user will choose.
//!
//! In order to avoid storing the source text for these tokens many times over, `cstree` _interns_ the text of its
//! tokens (if that text is not static). What this means is that each unique string is only stored once. When a new
//! token is added (say, a variable), we check if we already know its contents (the variable name). If the text is
//! new, we save it and give it a unique Id. If we have seen the text before, we look up its unique Id and don't need to
//! keep the new data around. As an additional benefit, interning also makes it much cheaper to copy source text around
//! and also to compare it with other source text, since what is actually being copied or compared is just an integer.
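//!
//! The bookkeeping described above can be sketched with nothing but the standard library. This is a minimal
//! illustration, not `cstree`'s actual implementation: `ToyInterner` and its methods are hypothetical names chosen
//! for this example, and the ids are plain `u32` values rather than `cstree`'s key type.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Hypothetical toy interner: hands out one small integer id per distinct string.
//! #[derive(Default)]
//! pub struct ToyInterner {
//!     // Maps text we have already seen to its id.
//!     ids: HashMap<String, u32>,
//!     // Maps an id (used as an index) back to its text.
//!     // For brevity, this toy keeps a second copy of each string as the map key.
//!     strings: Vec<String>,
//! }
//!
//! impl ToyInterner {
//!     // Returns the id for `text`, assigning a fresh id only if the text is new.
//!     pub fn get_or_intern(&mut self, text: &str) -> u32 {
//!         if let Some(&id) = self.ids.get(text) {
//!             return id;
//!         }
//!         let id = self.strings.len() as u32;
//!         self.strings.push(text.to_owned());
//!         self.ids.insert(text.to_owned(), id);
//!         id
//!     }
//!
//!     // Looks up the text that belongs to `id`, if there is one.
//!     pub fn resolve(&self, id: u32) -> Option<&str> {
//!         self.strings.get(id as usize).map(String::as_str)
//!     }
//! }
//! ```
//!
//! Interning the same name twice yields the same id, so comparing two names only compares two integers.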
//!
//! ## I just want to build a syntax tree
//!
//! If you don't want to worry about this for now, you (mostly) can! All required functionality is implemented in
//! `cstree` and you can just use [`GreenNodeBuilder::new`] to obtain a tree builder with everything set up (see the
//! [crate documentation] for more on how to get started). This will create an interner, which the builder returns
//! together with the syntax tree on [`finish`] as part of its node cache (call [`NodeCache::into_interner`] on the
//! result to get the interner out).
//!
//! Here begins the part where you do have to think about interning: `cstree` needs the interner you get when you want
//! to look at the source text for some part of the syntax tree, so you'll have to keep it around somehow until the
//! point where you need it.
//!
//! How best to do this depends on what you need the text for. If the code that accesses the text is close by, it might
//! be enough to pass the interner to the functions that need it (within `cstree` or in your code). Another option is
//! to store the interner together with the syntax tree. If you use [`SyntaxNode::new_root_with_resolver`], you
//! get a syntax tree that can handle text without any need to manage and pass an interner. The method is called
//! `_with_resolver` and not `_with_interner` because it doesn't actually need a full [`Interner`]: once the tree is
//! created, no more text will be added, so it just needs to be able to look up text. This part is called a
//! [`Resolver`]. Or you could put the interner somewhere "global", where you can easily access it from anywhere.
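//!
//! To illustrate why a finished tree only needs the look-up half, here is a rough sketch using the standard library
//! only. All names in it (`SketchResolver`, `SketchInterner`, `SketchTable`, `SketchTree`) are hypothetical and exist
//! just for this example; the real traits in this module have more methods and use a dedicated key type instead of
//! `u32`.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! // Read-only half: enough to turn keys back into text.
//! pub trait SketchResolver {
//!     fn try_resolve(&self, key: u32) -> Option<&str>;
//! }
//!
//! // Full interner: a resolver that can additionally store new text.
//! pub trait SketchInterner: SketchResolver {
//!     fn get_or_intern(&mut self, text: &str) -> u32;
//! }
//!
//! // Hypothetical string table implementing both halves.
//! #[derive(Default)]
//! pub struct SketchTable {
//!     ids: HashMap<String, u32>,
//!     strings: Vec<String>,
//! }
//!
//! impl SketchResolver for SketchTable {
//!     fn try_resolve(&self, key: u32) -> Option<&str> {
//!         self.strings.get(key as usize).map(String::as_str)
//!     }
//! }
//!
//! impl SketchInterner for SketchTable {
//!     fn get_or_intern(&mut self, text: &str) -> u32 {
//!         if let Some(&id) = self.ids.get(text) {
//!             return id;
//!         }
//!         let id = self.strings.len() as u32;
//!         self.strings.push(text.to_owned());
//!         self.ids.insert(text.to_owned(), id);
//!         id
//!     }
//! }
//!
//! // Stand-in for a finished tree: it owns a resolver and never adds text,
//! // so its bound is `SketchResolver`, not `SketchInterner`.
//! pub struct SketchTree<R: SketchResolver> {
//!     resolver: R,
//!     token_keys: Vec<u32>,
//! }
//!
//! impl<R: SketchResolver> SketchTree<R> {
//!     pub fn new(resolver: R, token_keys: Vec<u32>) -> Self {
//!         Self { resolver, token_keys }
//!     }
//!
//!     // Concatenates the text of all tokens; keys unknown to the resolver are skipped.
//!     pub fn text(&self) -> String {
//!         self.token_keys
//!             .iter()
//!             .filter_map(|&key| self.resolver.try_resolve(key))
//!             .collect()
//!     }
//! }
//! ```
//!
//! Because the tree in this sketch owns its resolver, code that reads text from it does not need to receive an
//! interner separately.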
//!
//! ## Using other interners
//!
//! By default, `cstree` uses its own, simple interner implementation. You can obtain an interner by calling
//! [`new_interner`], or bring your own by implementing the [`Resolver`] and [`Interner`] traits defined in this module.
//! Most methods in `cstree` require that you support interning [`TokenKey`]s. `TokenKey` implements [`InternKey`], so
//! your implementation can use that to convert to whatever types it uses for its internal representation. Note that
//! there is no way to change the size of the internal representation.
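//!
//! As a rough idea of what such a conversion can look like, the sketch below wraps a `NonZeroU32` and shifts the index
//! by one in each direction so that the value always round-trips. `SketchKey` and its method names are assumptions
//! made for this example and are not the definitions used by `cstree`.
//!
//! ```rust
//! use std::num::NonZeroU32;
//!
//! // Hypothetical fixed-size key: a 32-bit integer that is never zero.
//! #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
//! pub struct SketchKey {
//!     raw: NonZeroU32,
//! }
//!
//! impl SketchKey {
//!     // Stores `index + 1` so that the stored value is never zero.
//!     // Returns `None` for `u32::MAX`, which has no `+ 1` within `u32`.
//!     pub fn try_from_u32(index: u32) -> Option<Self> {
//!         index
//!             .checked_add(1)
//!             .and_then(NonZeroU32::new)
//!             .map(|raw| Self { raw })
//!     }
//!
//!     // Undoes the `+ 1` applied on construction.
//!     pub fn into_u32(self) -> u32 {
//!         self.raw.get() - 1
//!     }
//! }
//! ```
//!
//! An interner with its own key type could convert through `u32` values like these at its boundary.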
//!
//! ### `lasso`
//! Using features, you can enable support for some third-party interners. The primary one is [`lasso`], a crate focused
//! on efficient interning of text strings. This is enabled via the `lasso_compat` feature and adds the necessary trait
//! implementation to make `lasso`'s interners work with `cstree` (as well as a re-export of the matching version of
//! `lasso` here). If enabled, `cstree`'s built-in interning functionality is replaced with `lasso`'s more efficient one
//! transparently, so you'll now be returned a `lasso` interner from [`new_interner`].
//
// ### `salsa`
// If you are using the "2022" version of the `salsa` incremental query framework, it is possible to use its interning
// capabilities with `cstree` as well. Support for this is experimental, and you have to opt in via the
// `salsa_2022_compat` feature. For instructions on how to do this, and whether you actually want to, please refer to
// [the `salsa_compat` module documentation].
//!
//! [crate documentation]: crate
//! [`Syntax::static_text`]: crate::Syntax::static_text
//! [`GreenNodeBuilder::token`]: crate::build::GreenNodeBuilder::token
//! [`GreenNodeBuilder::new`]: crate::build::GreenNodeBuilder::new
//! [`finish`]: crate::build::GreenNodeBuilder::finish
//! [`NodeCache::into_interner`]: crate::build::NodeCache::into_interner
//! [`SyntaxNode::new_root_with_resolver`]: crate::syntax::SyntaxNode::new_root_with_resolver
//! [`lasso`]: lasso
// [the `salsa_compat` module documentation]: salsa_compat
pub use *;
pub use TokenInterner;
pub use TokenInterner;
pub use MultiThreadedTokenInterner;
pub use lasso;
use fmt;
use NonZeroU32;
/// The intern key type for the source text of [`GreenToken`s](crate::green::GreenToken).
/// Each unique key uniquely identifies a deduplicated, interned source string.
// Safety: we match `+ 1` and `- 1`, so it is always possible to round-trip.
unsafe
/// Constructs a new, single-threaded [`Interner`].
///
/// If you need the interner to be multi-threaded, see [`new_threaded_interner`].
/// Constructs a new [`Interner`] that can be used across multiple threads.
///
/// Note that you can use `&MultiThreadedTokenInterner` and `Arc<MultiThreadedTokenInterner>` to access interning methods
/// through a shared reference, as well as construct new syntax trees. See [the module documentation](self) for more
/// information and examples.