pub trait TokenType:
    Copy
    + Eq
    + Hash
    + Send
    + Sync
    + 'static
    + Debug {
    type Role: TokenRole;
    const END_OF_STREAM: Self;

    // Required method
    fn role(&self) -> Self::Role;

    // Provided methods
    fn is_role(&self, role: Self::Role) -> bool { ... }
    fn is_universal(&self, role: UniversalTokenRole) -> bool { ... }
    fn is_comment(&self) -> bool { ... }
    fn is_whitespace(&self) -> bool { ... }
    fn is_error(&self) -> bool { ... }
    fn is_ignored(&self) -> bool { ... }
    fn is_end_of_stream(&self) -> bool { ... }
}
Token type definitions for the parsing system.
The TokenType trait is the foundation for defining a language's token kinds.
It categorizes tokens and provides methods for identifying their roles
in the language grammar.
§Universal Grammar Philosophy
The role mechanism in Oak is inspired by the concept of “Universal Grammar”. While every language has its own unique “Surface Structure” (its specific token kinds), most share a common “Deep Structure” (syntactic roles).
By mapping language-specific kinds to UniversalTokenRole, we enable generic tools
like highlighters and formatters to work across 100+ languages without deep
knowledge of each one’s specific grammar.
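As a rough illustration of this idea, the sketch below runs one generic function over two unrelated token enums. Everything in it is a local stand-in written so the snippet compiles on its own: the trimmed `TokenType`, `TokenRole`, and `UniversalTokenRole` definitions only mirror the signatures shown on this page, the `universal()` method on `TokenRole` is an assumed name, and `CalcToken`, `ShellToken`, and `highlight_class` are hypothetical.

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Stand-in for Oak's UniversalTokenRole, limited to variants named on this page.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum UniversalTokenRole {
    Name,
    Literal,
    Operator,
    None,
}

// Stand-in for TokenRole. The `universal` method name is an assumption.
pub trait TokenRole: Copy + Eq + Debug {
    fn universal(&self) -> UniversalTokenRole;
}

impl TokenRole for UniversalTokenRole {
    fn universal(&self) -> UniversalTokenRole {
        *self
    }
}

// Trimmed stand-in for TokenType: same bounds and required items as above.
pub trait TokenType: Copy + Eq + Hash + Send + Sync + 'static + Debug {
    type Role: TokenRole;
    const END_OF_STREAM: Self;
    fn role(&self) -> Self::Role;
}

// Two hypothetical "surface structures" with different token kinds.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum CalcToken { Ident, Int, Plus, Eof }

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum ShellToken { Word, Pipe, Eof }

impl TokenType for CalcToken {
    type Role = UniversalTokenRole;
    const END_OF_STREAM: Self = CalcToken::Eof;
    fn role(&self) -> Self::Role {
        match self {
            CalcToken::Ident => UniversalTokenRole::Name,
            CalcToken::Int => UniversalTokenRole::Literal,
            CalcToken::Plus => UniversalTokenRole::Operator,
            CalcToken::Eof => UniversalTokenRole::None,
        }
    }
}

impl TokenType for ShellToken {
    type Role = UniversalTokenRole;
    const END_OF_STREAM: Self = ShellToken::Eof;
    fn role(&self) -> Self::Role {
        match self {
            ShellToken::Word => UniversalTokenRole::Name,
            ShellToken::Pipe => UniversalTokenRole::Operator,
            ShellToken::Eof => UniversalTokenRole::None,
        }
    }
}

// A generic tool: it only looks at the shared "deep structure" (the role),
// so it never needs to know which language the token came from.
pub fn highlight_class<T: TokenType>(token: T) -> &'static str {
    match token.role().universal() {
        UniversalTokenRole::Name => "name",
        UniversalTokenRole::Literal => "literal",
        UniversalTokenRole::Operator => "operator",
        UniversalTokenRole::None => "plain",
    }
}
```

With these stand-ins, `highlight_class(CalcToken::Plus)` and `highlight_class(ShellToken::Pipe)` both yield `"operator"`, even though the two enums share no variants.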
§Implementation Guidelines
When implementing this trait for a specific language:
- Use an enum with discriminant values for efficient matching
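- Derive Copy and Eq on the enum (along with Clone, PartialEq, Hash, and Debug, which the trait bounds also require) so tokens are cheap to pass around and compare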
- Include an END_OF_STREAM variant to signal input termination
- Define a Role associated type and implement the role() method to provide syntactic context.
§Examples
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
enum SimpleToken {
    Identifier,
    Number,
    Plus,
    EndOfStream,
}

impl TokenType for SimpleToken {
    const END_OF_STREAM: Self = SimpleToken::EndOfStream;
    type Role = UniversalTokenRole; // Or a custom Role type

    fn role(&self) -> Self::Role {
        match self {
            SimpleToken::Identifier => UniversalTokenRole::Name,
            SimpleToken::Number => UniversalTokenRole::Literal,
            SimpleToken::Plus => UniversalTokenRole::Operator,
            _ => UniversalTokenRole::None,
        }
    }
    // ... other methods
}
Required Associated Constants§
const END_OF_STREAM: Self
A constant representing the end of the input stream.
This special token type is used to signal that there are no more tokens to process in the input. It’s essential for parsers to recognize when they’ve reached the end of the source code.
§Implementation Notes
This should be a specific variant of your token enum that represents the end-of-stream condition. It’s used throughout the parsing framework to handle boundary conditions and termination logic.
Required Associated Types§
type Role: TokenRole
Required Methods§
fn role(&self) -> Self::Role
Provided Methods§
fn is_role(&self, role: Self::Role) -> bool
Returns true if this token matches the specified language-specific role.
fn is_universal(&self, role: UniversalTokenRole) -> bool
Returns true if this token matches the specified universal role.
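The difference between is_role and is_universal shows up once a language defines its own Role type. The sketch below is only an illustration under stated assumptions: the trait definitions are local stand-ins that mirror the signatures on this page, the default bodies of `is_role` and `is_universal` are guesses at plausible behavior (the real ones are not shown here), `universal()` is an assumed method name on `TokenRole`, and `IniRole` and `IniToken` are hypothetical.

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Stand-in for Oak's UniversalTokenRole, limited to variants named on this page.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum UniversalTokenRole { Name, Operator, None }

// Stand-in for TokenRole. The `universal` method name is an assumption.
pub trait TokenRole: Copy + Eq + Debug {
    fn universal(&self) -> UniversalTokenRole;
}

// Trimmed stand-in for TokenType. The two default bodies are assumed.
pub trait TokenType: Copy + Eq + Hash + Send + Sync + 'static + Debug {
    type Role: TokenRole;
    const END_OF_STREAM: Self;
    fn role(&self) -> Self::Role;

    // Compares against the language-specific role.
    fn is_role(&self, role: Self::Role) -> bool {
        self.role() == role
    }
    // Compares against the role after mapping it to the universal set.
    fn is_universal(&self, role: UniversalTokenRole) -> bool {
        self.role().universal() == role
    }
}

// A hypothetical language-specific role set, finer-grained than the universal one.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum IniRole { SectionName, Key, Assign, None }

impl TokenRole for IniRole {
    fn universal(&self) -> UniversalTokenRole {
        match self {
            // Two distinct language roles collapse into one universal role.
            IniRole::SectionName | IniRole::Key => UniversalTokenRole::Name,
            IniRole::Assign => UniversalTokenRole::Operator,
            IniRole::None => UniversalTokenRole::None,
        }
    }
}

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum IniToken { Section, Key, Equals, Eof }

impl TokenType for IniToken {
    type Role = IniRole;
    const END_OF_STREAM: Self = IniToken::Eof;
    fn role(&self) -> Self::Role {
        match self {
            IniToken::Section => IniRole::SectionName,
            IniToken::Key => IniRole::Key,
            IniToken::Equals => IniRole::Assign,
            IniToken::Eof => IniRole::None,
        }
    }
}
```

Under these assumptions, `IniToken::Section` and `IniToken::Key` are distinguishable through `is_role`, while `is_universal(UniversalTokenRole::Name)` holds for both, which is the level a language-agnostic tool would typically query.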
fn is_comment(&self) -> bool
Returns true if this token represents a comment.
§Default Implementation
Based on UniversalTokenRole::Comment.
fn is_whitespace(&self) -> bool
Returns true if this token represents whitespace.
§Default Implementation
Based on UniversalTokenRole::Whitespace.
fn is_error(&self) -> bool
Returns true if this token represents an error condition.
§Default Implementation
Based on UniversalTokenRole::Error.
fn is_ignored(&self) -> bool
Returns true if this token represents trivia (whitespace, comments, etc.).
Trivia tokens are typically ignored during parsing but preserved for formatting and tooling purposes. They don’t contribute to the syntactic structure of the language but are important for maintaining the original source code formatting.
§Default Implementation
The default implementation considers a token as trivia if it is either whitespace or a comment. Language implementations can override this method if they have additional trivia categories.
§Examples
// Skip over trivia tokens during parsing
while current_token.is_ignored() {
    advance_to_next_token();
}
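Overriding is_ignored for an additional trivia category might look like the following sketch. It is self-contained rather than tied to Oak: the trimmed `TokenType` here is a local stand-in whose `role()` returns `UniversalTokenRole` directly (a simplification of the associated Role type), the default bodies follow the behavior described above but are written by hand, and `ShToken`, its `LineContinuation` variant, and `significant` are hypothetical.

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Stand-in for Oak's UniversalTokenRole, limited to variants named on this page.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum UniversalTokenRole { Name, Comment, Whitespace, None }

// Simplified stand-in for TokenType (no associated Role type).
pub trait TokenType: Copy + Eq + Hash + Send + Sync + 'static + Debug {
    const END_OF_STREAM: Self;
    fn role(&self) -> UniversalTokenRole;

    fn is_comment(&self) -> bool {
        self.role() == UniversalTokenRole::Comment
    }
    fn is_whitespace(&self) -> bool {
        self.role() == UniversalTokenRole::Whitespace
    }
    // Documented default: trivia is whitespace or a comment.
    fn is_ignored(&self) -> bool {
        self.is_whitespace() || self.is_comment()
    }
}

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum ShToken { Word, Space, Comment, LineContinuation, Eof }

impl TokenType for ShToken {
    const END_OF_STREAM: Self = ShToken::Eof;

    fn role(&self) -> UniversalTokenRole {
        match self {
            ShToken::Word => UniversalTokenRole::Name,
            ShToken::Space => UniversalTokenRole::Whitespace,
            ShToken::Comment => UniversalTokenRole::Comment,
            ShToken::LineContinuation | ShToken::Eof => UniversalTokenRole::None,
        }
    }

    // Override: this language treats line continuations as trivia too.
    fn is_ignored(&self) -> bool {
        self.is_whitespace() || self.is_comment() || *self == ShToken::LineContinuation
    }
}

// Keeps only the tokens a parser would look at.
pub fn significant<T: TokenType>(tokens: &[T]) -> Vec<T> {
    tokens.iter().copied().filter(|t| !t.is_ignored()).collect()
}
```

Generic code such as `significant` keeps calling `is_ignored` and picks up the extra category without any change on its side.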
fn is_end_of_stream(&self) -> bool
Returns true if this token represents the end of the input stream.
This method provides a convenient way to check if a token is the special END_OF_STREAM token without directly comparing with the constant.
§Examples
// Loop until we reach the end of the input
while !current_token.is_end_of_stream() {
    process_token(current_token);
    current_token = next_token();
}
Dyn Compatibility§
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.
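In practice this means code consumes the trait through generics rather than through `dyn TokenType`. Two features visible in the declaration are each enough to rule out trait objects: the Copy supertrait requires `Self: Sized`, and the trait has an associated constant. The sketch below uses a trimmed local stand-in for the trait (no Role type, and an assumed default body for `is_end_of_stream`); `Tok` and `count_before_end` are hypothetical names.

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Trimmed stand-in for TokenType: same supertrait bounds, Role omitted.
pub trait TokenType: Copy + Eq + Hash + Send + Sync + 'static + Debug {
    const END_OF_STREAM: Self;

    // Assumed default: compare against the END_OF_STREAM constant.
    fn is_end_of_stream(&self) -> bool {
        *self == Self::END_OF_STREAM
    }
}

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum Tok { Word, Eof }

impl TokenType for Tok {
    const END_OF_STREAM: Self = Tok::Eof;
}

// This would be rejected by the compiler, because the trait is not
// dyn compatible:
//
//     pub fn count(tokens: &[Box<dyn TokenType>]) -> usize { ... }
//
// A generic parameter works instead and is resolved at compile time.
pub fn count_before_end<T: TokenType>(tokens: &[T]) -> usize {
    tokens.iter().take_while(|t| !t.is_end_of_stream()).count()
}
```

The generic form also keeps tokens as plain Copy values with no boxing, which fits the trait's other bounds.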