tagpath
Parse, lint, and search tag-based identifiers across languages and naming conventions.
Tag Path treats every identifier as a path through an ordered sequence of tags. auth0__user__validate, personName, PersonName, and person_name all decompose into canonical tag lists — enabling semantic equivalence across conventions.
Install
Quick Start
# Parse an identifier into tags
# name: person_name
# convention: snake_case
# tags: [person, name]
# canonical: person_name
# Auto-detects convention
# tags: [person, name]
# tags: [person, name]
# Namespace dimensions (__ separator)
# tags: [auth0, user, validate]
# dimension 0: [auth0]
# dimension 1: [user]
# dimension 2: [validate]
# role: validator
# JSON output
# Extract identifiers from source files
# Cross-language semantic search
# Lint against .naming.toml rules
# Initialize a .naming.toml
Features
- Convention detection — auto-detects snake_case, camelCase, PascalCase, kebab-case, UPPER_SNAKE_CASE, Ada_Case
- Semantic equivalence —
person_name=personName=PersonName→[person, name] - Role detection —
create_*(factory),use_*(hook),set_*(setter),is_*(predicate), etc. - Shape detection —
*_a(array),*_r(record),*_m(map),*$(signal) - Namespace dimensions —
__separates semantic dimensions - Mixed convention support — handles
createContext_auth(camelCase + underscore extension) - Language presets — 39 languages with per-context conventions
- Configurable —
.naming.tomlfor project-specific conventions - Composable configs —
extendsinherits from language presets with per-context overrides - Identifier extraction — extract identifiers from source files with regex or tree-sitter AST
- Semantic search — find identifiers across naming conventions by canonical tag matching
- Lint — validate naming conventions against
.naming.tomlrules - Tree-sitter integration — AST-aware extraction for 14 languages with context classification
Language Presets
| Language | Default Convention | Key Contexts |
|---|---|---|
| C | snake_case | _t suffix types, UPPER_SNAKE macros |
| C++ | snake_case | STL-style, PascalCase classes |
| C# | PascalCase | I prefix interfaces, camelCase locals |
| Clojure | kebab-case | *earmuffs*, :keywords, ? predicates, / namespaces |
| Common Lisp | kebab-case | *earmuffs*, +constants+, defun/defvar |
| Crystal | snake_case | PascalCase types, ?/! suffixes, Ruby-inspired |
| CSS | kebab-case | -- custom properties, BEM patterns |
| D | camelCase | PascalCase types, camelCase constants, opCall |
| Dart | camelCase | _ prefix private, camelCase constants, factory keyword |
| Elixir | snake_case | PascalCase modules, ?/! suffixes |
| Erlang | snake_case | PascalCase modules, atoms, is_ guards |
| F# | camelCase | PascalCase types/modules, ` |
| Gleam | snake_case | PascalCase types/constructors, labeled arguments |
| Go | camelCase | PascalCase exported, New{Name} factory |
| Haskell | camelCase | PascalCase types/modules, mk/un prefixes |
| Java | camelCase | PascalCase classes, get/set/is prefixes |
| JavaScript | camelCase | PascalCase classes, kebab-case files |
| Julia | snake_case | PascalCase types, ! mutating, Unicode identifiers |
| Kotlin | camelCase | PascalCase classes/objects |
| Lua | snake_case | PascalCase classes, __ metamethods |
| Nim | camelCase | PascalCase types, style-insensitive, new{Name} factory |
| Objective-C | camelCase | PascalCase classes, 2-3 letter prefixes (NS, UI) |
| OCaml | snake_case | PascalCase modules, ' type variables |
| Odin | snake_case | Ada_Case types, rich allocation patterns |
| Perl | snake_case | $/@/% sigils, :: packages, PascalCase classes |
| PHP | camelCase | $ prefix vars, PascalCase classes |
| Python | snake_case | PascalCase classes, __dunder__ |
| R | snake_case | dot.case legacy (is.na), <- replacement functions |
| Racket | kebab-case | define, ? predicates, ! mutation |
| Ruby | snake_case | PascalCase classes, ?/!/= suffixes |
| Rust | snake_case | PascalCase types/traits, ' lifetime prefix |
| Scala | camelCase | PascalCase constants, apply/unapply factories |
| Scheme | kebab-case | define, ? predicates, ! mutation, set! |
| Shell | snake_case | UPPER_SNAKE env vars |
| SQL | snake_case | UPPER_SNAKE keywords |
| Swift | camelCase | PascalCase types/protocols |
| TypeScript | camelCase | PascalCase types/interfaces |
| V | snake_case | PascalCase types, Go-inspired, C. interop prefix |
| Zig | camelCase | camelCase functions, snake_case variables, PascalCase types |
Configuration
Create a .naming.toml in your project root:
= 1
= "my-project"
= "snake_case"
= true
= true
[]
= "_"
= "__"
[]
= "create_{name}"
= "use_{name}"
= "set_{name}"
[]
= true
Composable Configs (extends)
Configs can extend language presets and override specific contexts:
# Extend a language preset
= 1
= "my-project"
= ["rust"]
[]
= "camelCase" # override function convention
The extends field accepts an array of preset names. Fields from the extending config override inherited values. Context-level overrides merge with the parent — only the specified fields are replaced.
Tree-sitter Integration
Tagpath uses tree-sitter for AST-aware identifier extraction in 14 languages:
| Language | Grammar Crate | Feature Flag |
|---|---|---|
| Rust | tree-sitter-rust | lang-rust |
| Python | tree-sitter-python | lang-python |
| JavaScript | tree-sitter-javascript | lang-javascript |
| TypeScript | tree-sitter-typescript | lang-typescript |
| TSX | tree-sitter-typescript | lang-typescript |
| Go | tree-sitter-go | lang-go |
| C | tree-sitter-c | lang-c |
| C++ | tree-sitter-cpp | lang-cpp |
| Java | tree-sitter-java | lang-java |
| Ruby | tree-sitter-ruby | lang-ruby |
| PHP | tree-sitter-php | lang-php |
| C# | tree-sitter-c-sharp | lang-csharp |
| Swift | tree-sitter-swift | lang-swift |
| Kotlin | tree-sitter-kotlin-ng | lang-kotlin |
All 14 languages are enabled by default. Disable grammars you don't need to reduce binary size:
# Install with only Rust and Python grammars
# Install without any tree-sitter (regex-only extraction)
All other supported languages (25 of 39 presets) use regex-based identifier extraction.
- Use
--astflag withtagpath extractto enable tree-sitter mode - AST extraction classifies identifiers by context (function, type, variable, etc.)
Roadmap
- Phase 1 ✅ — Parse, detect conventions, semantic equivalence
- Phase 2 ✅ — tree-sitter integration, lint command, extract identifiers, semantic search, composable configs
- Phase 3 (next) — Graph building, alias generation, prose conversion
License
MIT