Gossiphs = Gossip Graphs
An experimental Rust library for general code file relationship analysis. Based on tree-sitter and git analysis.
Goal & Motivation
Code navigation is a fascinating subject that plays a pivotal role in various domains, such as:
- Guiding the context during the development process within an IDE.
- Facilitating more convenient code browsing on websites.
- Analyzing the impact of code changes in Continuous Integration (CI) systems.
- ...
In the past, I tried applying LSP/LSIF technologies and techniques such as GitHub's Stack-Graphs to impact analysis, and ran into different challenges with each. For our needs, an approach similar to Stack-Graphs comes closest to what we want. Its main drawback is that it requires writing highly language-specific rules, which is a considerable investment for us given that we do not need such high-precision data.
We make some trade-offs against the challenges stack-graphs currently faces in order to reach the following goals:
- Zero repo-specific configuration: It can be applied to most languages and repositories without additional configuration.
- Low extension cost: Adding rules for a new language is cheap.
- Acceptable precision: We have sacrificed a certain level of precision, but we also hope that it remains at an acceptable level.
How it works
Gossiphs constructs a graph that connects symbol definitions to their references, in three steps:
- Extract imports and exports: Identify the imports and exports of each file.
- Connect nodes: Establish connections between potential definition and reference nodes.
- Refine edges with commit histories: Utilize commit histories to refine the relationships between nodes.
Unlike stack-graphs, we have omitted the highly complex scope analysis and instead opted to refine our edges using commit histories. This approach significantly reduces the complexity of rule writing, as the rules only need to specify which types of symbols should be exported or imported for each file.
While there is undoubtedly a trade-off in precision, the benefits are clear:
- Minimal impact on accuracy: In practical scenarios, the loss of precision is not as significant as one might expect.
- Commit history relevance: The use of commit history to reflect the influence between code segments aligns well with our objectives.
- Language support: We can easily support the vast majority of programming languages, meeting the analysis needs of various types of repositories.
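The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the gossiphs implementation: the `FileSymbols` type and the `connect` and `refine` functions are hypothetical names, the imports and exports are assumed to have been extracted already (for example by tree-sitter rules), and "refining" is assumed to mean keeping only edges whose two files changed together in at least one commit, weighted by the number of such commits.

```rust
use std::collections::{HashMap, HashSet};

/// Symbols one file exports (definitions) and imports (references).
/// Hypothetical type for illustration; not part of the gossiphs API.
pub struct FileSymbols {
    pub path: String,
    pub exports: Vec<String>,
    pub imports: Vec<String>,
}

/// Steps 1 and 2: link every file that imports a name to every other file
/// that exports the same name. Returns (referencing file, defining file) pairs.
pub fn connect(files: &[FileSymbols]) -> HashSet<(String, String)> {
    // Index: symbol name -> paths of the files that define it.
    let mut defs: HashMap<&str, Vec<&str>> = HashMap::new();
    for f in files {
        for name in &f.exports {
            defs.entry(name.as_str()).or_default().push(f.path.as_str());
        }
    }
    let mut edges = HashSet::new();
    for f in files {
        for name in &f.imports {
            for def_file in defs.get(name.as_str()).into_iter().flatten() {
                // Skip self-references within one file.
                if *def_file != f.path.as_str() {
                    edges.insert((f.path.clone(), (*def_file).to_string()));
                }
            }
        }
    }
    edges
}

/// Step 3 (assumed scheme): weight each candidate edge by the number of
/// commits that touched both files, and drop edges with no shared commit.
/// `commits` lists the file paths touched by each commit.
pub fn refine(
    edges: HashSet<(String, String)>,
    commits: &[Vec<String>],
) -> HashMap<(String, String), usize> {
    edges
        .into_iter()
        .filter_map(|(from, to)| {
            let co_changes = commits
                .iter()
                .filter(|c| c.contains(&from) && c.contains(&to))
                .count();
            if co_changes > 0 {
                Some(((from, to), co_changes))
            } else {
                None
            }
        })
        .collect()
}
```

Because no scope analysis is done, name matching alone over-connects files that happen to share a symbol name; in this sketch the commit history is what prunes those accidental matches.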
Usage
The project is still in the experimental stage.
As a command line tool
You can find pre-compiled files for your platform on our Release Page. After extraction, run gossiphs --help to see the available commands and options.
For example, you can use this command to generate an Obsidian vault:
and get a code relation graph:
As a Rust library
Please refer to examples for usage.
As a local server
Starting a local server, similar to an LSP server, for other clients to use may be a reasonable approach, which is what we are currently doing.
The API description can be found here.
Precision
We demonstrate accuracy by comparing our results with those of LSP/LSIF. Admittedly, static inference can hardly recover every reference relationship that LSP finds, but in strict mode the precision of the relationships we do report is still high. In normal mode, you can decide whether to accept a relationship based on the weight returned.
| Repo | Precision (Strict Mode) | Graph Generation Time |
|---|---|---|
| https://github.com/williamfzc/srctx | 80/80 = 100 % | 83.139791ms |
| https://github.com/gin-gonic/gin | 160/167 = 95.80838 % | 310.6805ms |
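As a rough illustration of the two ideas above, the sketch below filters relationships by weight and computes a precision percentage like the ones in the table. The `Relation` type, `accept`, and `precision_percent` are hypothetical names, not the gossiphs API, and the precision formula assumes the numerator counts reported relationships that the LSP/LSIF reference also contains, while the denominator counts all reported relationships.

```rust
/// A relationship returned for a file, together with its weight.
/// Hypothetical type for illustration; not part of the gossiphs API.
#[derive(Debug, Clone, PartialEq)]
pub struct Relation {
    pub target: String,
    pub weight: usize,
}

/// Keep only relationships whose weight reaches a caller-chosen threshold.
pub fn accept(relations: &[Relation], min_weight: usize) -> Vec<Relation> {
    relations
        .iter()
        .filter(|r| r.weight >= min_weight)
        .cloned()
        .collect()
}

/// Precision in percent: confirmed relationships over all reported ones.
/// Returns 0.0 when nothing was reported, to avoid dividing by zero.
pub fn precision_percent(confirmed: usize, reported: usize) -> f64 {
    if reported == 0 {
        return 0.0;
    }
    confirmed as f64 / reported as f64 * 100.0
}
```

A suitable threshold depends on the repository; a higher `min_weight` trades recall for precision.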
Contribution
The project is still in a very early, experimental stage. If you are interested, please share your thoughts by opening an issue. In the short term, we hope to add support for more languages, which should not be too complicated.