codegraph 0.1.1

A fast, reliable, and flexible graph database optimized for storing and querying code relationships

codegraph

Mission

codegraph provides a fast, reliable, and flexible graph database optimized for storing and querying code relationships, enabling tool builders to focus on analysis logic rather than infrastructure.

Core Principles

🔌 Parser Agnostic

"Bring your own parser, we'll handle the graph."

codegraph does NOT include built-in language parsers. You integrate your own parsers (tree-sitter, syn, swc, etc.), and we provide the storage and query infrastructure.

⚡ Performance First

"Sub-100ms queries or it didn't happen."

  • Single node lookup: <1ms
  • Neighbor query: <10ms
  • Graph traversal (depth=5): <50ms
  • 100K node graphs are practical

🧪 Test-Driven Development

"If it's not tested, it's broken."

  • 115 tests (39 lib + 70 integration/unit + 6 doctests)
  • Every public API is tested
  • Benchmarks verify the performance targets
  • 85% line coverage (983/1158 lines)

🪄 Zero Magic

"Explicit over implicit, always."

  • No global state
  • No automatic file scanning
  • No convention-over-configuration
  • Explicit error handling (no panics in library code)
  • No unsafe code
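
To make "explicit error handling" concrete, here is a minimal sketch of a lookup that reports a missing node as an error value instead of panicking. `ToyGraph` and `GraphError` are hypothetical names used only for illustration; the crate's real types and error variants may differ.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type (an assumption, not the crate's real enum).
#[derive(Debug, PartialEq)]
enum GraphError {
    NodeNotFound(u64),
}

impl fmt::Display for GraphError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GraphError::NodeNotFound(id) => write!(f, "node {id} not found"),
        }
    }
}

impl std::error::Error for GraphError {}

/// Toy graph owned by the caller: no globals, no hidden initialization.
#[derive(Default)]
struct ToyGraph {
    names: HashMap<u64, String>,
    next_id: u64,
}

impl ToyGraph {
    /// Store a node name and return its freshly assigned id.
    fn add_node(&mut self, name: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.names.insert(id, name.to_string());
        id
    }

    /// A missing node is returned as `Err`, never raised as a panic.
    fn node_name(&self, id: u64) -> Result<&str, GraphError> {
        self.names
            .get(&id)
            .map(String::as_str)
            .ok_or(GraphError::NodeNotFound(id))
    }
}

fn main() {
    let mut graph = ToyGraph::default();
    let id = graph.add_node("main");
    // Deliberately ask for an id that was never created.
    match graph.node_name(id + 1) {
        Ok(name) => println!("found {name}"),
        Err(err) => println!("lookup failed: {err}"),
    }
}
```

Because the failure is part of the return type, the caller decides whether to retry, log, or propagate it with `?`.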

💾 Persistence is Primary

"Graphs outlive processes."

  • RocksDB backend for production
  • Crash-safe with write-ahead logging
  • Atomic batch operations
  • Memory backend for testing only
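
The "atomic batch" guarantee can be pictured with a small in-memory model: either every operation in a batch is applied, or none is. This is only a sketch under that assumption. `ToyStore` and `Op` are hypothetical, and a real RocksDB-backed store would stage writes in a write batch rather than cloning the whole state.

```rust
use std::collections::HashSet;

/// Toy store holding node ids and directed edges.
#[derive(Debug, Clone, Default, PartialEq)]
struct ToyStore {
    nodes: HashSet<u64>,
    edges: HashSet<(u64, u64)>,
}

/// Hypothetical batch operations.
enum Op {
    AddNode(u64),
    AddEdge(u64, u64),
}

impl ToyStore {
    /// Apply all operations or none: work on a staged copy and
    /// replace the live state only if every operation succeeded.
    fn apply_batch(&mut self, ops: &[Op]) -> Result<(), String> {
        let mut staged = self.clone();
        for op in ops {
            match *op {
                Op::AddNode(id) => {
                    staged.nodes.insert(id);
                }
                Op::AddEdge(from, to) => {
                    if !staged.nodes.contains(&from) || !staged.nodes.contains(&to) {
                        return Err(format!("edge {from}->{to} references a missing node"));
                    }
                    staged.edges.insert((from, to));
                }
            }
        }
        *self = staged; // commit point
        Ok(())
    }
}

fn main() {
    let mut store = ToyStore::default();
    store
        .apply_batch(&[Op::AddNode(1), Op::AddNode(2), Op::AddEdge(1, 2)])
        .expect("valid batch");

    // Node 3 is staged, but the edge to the unknown node 42 fails,
    // so node 3 must not appear in the store afterwards.
    let result = store.apply_batch(&[Op::AddNode(3), Op::AddEdge(3, 42)]);
    println!("second batch: {result:?}, node 3 stored: {}", store.nodes.contains(&3));
}
```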

Quick Start

Add to your Cargo.toml:

[dependencies]
codegraph = "0.1"

Basic Usage

use codegraph::{CodeGraph, Node, NodeType, Edge, EdgeType};
use std::path::Path;

// Create a persistent graph
let mut graph = CodeGraph::open(Path::new("./my_project.graph"))?;

// Add a file node (explicit, no magic)
let file_id = graph.add_file(Path::new("src/main.rs"), "rust")?;

// Add a function node (properties are JSON values, so serde_json
// must also be listed in your [dependencies])
let mut func_node = Node::new(NodeType::Function);
func_node.set_property("name", serde_json::json!("main"));
func_node.set_property("line", serde_json::json!(10));
let func_id = graph.add_node(func_node)?;

// Create a relationship (file contains function)
let edge = Edge::new(file_id, func_id, EdgeType::Contains);
graph.add_edge(edge)?;

// Query the graph
let neighbors = graph.get_neighbors(&file_id)?;
println!("File contains {} entities", neighbors.len());

Parser Integration Example

// Example with tree-sitter (you provide the parser)
use codegraph::CodeGraph;
use std::path::Path;
use tree_sitter::{Language, Parser};

// Grammar entry point exported by the tree-sitter Rust grammar
extern "C" { fn tree_sitter_rust() -> Language; }

let mut parser = Parser::new();
parser.set_language(unsafe { tree_sitter_rust() }).unwrap();

let source_code = std::fs::read_to_string("src/main.rs")?;
let tree = parser.parse(&source_code, None).unwrap();

// You extract entities from the AST
// codegraph stores the relationships
let mut graph = CodeGraph::open(Path::new("./project.graph"))?;
let file_id = graph.add_file(Path::new("src/main.rs"), "rust")?;

// Walk the tree and add nodes/edges as you see fit
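
The "walk the tree" step is left to you, so here is one way it might look. The sketch below uses a hypothetical, parser-independent `SyntaxNode` in place of a real tree-sitter tree and collects plain vectors instead of calling into codegraph. In a real integration you would read kinds and names from your parser and turn each collected entity and pair into a graph node and a "contains" edge.

```rust
/// Hypothetical stand-in for a parser's syntax node.
struct SyntaxNode {
    kind: &'static str,
    name: &'static str,
    children: Vec<SyntaxNode>,
}

/// One extracted entity, ready to be stored as a graph node.
#[derive(Debug, PartialEq)]
struct Entity {
    id: usize,
    kind: &'static str,
    name: &'static str,
}

/// Depth-first walk: record one entity per syntax node and one
/// (parent, child) "contains" pair per nesting relationship.
fn collect(
    node: &SyntaxNode,
    parent: Option<usize>,
    entities: &mut Vec<Entity>,
    contains: &mut Vec<(usize, usize)>,
) {
    let id = entities.len();
    entities.push(Entity { id, kind: node.kind, name: node.name });
    if let Some(parent_id) = parent {
        contains.push((parent_id, id));
    }
    for child in &node.children {
        collect(child, Some(id), entities, contains);
    }
}

/// A file containing two functions.
fn sample_tree() -> SyntaxNode {
    SyntaxNode {
        kind: "file",
        name: "src/main.rs",
        children: vec![
            SyntaxNode { kind: "function", name: "main", children: vec![] },
            SyntaxNode { kind: "function", name: "helper", children: vec![] },
        ],
    }
}

fn main() {
    let tree = sample_tree();
    let (mut entities, mut contains) = (Vec::new(), Vec::new());
    collect(&tree, None, &mut entities, &mut contains);
    for e in &entities {
        println!("{} {} {}", e.id, e.kind, e.name);
    }
    println!("contains edges: {contains:?}");
}
```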

Architecture

codegraph is organized in clear layers:

User Tools (parsers, analysis)
    ↓
Code Helpers (convenience API)
    ↓
Query Builder (fluent interface)
    ↓
Core Graph (nodes, edges, algorithms)
    ↓
Storage Backend (RocksDB, memory)

Each layer:

  • Has well-defined boundaries
  • Can be tested independently
  • Doesn't leak abstractions
  • Has minimal dependencies on upper layers
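
One common way to get this kind of layering in Rust is to put the storage backend behind a trait and make the core graph generic over it. The sketch below illustrates the idea only. The `Storage` trait, `MemoryStorage`, `Graph`, and `add_function` are hypothetical names, not the crate's actual internals.

```rust
use std::collections::HashMap;

/// Storage layer: a minimal key-value contract (hypothetical).
trait Storage {
    fn put(&mut self, key: String, value: String);
    fn get(&self, key: &str) -> Option<&String>;
}

/// In-memory backend, handy for tests.
#[derive(Default)]
struct MemoryStorage {
    map: HashMap<String, String>,
}

impl Storage for MemoryStorage {
    fn put(&mut self, key: String, value: String) {
        self.map.insert(key, value);
    }
    fn get(&self, key: &str) -> Option<&String> {
        self.map.get(key)
    }
}

/// Core layer: knows about nodes, not about how bytes are stored.
struct Graph<S: Storage> {
    storage: S,
    next_id: u64,
}

impl<S: Storage> Graph<S> {
    fn new(storage: S) -> Self {
        Graph { storage, next_id: 0 }
    }

    fn add_node(&mut self, name: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.storage.put(format!("node:{id}"), name.to_string());
        id
    }

    fn node_name(&self, id: u64) -> Option<&str> {
        self.storage.get(&format!("node:{id}")).map(String::as_str)
    }
}

/// Helper layer: code-specific convenience built only on the
/// core layer's public methods.
fn add_function<S: Storage>(graph: &mut Graph<S>, name: &str) -> u64 {
    graph.add_node(&format!("fn {name}"))
}

fn main() {
    let mut graph = Graph::new(MemoryStorage::default());
    let id = add_function(&mut graph, "main");
    println!("{id} -> {:?}", graph.node_name(id));
}
```

Swapping `MemoryStorage` for a persistent backend would not require touching `Graph` or `add_function`, which is what lets each layer be tested on its own.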

Features

  • Persistent Storage: Production-ready RocksDB backend
  • Type-Safe API: Rust's type system prevents common errors
  • Schema-less Properties: Flexible JSON properties on nodes and edges
  • Efficient Queries: O(1) neighbor lookups with adjacency indexing
  • Explicit Operations: No hidden behavior or magical conventions
  • Comprehensive Tests: 85% test coverage (983/1158 lines)
  • Zero Unsafe Code: Memory-safe by default
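
As an illustration of what adjacency indexing means, the sketch below keeps one hash map per direction so that finding a node's neighbors is a single expected-constant-time map lookup rather than a scan over all edges. `AdjacencyIndex` and this local `Direction` enum are hypothetical and far simpler than whatever the crate uses internally.

```rust
use std::collections::HashMap;

/// Which side of the edge to follow (local toy enum).
#[derive(Clone, Copy)]
enum Direction {
    Outgoing,
    Incoming,
}

/// Two maps, one per direction, both updated on every insert.
#[derive(Default)]
struct AdjacencyIndex {
    outgoing: HashMap<u64, Vec<u64>>,
    incoming: HashMap<u64, Vec<u64>>,
}

impl AdjacencyIndex {
    fn add_edge(&mut self, from: u64, to: u64) {
        self.outgoing.entry(from).or_default().push(to);
        self.incoming.entry(to).or_default().push(from);
    }

    /// One hash lookup; no scan over the full edge list.
    fn neighbors(&self, node: u64, direction: Direction) -> &[u64] {
        let index = match direction {
            Direction::Outgoing => &self.outgoing,
            Direction::Incoming => &self.incoming,
        };
        index.get(&node).map(Vec::as_slice).unwrap_or(&[])
    }
}

fn main() {
    let mut index = AdjacencyIndex::default();
    index.add_edge(1, 2);
    index.add_edge(1, 3);
    index.add_edge(3, 2);
    println!("out(1) = {:?}", index.neighbors(1, Direction::Outgoing));
    println!("in(2)  = {:?}", index.neighbors(2, Direction::Incoming));
}
```

The trade-off is extra memory and slightly more work per insert in exchange for cheap reads in both directions.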

What We Are (and Aren't)

We Are ✅

  • A graph database optimized for code relationships
  • A storage layer for tool builders
  • Language-agnostic
  • Production-ready

We Are Not ❌

  • A parser (no AST extraction)
  • A semantic analyzer (no type inference)
  • An IDE integration (no LSP server)
  • A complete framework (you build the analysis logic)

Performance Targets

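| Operation                  | Target | Actual                                     |
|----------------------------|--------|--------------------------------------------|
| Node lookup                | <1ms   | ✅ ~7ns (over 100,000x faster than target) |
| Neighbor query             | <10ms  | ✅ ~410ns to 40µs                          |
| BFS traversal (depth=5)    | <50ms  | ✅ ~5ms                                    |
| Batch insert (10K nodes)   | <500ms | ✅ ~7ms                                    |
| 100K node + 500K edge load | <5s    | ✅ ~3.3s                                   |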
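
If you want a rough sanity check of numbers like these on your own machine, a timing loop with `std::time::Instant` is enough to see the order of magnitude. The sketch below times lookups in a plain `HashMap` as a stand-in. It does not call codegraph and is not the project's benchmark suite, so treat its output as illustrative only, and build with `--release` before reading anything into it.

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Build a toy node table with keys 0..count (a stand-in for a real graph).
fn build_nodes(count: u64) -> HashMap<u64, String> {
    (0..count).map(|id| (id, format!("node_{id}"))).collect()
}

/// Perform `lookups` key lookups and return (hits, total elapsed time).
fn time_lookups(nodes: &HashMap<u64, String>, lookups: u64) -> (u64, Duration) {
    if nodes.is_empty() {
        return (0, Duration::ZERO);
    }
    let len = nodes.len() as u64;
    let mut hits = 0;
    let start = Instant::now();
    for i in 0..lookups {
        // Multiply by a large odd constant so keys are not visited in order.
        let key = i.wrapping_mul(2_654_435_761) % len;
        // black_box keeps the optimizer from removing the lookup.
        if black_box(nodes.get(&key)).is_some() {
            hits += 1;
        }
    }
    (hits, start.elapsed())
}

fn main() {
    let nodes = build_nodes(100_000);
    let lookups = 1_000_000;
    let (hits, elapsed) = time_lookups(&nodes, lookups);
    let per_lookup_ns = elapsed.as_nanos() / u128::from(lookups);
    println!("{hits} hits, ~{per_lookup_ns} ns per lookup");
}
```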

Development

Build

cargo build --release

Test

cargo test

Documentation

cargo doc --open

Code Quality

# Format code
cargo fmt

# Lint with clippy
cargo clippy -- -D warnings

# Check test coverage
cargo tarpaulin

# Run all CI checks locally (recommended before pushing)
./scripts/ci-checks.sh

Examples

See the examples/ directory for complete examples:

  • basic_usage.rs - Creating and querying a simple graph
  • call_graph.rs - Function call analysis with syn integration
  • dependency_tree.rs - File dependency and circular dependency analysis
  • impact_analysis.rs - Complex query patterns for impact analysis
  • visualize.rs - Exporting graphs to DOT, JSON, CSV, and RDF formats

API Overview

Core Operations

// Node operations
let node_id = graph.add_node(NodeType::Function, properties)?;
let node = graph.get_node(node_id)?;
graph.delete_node(node_id)?;

// Edge operations
let edge_id = graph.add_edge(source, target, EdgeType::Calls, properties)?;
let neighbors = graph.get_neighbors(node_id, Direction::Outgoing)?;

// Batch operations
graph.add_nodes_batch(nodes)?;
graph.add_edges_batch(edges)?;

Helper Functions

use codegraph::helpers;

// Code-specific operations
let file_id = helpers::add_file(&mut graph, "main.rs", "rust")?;
let func_id = helpers::add_function(&mut graph, file_id, "main", 10, 20)?;
helpers::add_call(&mut graph, func1_id, func2_id, Some(15))?;

// Relationship queries
let callers = helpers::get_callers(&graph, func_id)?;
let deps = helpers::get_file_dependencies(&graph, file_id)?;

Query Builder

// Fluent query interface
let results = graph.query()
    .node_type(NodeType::Function)
    .in_file("src/main.rs")
    .property("visibility", "public")
    .name_contains("test")
    .execute()?;
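
For readers curious how a fluent interface like this can be put together, here is a minimal sketch of the pattern: each method consumes the builder, records one filter, and returns the builder, and `execute` applies all recorded filters at once. `FunctionNode` and `Query` are hypothetical types that work over a plain slice. They are not the crate's query builder.

```rust
/// Toy record describing a function in some file.
#[derive(Debug, Clone, PartialEq)]
struct FunctionNode {
    name: String,
    file: String,
    public: bool,
}

/// Each field is an optional filter; `None` means "do not filter on this".
#[derive(Default)]
struct Query<'a> {
    file: Option<&'a str>,
    name_part: Option<&'a str>,
    is_public: Option<bool>,
}

impl<'a> Query<'a> {
    fn in_file(mut self, file: &'a str) -> Self {
        self.file = Some(file);
        self
    }

    fn name_contains(mut self, part: &'a str) -> Self {
        self.name_part = Some(part);
        self
    }

    fn public(mut self, value: bool) -> Self {
        self.is_public = Some(value);
        self
    }

    /// Return references to every node that passes all configured filters.
    fn execute<'n>(&self, nodes: &'n [FunctionNode]) -> Vec<&'n FunctionNode> {
        nodes
            .iter()
            .filter(|n| self.file.map_or(true, |f| n.file == f))
            .filter(|n| self.name_part.map_or(true, |p| n.name.contains(p)))
            .filter(|n| self.is_public.map_or(true, |v| n.public == v))
            .collect()
    }
}

/// Small fixture: two functions in src/main.rs, one in src/lib.rs.
fn sample_nodes() -> Vec<FunctionNode> {
    let node = |name: &str, file: &str, public: bool| FunctionNode {
        name: name.to_string(),
        file: file.to_string(),
        public,
    };
    vec![
        node("test_parse", "src/main.rs", true),
        node("test_helper", "src/main.rs", false),
        node("parse", "src/lib.rs", true),
    ]
}

fn main() {
    let nodes = sample_nodes();
    let results = Query::default()
        .in_file("src/main.rs")
        .public(true)
        .name_contains("test")
        .execute(&nodes);
    for node in results {
        println!("{} ({})", node.name, node.file);
    }
}
```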

Graph Algorithms

// Transitive analysis
let all_deps = helpers::transitive_dependencies(&graph, file_id, Some(5))?;
let all_dependents = helpers::transitive_dependents(&graph, file_id, None)?;

// Call chains
let paths = helpers::call_chain(&graph, from_func, to_func, Some(10))?;

// Circular dependencies
let cycles = helpers::circular_deps(&graph)?;
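
To show what is going on behind calls like these, the sketch below implements two textbook algorithms on a plain adjacency map: a breadth-first search with an optional depth limit (the idea behind a bounded transitive query) and a depth-first cycle check. The function names mirror the helpers above for readability, but the signatures are assumptions, and `has_cycle` only answers yes or no rather than listing cycles.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy dependency graph: node id -> ids it depends on.
type DepGraph = HashMap<u32, Vec<u32>>;

/// Breadth-first search from `start`, optionally stopping after
/// `max_depth` hops. Returns dependencies in discovery order.
fn transitive_dependencies(graph: &DepGraph, start: u32, max_depth: Option<usize>) -> Vec<u32> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut result = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        if max_depth.map_or(false, |limit| depth >= limit) {
            continue; // do not expand beyond the depth limit
        }
        for &next in graph.get(&node).into_iter().flatten() {
            if seen.insert(next) {
                result.push(next);
                queue.push_back((next, depth + 1));
            }
        }
    }
    result
}

/// Depth-first search with three states per node:
/// absent = unvisited, 1 = on the current path, 2 = fully explored.
fn has_cycle(graph: &DepGraph) -> bool {
    fn visit(graph: &DepGraph, node: u32, state: &mut HashMap<u32, u8>) -> bool {
        match state.get(&node).copied() {
            Some(1) => return true,  // reached a node on the current path
            Some(2) => return false, // already explored, no cycle through it
            _ => {}
        }
        state.insert(node, 1);
        for &next in graph.get(&node).into_iter().flatten() {
            if visit(graph, next, state) {
                return true;
            }
        }
        state.insert(node, 2);
        false
    }

    let mut state = HashMap::new();
    graph.keys().any(|&node| visit(graph, node, &mut state))
}

fn main() {
    // 1 -> 2 -> 3 -> 4
    let mut graph: DepGraph = HashMap::from([(1, vec![2]), (2, vec![3]), (3, vec![4])]);
    println!("all deps of 1: {:?}", transitive_dependencies(&graph, 1, None));
    println!("within 2 hops: {:?}", transitive_dependencies(&graph, 1, Some(2)));
    println!("cycle? {}", has_cycle(&graph));

    graph.insert(4, vec![1]); // close the loop
    println!("cycle after adding 4 -> 1? {}", has_cycle(&graph));
}
```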

Export Formats

use codegraph::export;

// Graphviz DOT
export::export_dot(&graph, &mut output)?;

// D3.js JSON
export::export_json(&graph, &mut output)?;

// CSV (nodes and edges)
export::export_csv_nodes(&graph, &mut output)?;
export::export_csv_edges(&graph, &mut output)?;

// RDF N-Triples
export::export_triples(&graph, &mut output)?;
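
The exporters above take a writer, so output can go to a file, a buffer, or stdout. As a rough idea of what a DOT exporter has to do, here is a minimal sketch that writes a Graphviz digraph from plain slices. `write_dot` and its input shape are hypothetical, and the crate's actual output may be laid out differently.

```rust
use std::io::{self, Write};

/// Write nodes and edges as a Graphviz digraph to any `Write` target.
/// Backslashes and double quotes in labels are escaped.
fn write_dot<W: Write>(
    nodes: &[(u32, &str)],
    edges: &[(u32, u32)],
    out: &mut W,
) -> io::Result<()> {
    writeln!(out, "digraph codegraph {{")?;
    for (id, label) in nodes {
        let escaped = label.replace('\\', "\\\\").replace('"', "\\\"");
        writeln!(out, "    n{id} [label=\"{escaped}\"];")?;
    }
    for (from, to) in edges {
        writeln!(out, "    n{from} -> n{to};")?;
    }
    writeln!(out, "}}")
}

fn main() -> io::Result<()> {
    let nodes = [(0, "src/main.rs"), (1, "main")];
    let edges = [(0, 1)];
    // Any writer works: here we stream straight to stdout.
    write_dot(&nodes, &edges, &mut io::stdout())
}
```

Piping the output through `dot -Tsvg` renders the graph.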

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Before contributing, please:

  1. Follow TDD methodology
  2. Ensure all tests pass
  3. Run cargo fmt and cargo clippy

License

codegraph is licensed under the Apache License 2.0, which means:

✅ You can:

  • Use in commercial products
  • Modify and distribute
  • Use in proprietary software

✅ You must:

  • Include a copy of the license
  • State your changes (modified files must carry prominent notices that you changed them)
  • Retain the existing copyright, patent, trademark, and attribution notices

❌ You can't:

  • Hold us liable
  • Claim we endorse your product

This is a truly open license. There's no "gotcha" later where we switch to GPL or a commercial model. Apache-2.0 is forever.

Code of Conduct

This project adheres to the Rust Code of Conduct. See CODE_OF_CONDUCT.md.

Versioning

This project follows Semantic Versioning:

  • v0.x: API may change between minor versions (with deprecation warnings)
  • v1.0+: Stability guaranteed, breaking changes only in major versions

Current version: 0.1.1 (Initial release + formatting fixes)

Support

Roadmap

v0.2-0.5 (Near-term)

  • Query language improvements
  • More export formats (GraphML, Cypher)
  • Performance optimizations
  • First-party parser helper crates

v0.6-0.9 (Medium-term)

  • Incremental updates
  • Change tracking
  • Statistics and metrics API
  • CLI tool

v1.0+ (Long-term)

  • Schema validation
  • Full-text search integration
  • Compression options
  • Distributed graphs (maybe)

Acknowledgments

This project draws inspiration from:

  • Rust Language governance model
  • SQLite's reliability principles
  • Redis project philosophy
  • Kubernetes governance structure

Built with ❤️ in Rust