tree-sitter-language-pack 1.9.0

Core library for tree-sitter language pack - provides compiled parsers for 306 languages

Rust

<!-- Project Info -->
<a href="https://github.com/kreuzberg-dev/tree-sitter-language-pack/blob/main/LICENSE">
	<img src="https://img.shields.io/badge/License-MIT-007ec6" alt="License" />
</a>
<a href="https://docs.tree-sitter-language-pack.kreuzberg.dev">
	<img src="https://img.shields.io/badge/Docs-tree--sitter--language--pack-007ec6" alt="Documentation" />
</a>

Pre-compiled tree-sitter grammars for 306 programming languages, with on-demand download and caching. Polyglot core that powers every binding in the kreuzberg.dev ecosystem.

What This Package Provides

  • Parser access — load a tree-sitter language parser by name without wiring individual grammar crates or packages.
  • Code intelligence primitives — parse trees, functions, classes, imports, exports, symbols, docstrings, diagnostics, and syntax-aware chunks.
  • Shared cache model — parsers are fetched and cached once, then reused by every call in the process.
  • Same catalog as every binding — Rust, Python, Node.js, Go, Java, PHP, Ruby, .NET, Elixir, WASM, Dart, Kotlin Android, Swift, Zig, and C FFI use the same grammar set.
  • Rust crate — canonical API used by the other bindings and by Kreuzberg code intelligence.
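The shared cache model described above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: `Grammar`, `CACHE`, and `load_cached` are hypothetical names, and the `fetch` closure stands in for whatever download or load step the crate actually performs. The sketch only shows the "fetch once, reuse within the process" idea.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

// Hypothetical stand-in for a compiled grammar; not a type from the crate.
#[derive(Debug)]
struct Grammar {
    name: String,
}

// Process-wide cache keyed by language name (hypothetical).
static CACHE: OnceLock<Mutex<HashMap<String, Arc<Grammar>>>> = OnceLock::new();

// Returns the cached grammar for `name`, calling `fetch` only on a cache miss.
// `fetch` stands in for the expensive download or load step.
fn load_cached(name: &str, fetch: impl FnOnce(&str) -> Grammar) -> Arc<Grammar> {
    let mut cache = CACHE
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .unwrap();
    if let Some(hit) = cache.get(name) {
        return Arc::clone(hit);
    }
    let grammar = Arc::new(fetch(name));
    cache.insert(name.to_string(), Arc::clone(&grammar));
    grammar
}

fn main() {
    let mut fetches = 0;
    let first = load_cached("python", |n| {
        fetches += 1;
        Grammar { name: n.to_string() }
    });
    let second = load_cached("python", |n| {
        fetches += 1;
        Grammar { name: n.to_string() }
    });
    // The second call is served from the cache, so `fetch` ran only once.
    println!("{} fetched {} time(s)", first.name, fetches);
    println!("same instance: {}", Arc::ptr_eq(&first, &second));
}
```

Holding the lock across `fetch` keeps the sketch short and guarantees a single fetch per language, at the cost of blocking other callers during a slow load; a real implementation may make a different trade-off.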

Installation

cargo add tree-sitter-language-pack

Quick Start

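use tree_sitter_language_pack::get_parser;

fn main() {
    // Load the Python parser by name; it is fetched and cached on first use.
    let mut parser = get_parser("python").expect("language available");
    // Parse a snippet and print the syntax tree as an S-expression.
    let tree = parser.parse("def hello(): pass", None).unwrap();
    println!("{}", tree.root_node().to_sexp());
}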

Features

  • 300+ languages — pre-compiled tree-sitter grammars covering every major programming language and many minor ones.
  • On-demand download + cache — parsers fetched at first use; subsequent runs hit the local cache.
  • Code intelligence — extract functions, classes, imports, exports, symbols, docstrings, and diagnostics with one API.
  • Syntax-aware chunking — semantic chunks for RAG/LLM pipelines.
  • Polyglot bindings — Rust core with native bindings for Python, TypeScript, Go, Java, C#, Ruby, PHP, Elixir, and WebAssembly via alef.
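To make the idea of syntax-aware chunking concrete, here is a minimal sketch that uses only the standard library. It is an assumption-laden illustration, not the crate's chunking algorithm or API: `chunk_spans` is a hypothetical helper, and the byte ranges are hand-written stand-ins for the top-level nodes a parse tree would provide. The technique shown is greedy packing of whole nodes into size-limited chunks, so that no chunk cuts through a function or class.

```rust
use std::ops::Range;

/// Packs consecutive node spans into chunks of at most `max_bytes`,
/// never splitting a span. A span larger than `max_bytes` becomes its own chunk.
/// Hypothetical helper for illustration only.
fn chunk_spans(spans: &[Range<usize>], max_bytes: usize) -> Vec<Range<usize>> {
    let mut chunks = Vec::new();
    let mut current: Option<Range<usize>> = None;
    for span in spans {
        current = match current.take() {
            // Extend the open chunk while the merged range still fits.
            Some(open) if span.end - open.start <= max_bytes => Some(open.start..span.end),
            // Otherwise close the open chunk and start a new one at this span.
            Some(open) => {
                chunks.push(open);
                Some(span.clone())
            }
            None => Some(span.clone()),
        };
    }
    // Flush the last open chunk, if any.
    chunks.extend(current);
    chunks
}

fn main() {
    let source = "def a(): pass\n\ndef b(): pass\n\nclass C: pass\n";
    // Hand-written byte ranges standing in for top-level syntax nodes.
    let spans = [0..13, 15..28, 30..43];
    for chunk in chunk_spans(&spans, 30) {
        println!("--- chunk {:?}\n{}", chunk, &source[chunk.clone()]);
    }
}
```

With a 30-byte limit, the two functions share one chunk and the class starts a second, because adding it would exceed the limit.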

Documentation

Full documentation is available at https://docs.tree-sitter-language-pack.kreuzberg.dev

Part of Kreuzberg.dev

  • Kreuzberg — document intelligence: text, tables, metadata from 90+ formats with optional OCR.
  • Kreuzberg Cloud — managed extraction API with SDKs, dashboards, and observability.
  • kreuzcrawl — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
  • html-to-markdown — fast, lossless HTML→Markdown engine.
  • liter-llm — universal LLM API client with native bindings for 14 languages and 143 providers.
  • alef — the polyglot binding generator that produces this README and all per-language bindings.
  • Discord — community, roadmap, announcements.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Join our Discord community for questions and discussion.

License

MIT -- see LICENSE for details.