tree-sitter-language-pack 1.8.1

Core library for tree-sitter language pack - provides compiled parsers for 306 languages
Rust

Pre-compiled tree-sitter grammars for 306 programming languages, with on-demand download and caching. Polyglot core that powers every binding in the kreuzberg.dev ecosystem.

Installation

cargo add tree-sitter-language-pack

Quick Start

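use tree_sitter_language_pack::get_parser;

fn main() {
    // Build a parser for the Python grammar.
    let mut parser = get_parser("python").expect("language available");
    // Parse a source string; `None` means there is no previous tree to reuse.
    let tree = parser.parse("def hello(): pass", None).unwrap();
    // Print the syntax tree as an S-expression.
    println!("{}", tree.root_node().to_sexp());
}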

Features

  • 306 languages — pre-compiled tree-sitter grammars covering every major programming language and many minor ones.
  • On-demand download + cache — parsers fetched at first use; subsequent runs hit the local cache.
  • Code intelligence — extract functions, classes, imports, exports, symbols, docstrings, and diagnostics with one API.
  • Syntax-aware chunking — semantic chunks for RAG/LLM pipelines.
  • Polyglot bindings — Rust core with native bindings for Python, TypeScript, Go, Java, C#, Ruby, PHP, Elixir, and WebAssembly via alef.
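
The "fetched at first use, then served from the cache" behaviour listed above can be pictured with a small standard-library sketch. This is not the crate's implementation: `Grammar`, `GrammarCache`, and the `fetch` closure are hypothetical names invented for illustration, and the real library caches on disk, whereas this sketch only keeps an in-memory map.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled grammar; not a type from this crate.
#[derive(Debug, Clone, PartialEq)]
struct Grammar {
    name: String,
}

/// Illustrative cache: fetches each grammar at most once, then reuses it.
struct GrammarCache<F: FnMut(&str) -> Option<Grammar>> {
    loaded: HashMap<String, Grammar>,
    fetch: F,           // stand-in for the download step (assumption)
    fetch_count: usize, // how many times `fetch` was actually called
}

impl<F: FnMut(&str) -> Option<Grammar>> GrammarCache<F> {
    fn new(fetch: F) -> Self {
        Self { loaded: HashMap::new(), fetch, fetch_count: 0 }
    }

    /// Returns the cached grammar, calling `fetch` only on a cache miss.
    fn get(&mut self, name: &str) -> Option<&Grammar> {
        if !self.loaded.contains_key(name) {
            self.fetch_count += 1;
            let grammar = (self.fetch)(name)?;
            self.loaded.insert(name.to_string(), grammar);
        }
        self.loaded.get(name)
    }
}

fn main() {
    // Pretend only these two grammars can be fetched.
    let mut cache = GrammarCache::new(|name| {
        matches!(name, "python" | "rust").then(|| Grammar { name: name.to_string() })
    });

    assert!(cache.get("python").is_some()); // first use: fetched
    assert!(cache.get("python").is_some()); // second use: served from the cache
    assert_eq!(cache.fetch_count, 1);
    assert!(cache.get("klingon").is_none()); // unknown language: nothing cached
}
```

Failed lookups are not cached in this sketch, so an unavailable language is retried on every call; whether the library behaves the same way is not stated here.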

Documentation

Part of Kreuzberg.dev

  • Kreuzberg — document intelligence: text, tables, metadata from 91+ formats with optional OCR.
  • Kreuzberg Cloud — managed extraction API with SDKs, dashboards, and observability.
  • kreuzcrawl — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
  • html-to-markdown — fast, lossless HTML→Markdown engine.
  • liter-llm — universal LLM API client with native bindings for 14 languages and 143 providers.
  • alef — the polyglot binding generator that produces this README and all per-language bindings.
  • Discord — community, roadmap, announcements.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Join our Discord community for questions and discussion.

License

MIT -- see LICENSE for details.