Linguist
A Rust library for programming language detection, inspired by GitHub Linguist. Detects programming languages by file extension, filename, and content-based heuristics.
Features
- No configuration or setup required: add the crate and call the detection functions
- Detect languages by exact filename match (e.g., Makefile, Dockerfile)
- Detect languages by file extension (e.g., .rs, .py, .js)
- Disambiguate between multiple languages using content heuristics
- Identify vendored/third-party files
Usage
Detect by Extension
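use linguist::detect_language_by_extension;
// Illustrative call: the argument and the `name` field are assumptions.
let languages = detect_language_by_extension("main.rs")?;
assert_eq!(languages[0].name, "Rust");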
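To make the idea of extension-based detection concrete, here is a minimal, self-contained sketch. It is not the crate's implementation: the table `EXTENSION_TABLE` and the function `candidates_for_extension` are hypothetical names, and the handful of entries stands in for the much larger data set derived from languages.yml. It shows why an extension lookup can return several candidates, which is what makes content-based disambiguation necessary.

```rust
use std::path::Path;

/// Hypothetical lookup table (assumption): extension -> candidate languages.
/// A real table would be generated from the language definitions.
const EXTENSION_TABLE: &[(&str, &[&str])] = &[
    ("rs", &["Rust"]),
    ("py", &["Python"]),
    ("js", &["JavaScript"]),
    ("h", &["C", "C++", "Objective-C"]),
];

/// Returns every candidate language for the path's extension.
/// The result is empty when the path has no extension or the extension is unknown.
fn candidates_for_extension(path: &str) -> Vec<&'static str> {
    // Extract the extension and normalise its case before the lookup.
    let ext = match Path::new(path).extension().and_then(|e| e.to_str()) {
        Some(ext) => ext.to_ascii_lowercase(),
        None => return Vec::new(),
    };
    EXTENSION_TABLE
        .iter()
        .find(|(known, _)| *known == ext)
        .map(|(_, langs)| langs.to_vec())
        .unwrap_or_default()
}
```

In this sketch, a path such as src/main.rs yields a single candidate, while a .h file yields three, and a file without an extension (such as Makefile) yields none and would need a filename match instead.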
Detect by Filename
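use linguist::detect_language_by_filename;
// Illustrative call: the argument and the `name` field are assumptions.
let languages = detect_language_by_filename("Makefile")?;
assert_eq!(languages[0].name, "Makefile");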
Disambiguate by Content
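use linguist::disambiguate;
let content = "#include <iostream>\nint main() {}";
// Illustrative call: the filename argument and the `name` field are assumptions.
let result = disambiguate("example.h", content)?;
if let Some(languages) = result {
    println!("Detected: {}", languages[0].name);
}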
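The following sketch illustrates what a content heuristic can look like for an ambiguous header file. It is an assumption-laden simplification, not the crate's logic: `Rule`, `HEADER_RULES`, and `disambiguate_header` are hypothetical names, and plain substring checks are used here in place of the pattern rules that heuristics.yml defines, so that the example needs only the standard library.

```rust
/// Hypothetical rule (assumption): a language and the substrings that count as evidence for it.
struct Rule {
    language: &'static str,
    markers: &'static [&'static str],
}

/// Rules are checked in order; the first rule with a matching marker wins.
const HEADER_RULES: &[Rule] = &[
    Rule {
        language: "Objective-C",
        markers: &["@interface", "@implementation", "#import"],
    },
    Rule {
        language: "C++",
        markers: &["#include <iostream>", "std::", "template <", "namespace "],
    },
];

/// Picks a language for a header file from its content.
/// Returns None when no rule matches, leaving the ambiguity unresolved.
fn disambiguate_header(content: &str) -> Option<&'static str> {
    HEADER_RULES
        .iter()
        .find(|rule| rule.markers.iter().any(|m| content.contains(*m)))
        .map(|rule| rule.language)
}
```

Returning an Option mirrors the shape of the usage example above: when no rule matches, the caller still has the full candidate list and can decide how to proceed.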
Check if Vendored
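use linguist::is_vendored;
// Illustrative paths: the arguments and the Result return type are assumptions.
assert!(is_vendored("node_modules/react/index.js")?);
assert!(!is_vendored("src/main.rs")?);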
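As a rough illustration of vendored-path detection, the sketch below flags a path when any of its components is a well-known third-party directory. This is a simplification under stated assumptions: `VENDORED_DIRS` and `looks_vendored` are hypothetical names, the directory list is a tiny sample, and the patterns in vendor.yml are richer than exact component matches.

```rust
use std::path::{Component, Path};

/// Hypothetical sample of directory names treated as vendored (assumption).
const VENDORED_DIRS: &[&str] = &["node_modules", "vendor", "third_party"];

/// Returns true when any component of the path equals a known vendored directory name.
fn looks_vendored(path: &str) -> bool {
    Path::new(path).components().any(|component| match component {
        // Only ordinary path segments are compared; roots and `..` are ignored.
        Component::Normal(name) => name
            .to_str()
            .map_or(false, |name| VENDORED_DIRS.contains(&name)),
        _ => false,
    })
}
```

Matching whole components rather than substrings avoids false positives on names that merely contain a vendored directory name, such as src/vendored_list.rs.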
Acknowledgments
Special thanks to @vcfxb for graciously donating the crates.io name "linguist" to this project!
This project is inspired by GitHub Linguist, maintained by GitHub and its contributors, and uses its language definitions. The definition files (definitions/languages.yml, definitions/heuristics.yml, definitions/vendor.yml) are derived from GitHub Linguist.