# MD040 - Code blocks should have a language specified
Aliases: `fenced-code-language`
## What this rule does
Ensures fenced code blocks (```) declare the language they contain. Optionally, it also enforces consistent language labels and restricts which languages are allowed.
## Why this matters
- **Syntax highlighting**: Editors and renderers can color-code the syntax correctly
- **Clarity**: Readers immediately know what language they're looking at
- **Consistency**: Using the same label (e.g., always `bash` instead of mixing `sh`/`bash`/`zsh`) keeps documentation uniform
- **Tooling**: Some tools, such as syntax highlighters and documentation test runners, use the language hint to decide how to process or validate a block
## Examples
### ✅ Correct
````markdown
```python
def hello():
    print("Hello, world!")
```
```javascript
console.log("Hello, world!");
```
```bash
echo "Hello, world!"
```
````
### ❌ Incorrect
````markdown
```
def hello():
    print("Hello, world!")
```
```
console.log("Hello, world!");
```
````
### 🔧 Fixed
````markdown
```text
def hello():
    print("Hello, world!")
```
```text
console.log("Hello, world!");
```
````
> **Note**: The fix adds `text` as a default language hint when none is specified.
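
The fix behavior can be illustrated with a short Python sketch. This is not the rule's actual implementation: the function name `add_default_language` and the regular expression are assumptions made for illustration, and the sketch only handles the common case of fences indented by at most three spaces.

```python
import re

# Matches a fence line: up to 3 spaces, then 3+ backticks or tildes, then the info string.
FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")

def add_default_language(markdown, default="text"):
    """Add `default` to opening fences that have no language (hypothetical helper)."""
    out = []
    open_fence = None  # (fence character, fence length) while inside a block
    for line in markdown.splitlines():
        m = FENCE.match(line)
        if m:
            indent, fence, info = m.groups()
            if open_fence is None:
                # Opening fence: remember it, and label it if the info string is empty.
                open_fence = (fence[0], len(fence))
                if not info.strip():
                    line = f"{indent}{fence}{default}"
            elif (fence[0] == open_fence[0]
                  and len(fence) >= open_fence[1]
                  and not info.strip()):
                # Closing fence: same character, at least as long, no info string.
                open_fence = None
        out.append(line)
    return "\n".join(out)
```

Tracking the opening fence's length means shorter fences nested inside a longer one (as in the examples on this page) are treated as content and left untouched.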
## Configuration
```toml
[MD040]
# Language label normalization mode
# - "disabled" (default): Only check for missing language
# - "consistent": Normalize to most prevalent alias per language
style = "disabled"
# Override preferred label for specific languages
# Keys are GitHub Linguist canonical names, values are your preferred alias
preferred-aliases = { Shell = "bash", JavaScript = "js" }
# Restrict which languages are allowed (empty = allow all)
# Uses GitHub Linguist canonical language names
allowed-languages = ["Python", "Shell", "JavaScript", "TypeScript", "JSON", "YAML"]
# Block specific languages (ignored if allowed-languages is non-empty)
disallowed-languages = ["Java", "C++"]
# Action for unknown language labels not in GitHub Linguist
# - "ignore" (default): Silently ignore unknown languages
# - "warn": Emit a warning for unknown languages
# - "error": Treat unknown languages as errors
unknown-language-action = "ignore"
```
### Consistent Mode
When `style = "consistent"`, the rule ensures all code blocks that refer to the same language use the same label. For example, if your document has:
````markdown
```bash
echo "one"
```
```sh
echo "two"
```
```bash
echo "three"
```
````
The rule will flag `sh` as inconsistent because `bash` is more prevalent (2 occurrences vs 1).
**With `--fix`**, inconsistent labels are automatically normalized to the most prevalent one:
````markdown
```bash
echo "one"
```
```bash
echo "two"
```
```bash
echo "three"
```
````
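
The prevalence logic can be sketched in Python as follows. This is a simplified illustration, not the rule's implementation: the `ALIASES` table is a small hypothetical subset of the Linguist data, the function names are invented, and the tie-breaking behavior (first label seen wins) is an assumption.

```python
from collections import Counter

# Hypothetical subset of label -> canonical language mappings.
ALIASES = {
    "sh": "Shell", "bash": "Shell", "zsh": "Shell",
    "js": "JavaScript", "javascript": "JavaScript",
    "python": "Python", "python3": "Python",
}

def prevalent_labels(labels):
    """Map each canonical language to the label used most often for it."""
    counts = {}
    for label in labels:
        canonical = ALIASES.get(label.lower())
        if canonical is None:
            continue  # labels outside the table are left alone in this sketch
        counts.setdefault(canonical, Counter())[label] += 1
    return {lang: c.most_common(1)[0][0] for lang, c in counts.items()}

def normalize(labels):
    """Rewrite every label to the most prevalent label for its language."""
    preferred = prevalent_labels(labels)
    return [preferred.get(ALIASES.get(label.lower()), label) for label in labels]
```

For the document above, `normalize(["bash", "sh", "bash"])` returns `["bash", "bash", "bash"]`.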
### Preferred Aliases
Use `preferred-aliases` to override which label is used regardless of prevalence:
```toml
[MD040]
style = "consistent"
preferred-aliases = { Shell = "sh" } # Always use "sh" instead of "bash"
```
### Language Restrictions
Restrict which languages can appear in your documentation:
```toml
[MD040]
# Only allow these languages
allowed-languages = ["Python", "Shell", "JSON"]
```
Or block specific languages:
```toml
[MD040]
# Block these languages (only works if allowed-languages is empty)
disallowed-languages = ["Java", "C++"]
```
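
The precedence between the two lists can be summarized in a few lines of Python. This is a sketch of the documented behavior only: the function name `is_permitted` is hypothetical, and exact, case-sensitive matching on canonical names is an assumption.

```python
def is_permitted(canonical, allowed=(), disallowed=()):
    """Return True if a canonical language name passes the restriction lists."""
    if allowed:
        # A non-empty allow list takes over entirely; the block list is ignored.
        return canonical in allowed
    return canonical not in disallowed
```

With both lists set, `is_permitted("Java", allowed=["Java"], disallowed=["Java"])` is `True`, because `disallowed-languages` is only consulted when `allowed-languages` is empty.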
### Unknown Languages
By default, language labels not recognized by GitHub Linguist are silently ignored. Use `unknown-language-action` to change this behavior:
```toml
[MD040]
# Warn about unknown languages
unknown-language-action = "warn"
# Or treat unknown languages as errors (use instead of the line above; a key may appear only once)
# unknown-language-action = "error"
```
This is useful for enforcing that all language labels are valid and will receive proper syntax highlighting on GitHub.
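
A rough Python sketch of how the three settings differ is shown below. It is illustrative only: `KNOWN_LABELS` is a hypothetical stand-in for the Linguist lookup, and the function name, return shape, and message text are assumptions.

```python
# Hypothetical subset of recognized labels; the real rule consults GitHub Linguist.
KNOWN_LABELS = {"python", "bash", "sh", "js", "json", "text"}

def check_label(label, action="ignore"):
    """Return None if nothing is reported, else a (severity, message) tuple."""
    if action not in ("ignore", "warn", "error"):
        raise ValueError(f"unsupported unknown-language-action: {action!r}")
    if action == "ignore" or label.lower() in KNOWN_LABELS:
        return None
    severity = "warning" if action == "warn" else "error"
    return (severity, f"unknown code block language '{label}'")
```

A misspelled label such as `pythn` passes silently under `"ignore"` and is reported under `"warn"` or `"error"`.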
## Linguist Integration
This rule uses [GitHub Linguist](https://github.com/github-linguist/linguist) as the source of truth for language names and aliases. This ensures compatibility with GitHub's syntax highlighting.
Common language mappings:
- `sh`, `bash`, `zsh`, `shell-script` → Shell
- `js`, `node` → JavaScript
- `ts` → TypeScript
- `python`, `python3` → Python
## Automatic fixes
- Missing language: Adds `text` as the default
- Inconsistent labels (when `style = "consistent"`): Normalizes to the preferred/prevalent label
## Learn more
- [CommonMark fenced code blocks](https://spec.commonmark.org/0.31.2/#fenced-code-blocks) - Technical specification
- [GitHub Flavored Markdown](https://github.github.com/gfm/#info-string) - Language hints in code blocks
- [GitHub Linguist](https://github.com/github-linguist/linguist) - Language detection and aliases
## Related rules
- [MD046](md046.md) - Code block style should be consistent
- [MD048](md048.md) - Code fence style should be consistent
- [MD031](md031.md) - Code blocks should be surrounded by blank lines