# Module: `aprender::text`
Public module of the `aprender-core` crate.
## Source
[`crates/aprender-core/src/text.rs`](https://github.com/paiml/aprender/blob/main/crates/aprender-core/src/text.rs) or directory.
## Example
```rust
use aprender::text::{Tokenizer, ChatTemplateEngine, ChatMessage};
// See `cargo doc -p aprender-core --open` for full API reference.
```
## Module summary
`aprender::text` is the NLP toolkit. It owns the `Tokenizer` trait, BPE and
Llama-style tokenizers, the chat-template engine that turns
`Vec<ChatMessage>` into prompt strings for ChatML / Llama2 / Mistral / Phi /
Alpaca, plus classical NLP utilities — stop words, stemming, sentiment,
similarity, IDF, summarisation, vectorisation, topic modelling, and a small
RAG retrieval submodule. Anything that converts characters to tokens or
templates a conversation runs through here.
## Key types
| `Tokenizer` | Trait. `encode(&str) -> Vec<u32>`, `decode(&[u32]) -> String`. |
| `ChatTemplateEngine` | Multi-format chat-template renderer (minijinja under the hood). |
| `ChatMessage`, `SpecialTokens`, `TemplateFormat` | Building blocks for templating. |
| `ChatMLTemplate`, `Llama2Template`, `MistralTemplate`, `PhiTemplate`, `AlpacaTemplate`, `HuggingFaceTemplate`, `RawTemplate` | Concrete template implementations. |
| `auto_detect_template`, `detect_format_from_name`, `create_template` | Convenience constructors. |
| `text::bpe`, `text::llama_tokenizer`, `text::stem`, `text::similarity`, `text::vectorize`, `text::rag` | Sub-modules for specific tasks. |
## Usage patterns
### Pattern 1: Detect a chat template by model name
```rust
use aprender::text::{detect_format_from_name, TemplateFormat};
let fmt = detect_format_from_name("Qwen/Qwen2.5-Coder-7B-Instruct");
let other = detect_format_from_name("meta-llama/Llama-2-7b-chat-hf");
assert!(matches!(other, Some(TemplateFormat::Llama2)));
```
### Pattern 2: Render a multi-turn conversation
```rust
use aprender::text::{ChatMessage, ChatMLTemplate, ChatTemplateEngine};
let template = ChatMLTemplate::default();
let messages = vec![
ChatMessage::system("You are a helpful coding assistant."),
ChatMessage::user("What is 2 + 2?"),
ChatMessage::assistant("4"),
ChatMessage::user("And 3 + 3?"),
];
let prompt = template.render(&messages, true).expect("render");
println!("--- prompt ---\n{}", prompt);
assert!(prompt.contains("system"));
assert!(prompt.contains("user"));
```
## See also
- [`models`](models.md) — Qwen2 / BERT consume tokens produced here
- [`embed`](embed.md) — vectorisation / embedding pipelines built on top
- [`code`](code.md) — code-aware tokenisation and parsing
- [`stack`](stack.md) — higher-level orchestration that bundles text + model + template
## Full API
Run `cargo doc -p aprender-core --open` for the rendered rustdoc, or browse
[docs.rs/aprender](https://docs.rs/aprender) for the published version.