# multiscreen-rs
A Rust implementation of the Multiscreen neural language model (training and inference), powered by Burn. Runs on CPU by default (Burn Flex), with optional CUDA GPU support behind a feature flag. There is no built-in tokenizer: you encode/decode text with your own tokenizer and pass `Vec<u32>` token IDs directly.
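Any encoder that yields `Vec<u32>` will do. As a trivial stand-in for smoke-testing a pipeline (a real setup would use SentencePiece or similar), a byte-level scheme is enough:

```rust
// Trivial byte-level "tokenizer": each UTF-8 byte becomes one token ID.
// Good enough for smoke tests; real runs should use a proper tokenizer.
fn encode(text: &str) -> Vec<u32> {
    text.bytes().map(u32::from).collect()
}

fn decode(ids: &[u32]) -> String {
    let bytes: Vec<u8> = ids.iter().map(|&id| id as u8).collect();
    String::from_utf8_lossy(&bytes).into_owned()
}

fn main() {
    let ids = encode("hi!"); // "hi!" -> [104, 105, 33]
    assert_eq!(decode(&ids), "hi!"); // round-trips
    println!("{ids:?}");
}
```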
## Installation

Add the dependency to `Cargo.toml` (the crate name below assumes it matches the repository name):

```toml
[dependencies]
multiscreen-rs = "0.1"
```
### CUDA GPU Support

Enable the `cuda` feature:

```toml
[dependencies]
multiscreen-rs = { version = "0.1", features = ["cuda"] }
```
## Quick Start — Training

A minimal sketch, assuming hypothetical API names (check the crate docs for the exact types and functions):

```rust
use multiscreen_rs::*; // hypothetical import path

// The crate has no built-in tokenizer: you supply `Vec<u32>` token IDs.
let device = auto_device()?;
let tokens: Vec<u32> = my_tokenizer.encode(&training_text); // your tokenizer here
let config = TrainConfig::default(); // illustrative: budget, steps, lr, ...
let model = train(&config, &tokens, &device)?;
model.save("runs/my-model/checkpoints")?;
```
## Quick Start — Inference

Again a sketch with illustrative names:

```rust
use multiscreen_rs::*; // hypothetical import path

let device = cpu()?;
let model = Model::load("runs/my-model/checkpoints", &device)?;
let prompt: Vec<u32> = my_tokenizer.encode("Hello!"); // your tokenizer here
let output: Vec<u32> = model.generate(&prompt, 64)?;  // up to 64 new tokens
println!("{}", my_tokenizer.decode(&output));
```
## Streaming (token by token, like ChatGPT)

Sketch (illustrative names again):

```rust
use multiscreen_rs::*; // hypothetical import path

for token in model.generate_stream(&prompt, 64)? {
    let id: u32 = token?;
    print!("{}", my_tokenizer.decode(&[id])); // decode and print incrementally
}
```
## Examples
The crate ships with two self-contained examples that use SentencePiece tokenization.
### Train a Model

Hypothetical invocations (the example name `train` is an assumption; see the `examples/` directory for the actual binary):

```shell
# Train 10M params, 10k steps
cargo run --release --example train -- --budget 10m --steps 10000

# Train 1M params, 500 steps (for quick testing)
cargo run --release --example train -- --budget 1m --steps 500

# Train with your own data
cargo run --release --example train -- --train-dir path/to/your/data
```
### Chat with a Trained Model

Again hypothetical (the example name `chat` and the `--prompt` flag are assumptions):

```shell
# Interactive mode (streaming output)
cargo run --release --example chat -- --run-dir runs/my-model

# One-shot prompt
cargo run --release --example chat -- --run-dir runs/my-model --prompt "Hello!"
```
### Generate a Loss Plot

Requires Python with matplotlib and numpy; the bundled script plots the run's `loss.csv` (see Loss Plot below).
## Training CLI Options

| Option | Default | Description |
|---|---|---|
| `--train-dir` | `examples/data` | Directory with `tokenizer.model` + `.txt`/`.jsonl` files |
| `--run-dir` | `runs/my-model` | Output directory (checkpoints, reports, loss CSV) |
| `--budget` | `10m` | Parameter budget: `1m`, `5m`, `10m`, `50m`, `100m` |
| `--steps` | `10000` | Total optimizer steps |
| `--batch-size` | `4` | Batch size |
| `--seq-len` | `128` | Sequence length |
| `--lr` | `0.0002` | Learning rate |
| `--val-split` | `0.1` | Fraction of data for validation |
| `--log-interval` | `100` | Print loss every N steps |
| `--latency-tokens` | `20` | Tokens to generate for latency benchmark |
## Training Reports

Every training run produces a complete report in `--run-dir`:
```
runs/my-model/
├── checkpoints/
│   ├── config.json   # Model architecture config
│   ├── latest.mpk    # Trained weights
│   └── latest.json   # Run metadata
├── tokenizer.model   # Copy of the tokenizer
├── loss.csv          # Per-step loss values (step,loss)
├── report.json       # Machine-readable full report
└── report.md         # Human-readable training report
```
The report includes:
- Configuration — budget, parameter count, seq len, batch size, learning rate
- Training — duration, throughput (steps/s), final loss, best loss
- Validation — loss, perplexity, next-token accuracy
- Test — loss, perplexity, next-token accuracy
- Inference — average latency per token, total generation time
All files under runs/ are excluded from git via .gitignore.
### Loss Plot

The loss CSV can be plotted with the bundled Python script; it generates `loss_plot.png` in the same directory.
## Evaluation Metrics
The model is automatically evaluated on held-out data after training:
| Metric | Description |
|---|---|
| Loss | Average cross-entropy loss |
| Perplexity | exp(loss) — lower is better |
| Accuracy | Fraction of tokens where argmax(logits) == target |
Data is split 80/10/10 (train/val/test) by default. Override the validation fraction with `--val-split`.
## Device Selection

Illustrative (the exact helper signatures may differ):

```rust
use multiscreen_rs::*; // hypothetical import path

let device = cpu()?;         // CPU (always available)
let device = cuda()?;        // CUDA GPU 0 (requires the "cuda" feature)
let device = auto_device()?; // picks best available
```
## Parameter Budgets

Choose a model size with `ParameterBudget`; presets range from 1M to 100M parameters:

```rust
ParameterBudget::Params1M   // ~1.2M
ParameterBudget::Params5M   // ~5.5M
ParameterBudget::Params10M  // ~10.5M (default)
ParameterBudget::Params50M  // ~52.1M
ParameterBudget::Params100M // ~104.6M
```
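The CLI's `--budget` strings map onto these presets. A sketch of that mapping (the enum is redeclared locally here so the example is self-contained; the real type lives in the crate):

```rust
// Illustrative only: maps CLI budget strings ("1m", "10m", ...) onto
// the preset variants listed above.
#[derive(Debug, PartialEq)]
enum ParameterBudget {
    Params1M,   // ~1.2M parameters
    Params5M,   // ~5.5M
    Params10M,  // ~10.5M (default)
    Params50M,  // ~52.1M
    Params100M, // ~104.6M
}

fn parse_budget(s: &str) -> Option<ParameterBudget> {
    match s {
        "1m" => Some(ParameterBudget::Params1M),
        "5m" => Some(ParameterBudget::Params5M),
        "10m" => Some(ParameterBudget::Params10M),
        "50m" => Some(ParameterBudget::Params50M),
        "100m" => Some(ParameterBudget::Params100M),
        _ => None, // unknown budget strings are rejected
    }
}

fn main() {
    assert_eq!(parse_budget("10m"), Some(ParameterBudget::Params10M));
    assert_eq!(parse_budget("2m"), None);
    println!("ok");
}
```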
## Bundled Data

`examples/data/` contains everything needed to run the examples out of the box:

- `tokenizer.model` — SentencePiece model (~280KB, 2778 vocab)
- `sample_chat.txt` — 15 English chat lines for quick testing
## Contributing

Contributions are welcome! Keep patches focused, maintain the default CPU/Flex path, and run the project's checks before submitting.
## License
MIT · Copyright (c) 2026 multiscreen-rs contributors.