tokstream
A token streaming simulator powered by Hugging Face tokenizers. It downloads a tokenizer from HF Hub and generates tokens at a target rate, with live stats for target vs actual throughput.
Highlights
- Rust CLI with high‑precision pacing (sleep + spin)
- Web demo (WASM) and npx executable
- Random English / Chinese generation and text replay
- Configurable filtering strategy
- Target vs actual tokens/sec stats
- Workspace layout with reusable core
Project Layout
.
├── crates
│ ├── tokstream-core # tokenizer engine
│ ├── tokstream-cli # Rust CLI
│ └── tokstream-wasm # wasm-bindgen bindings
├── npm # npx CLI + web demo
├── bin # npm bin entry
├── Cargo.toml # workspace
├── justfile
├── package.json
├── README.md
└── README_ZH.md
Rust CLI
Quick Start
Install from crates.io
# or
Notes:
- The binary name is `tokstream` after installation.
- `cargo binstall` will compile from source unless you provide prebuilt release assets and set `repository` in the crate metadata.
Model & Auth
- `--model <id>` HF Hub model id (default: `gpt2`)
- `--revision <rev>` HF revision (default: `main`)
- `--hf-token <token>` access token for private models
Modes
- `--mode <english|chinese|text>`
- `--text <text>` text mode input
- `--text-file <path>` text mode input from file
- `--loop-text` loop text forever
- `--repeat <n>` repeat text n times
Rate Control
- `--rate <n>` target tokens/sec
- `--rate-min <n>` min rate for random range
- `--rate-max <n>` max rate for random range
- `--rate-sample-interval <n>` sampling interval for rate range (seconds, default: 1)
- `--batch <n>` tokens emitted per batch
- `--max-tokens <n>` stop after n tokens
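As a rough illustration of how the rate flags relate, the sketch below computes the gap between batches for a given target rate and batch size, and maps a uniform sample onto a `--rate-min` / `--rate-max` range. It assumes that all tokens in a batch are emitted together and that the range is sampled uniformly; neither is confirmed by this README. The function names `batch_interval` and `rate_from_unit_sample` are hypothetical and are not part of tokstream's API.

```rust
use std::time::Duration;

/// Gap between batch emissions needed to average `rate` tokens/sec when
/// `batch` tokens are emitted at once.
/// Hypothetical helper for illustration; not tokstream's API.
pub fn batch_interval(rate: f64, batch: u32) -> Option<Duration> {
    // Reject rates and batch sizes that have no meaningful interval.
    if !rate.is_finite() || rate <= 0.0 || batch == 0 {
        return None;
    }
    Some(Duration::from_secs_f64(batch as f64 / rate))
}

/// Map a uniform sample `u` in [0, 1] linearly onto [rate_min, rate_max].
/// Assumption: the random range is sampled uniformly.
pub fn rate_from_unit_sample(rate_min: f64, rate_max: f64, u: f64) -> f64 {
    let u = u.clamp(0.0, 1.0);
    rate_min + (rate_max - rate_min) * u
}
```

Under these assumptions, a target of 8 tokens/sec with a batch of 2 means one batch every 250 ms. Larger batches reduce the number of timer wake-ups but make the output burstier.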
Pacing & Throughput
- `--pace <strict|sleep>` pacing mode (default: `strict`)
- `--spin-threshold-us <n>` busy‑spin threshold for `strict` mode
- `--no-throttle` disable pacing (measure max throughput)
- `--no-output` disable stdout output (closer to tokenizer upper bound)
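The sleep + spin idea behind `strict` pacing can be sketched as follows. This is a minimal illustration using only the Rust standard library, not tokstream's actual implementation: the names `wait_until` and `run_paced` are hypothetical, and the choice to schedule every emission from a single start time is an assumption.

```rust
use std::time::{Duration, Instant};

/// Wait until `deadline`: sleep while it is far away, then busy-spin for the
/// final `spin_threshold`. Illustrative sketch; not tokstream's code.
pub fn wait_until(deadline: Instant, spin_threshold: Duration) {
    loop {
        let now = Instant::now();
        if now >= deadline {
            return;
        }
        let remaining = deadline - now;
        if remaining > spin_threshold {
            // Sleep for all but the spin window; the OS may oversleep,
            // so the loop re-checks the clock afterwards.
            std::thread::sleep(remaining - spin_threshold);
        } else {
            // Close to the deadline: spin instead of sleeping.
            std::hint::spin_loop();
        }
    }
}

/// Call `emit` `count` times, spaced `interval` apart. Deadlines are computed
/// from one start time so that per-step timing errors do not accumulate.
pub fn run_paced(
    count: u32,
    interval: Duration,
    spin_threshold: Duration,
    mut emit: impl FnMut(u32),
) {
    let start = Instant::now();
    for i in 0..count {
        wait_until(start + interval * i, spin_threshold);
        emit(i);
    }
}
```

A larger spin threshold trades CPU time for tighter timing. A threshold of zero makes this sketch rely on sleeping alone, which is presumably closer to what the `sleep` mode does.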
Stats
- `--no-stats` disable stats output (stderr)
- `--stats-interval <n>` stats interval seconds (default: 1)
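For intuition, actual throughput over one stats window is the number of tokens emitted divided by the elapsed time. The sketch below shows that arithmetic. The helper names and the line format are hypothetical and do not reproduce tokstream's real stats output.

```rust
use std::time::Duration;

/// Tokens/sec over a stats window; returns 0.0 for an empty window.
/// Hypothetical helper for illustration.
pub fn actual_rate(tokens: u64, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        tokens as f64 / secs
    } else {
        0.0
    }
}

/// Build one stats line comparing target and actual throughput.
/// The format is illustrative only.
pub fn format_stats(target: f64, tokens: u64, elapsed: Duration) -> String {
    let actual = actual_rate(tokens, elapsed);
    format!("target={:.1} tok/s actual={:.1} tok/s", target, actual)
}
```

A tool following this approach would write such lines to stderr so that they do not mix with the token stream on stdout.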
Random Output Filters
- `--no-skip-special` do not skip special tokens
- `--allow-digits`
- `--allow-punct`
- `--allow-space`
- `--allow-non-ascii`
- `--no-require-letter`
- `--no-require-cjk`
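The README does not spell out the exact filter semantics, so the following is only one plausible reading for English mode: a candidate token's decoded text is rejected if it contains a character class that has not been allowed, and it must contain at least one letter unless that requirement is switched off. The struct `FilterSketch` and its fields are hypothetical names that mirror the flags. The CJK requirement and special-token skipping are left out.

```rust
/// Hypothetical mirror of the filter flags for English mode.
/// All fields default to false, i.e. the strictest setting.
#[derive(Clone, Copy, Debug, Default)]
pub struct FilterSketch {
    pub allow_digits: bool,
    pub allow_punct: bool,
    pub allow_space: bool,
    pub allow_non_ascii: bool,
    pub no_require_letter: bool,
}

impl FilterSketch {
    /// Returns true if the decoded token text passes every enabled check.
    pub fn accepts(&self, text: &str) -> bool {
        if text.is_empty() {
            return false;
        }
        let mut has_letter = false;
        for c in text.chars() {
            if !c.is_ascii() {
                if !self.allow_non_ascii {
                    return false;
                }
                has_letter |= c.is_alphabetic();
            } else if c.is_ascii_alphabetic() {
                has_letter = true;
            } else if c.is_ascii_digit() {
                if !self.allow_digits {
                    return false;
                }
            } else if c.is_ascii_punctuation() {
                if !self.allow_punct {
                    return false;
                }
            } else if c.is_ascii_whitespace() {
                if !self.allow_space {
                    return false;
                }
            } else {
                // ASCII control characters are always rejected in this sketch.
                return false;
            }
        }
        has_letter || self.no_require_letter
    }
}
```

With every flag at its default, this sketch accepts only tokens made purely of ASCII letters. Each `allow_*` flag widens the set of accepted tokens.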
Seed
- `--seed <n>` random seed
Examples
# Random rate range sampled every 2 seconds
# Text mode from file, repeat 5 times
# Infinite loop text
# Throughput upper bound (no throttle, no output)
npx CLI
Quick Start
For local development in this repo:
Supported Flags (npx)
- `--model <id>`
- `--revision <rev>`
- `--hf-token <token>` (or env `HF_TOKEN` / `HUGGINGFACE_HUB_TOKEN`)
- `--mode <english|chinese|text>`
- `--text <text>`
- `--loop` (loop text forever)
- `--repeat <n>`
- `--rate <n>`
- `--rate-min <n>` / `--rate-max <n>`
- `--rate-sample-interval <n>`
- `--seed <n>`
- `--max-tokens <n>`
- `--no-skip-special`
- `--allow-digits` / `--allow-punct` / `--allow-space` / `--allow-non-ascii`
- `--no-require-letter` / `--no-require-cjk`
- `--no-stats` / `--stats-interval <n>`
- `--no-throttle` / `--no-output`
- `--web --port <n>`
Notes:
- `--loop-text`, `--text-file`, `--batch`, `--pace`, and `--spin-threshold-us` are Rust‑CLI only.
Web Demo
# open http://localhost:8787
While running, you can drag the rate slider or enable random rate range. The page shows target and actual throughput. The output pane is fixed‑height and scrolls independently.
Accuracy Notes
- Rust CLI `strict` uses sleep + short spin for high precision.
- Web / npx are best‑effort due to event loop and I/O limits.
- If actual throughput stops increasing as you raise the target rate, you have likely hit the tokenizer's throughput limit.
- For maximum throughput testing, use the Rust CLI with `--no-output --no-throttle`.
Build WASM (optional refresh)
WASM artifacts are committed and included in the npm package.
just Recipes
Tests
License
MIT