bio-rs
Rust/WASM tools for biological AI models.
bio-rs turns Python-born bio-AI models into portable, inspectable tools for
CLIs, browsers, servers, and agents.
Python is where many biological AI models are born. bio-rs is where the
model-facing tools around them become reproducible, agent-callable, and easier
to ship outside a research notebook.
bio-rs is open source under dual MIT OR Apache-2.0 licensing.
Why this exists
Bio-AI does not need Rust to replace Python research workflows. It needs a reliable tooling layer around model inputs, tokenizers, runners, browser demos, and agent interfaces.
Rust is useful here because it is good at:
- predictable CLI and server tools
- portable WASM/browser execution
- safe input contracts for biological data
- reproducible single-binary distribution
- long-running services and agent-callable tools
Current proof
The first target is intentionally small:
FASTA -> validated protein sequence -> token ids -> model-ready input
Currently implemented:
- FASTA parsing for one protein sequence
- protein-20 residue validation
- lowercase sequence normalization
- ambiguous residue reporting for X, B, Z, J, U, and O
- invalid residue reporting
- token ids using a stable protein-20 order
- JSON output for CLI/tool use
- biors inspect and biors tokenize
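To make the parsing and validation steps concrete, here is a minimal sketch of single-record FASTA inspection. It is not the biors-core API: the `Report` type, the `inspect_fasta` function, and the error strings are hypothetical names chosen for illustration. It also assumes that "lowercase sequence normalization" means folding lowercase residues to uppercase, and that positions are reported 0-based.

```rust
/// Hypothetical report type for this sketch; not the biors-core API.
#[derive(Debug, PartialEq)]
pub struct Report {
    pub id: String,
    /// Sequence after normalization (assumed here: uppercased).
    pub sequence: String,
    /// (0-based position, residue) for X, B, Z, J, U, and O.
    pub ambiguous: Vec<(usize, char)>,
    /// (0-based position, residue) for anything else outside protein-20.
    pub invalid: Vec<(usize, char)>,
}

const PROTEIN_20: &str = "ACDEFGHIKLMNPQRSTVWY";
const AMBIGUOUS: &str = "XBZJUO";

/// Parse exactly one FASTA record and classify its residues.
pub fn inspect_fasta(input: &str) -> Result<Report, String> {
    let mut lines = input.lines().map(str::trim).filter(|l| !l.is_empty());

    // The first non-empty line must be the header.
    let header = lines.next().ok_or("empty input")?;
    let id = header
        .strip_prefix('>')
        .ok_or("expected a FASTA header starting with '>'")?
        .split_whitespace()
        .next()
        .unwrap_or("")
        .to_string();

    // Join the remaining lines into one sequence, uppercasing as we go.
    let mut sequence = String::new();
    for line in lines {
        if line.starts_with('>') {
            return Err("multi-FASTA input is not supported".to_string());
        }
        sequence.extend(
            line.chars()
                .filter(|c| !c.is_whitespace())
                .map(|c| c.to_ascii_uppercase()),
        );
    }
    if sequence.is_empty() {
        return Err("record has no sequence".to_string());
    }

    // Report ambiguous and invalid residues separately.
    let mut ambiguous = Vec::new();
    let mut invalid = Vec::new();
    for (pos, residue) in sequence.chars().enumerate() {
        if PROTEIN_20.contains(residue) {
            continue;
        }
        if AMBIGUOUS.contains(residue) {
            ambiguous.push((pos, residue));
        } else {
            invalid.push((pos, residue));
        }
    }

    Ok(Report { id, sequence, ambiguous, invalid })
}
```

Keeping ambiguous and invalid residues in separate lists lets a caller decide whether an X or U is acceptable for a given model, while a digit or punctuation mark is always an input error.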
Not implemented yet:
- WASM bindings
- MCP/agent tools
- model inference runners
- external model tokenizer parity
- multi-FASTA batch processing
Quickstart
CLI (Rust)
Install the CLI:
Inspect a protein sequence:
Tokenize for AI model input:
Library (Rust)
Add to your project:
[dependencies]
biors-core = "0.1"
Distribution
The project is distributed across multiple ecosystems:
- crates.io: biors (CLI), biors-core (Library)
- npm: biors (WASM bindings - coming soon)
- PyPI: biors (Python bindings - coming soon)
Checks
This repo keeps the local pre-commit path and CI strict. Before committing, run:
The check suite runs:
- cargo fmt --check
- cargo check --workspace --all-targets --all-features
- cargo test --workspace --all-targets --all-features
- cargo clippy --workspace --all-targets --all-features -- -D warnings
Local git hooks are stored in .githooks/. Enable them with:
Workspace Structure
The project is a monorepo managed under the packages/ directory:
packages/
  rust/
    biors/       Main CLI tool and unified entrypoint
    biors-core/  Core protein parsing and tokenization logic
  npm/           WebAssembly bindings for JavaScript/TypeScript
  python/        High-performance Python bindings via PyO3
examples/
  protein.fasta
Protein-20
The first alphabet is protein-20:
A C D E F G H I K L M N P Q R S T V W Y
Token ids follow that order, starting at 0.
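The mapping described above can be sketched as a lookup into the alphabet string. The names `token_id` and `tokenize` are hypothetical and not taken from biors-core. The sketch assumes the input is already uppercase, and it rejects any residue outside the 20 letters, which may differ from how the real tokenizer treats ambiguous residues.

```rust
/// The protein-20 alphabet; a residue's token id is its index here.
const PROTEIN_20: &[u8; 20] = b"ACDEFGHIKLMNPQRSTVWY";

/// Token id for one residue, or None if it is not in protein-20.
pub fn token_id(residue: char) -> Option<u8> {
    PROTEIN_20
        .iter()
        .position(|&b| b as char == residue)
        .map(|i| i as u8)
}

/// Tokenize a whole sequence. On failure, return the 0-based position
/// and the offending residue so the caller can report it.
pub fn tokenize(sequence: &str) -> Result<Vec<u8>, (usize, char)> {
    sequence
        .chars()
        .enumerate()
        .map(|(pos, residue)| token_id(residue).ok_or((pos, residue)))
        .collect()
}
```

With this ordering, A maps to 0 and Y maps to 19, so `tokenize("ACDY")` yields `[0, 1, 2, 19]`. Deriving ids from a single fixed constant means they change only if the alphabet itself is edited.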
Final goal
The long-term goal is to make useful biological AI models easier to package as portable tools:
- CLI tools for local workflows
- WASM tools for browsers and demos
- server components for production systems
- agent-callable interfaces for automated research workflows
The first milestone is not folding or training. It is the stable input layer that everything after it needs.