bio-rs
Rust tools for validating protein FASTA input and tokenizing single-record and
multi-record FASTA files into stable protein-20 token ids.
Features
- FASTA parsing for one or more protein sequences
- protein-20 residue validation
- lowercase sequence normalization
- ambiguous residue reporting for X, B, Z, J, U, and O
- invalid residue reporting
- JSON output from the CLI, including array output for multi-FASTA tokenization
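The parsing and validation features above can be illustrated with a minimal sketch. This is not the biors-core API: `Record`, `ResidueKind`, `parse_fasta`, and `classify` are hypothetical names chosen for illustration. The sketch also assumes that "lowercase sequence normalization" means lowercase residues are uppercased before validation.

```rust
/// One FASTA record: the header text after '>' and the normalized sequence.
/// (Hypothetical type, not taken from biors-core.)
#[derive(Debug, PartialEq)]
pub struct Record {
    pub id: String,
    pub sequence: String,
}

/// How a single residue is classified in this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ResidueKind {
    Standard,
    Ambiguous,
    Invalid,
}

/// Parse one or more FASTA records, uppercasing residues as they are read.
pub fn parse_fasta(input: &str) -> Vec<Record> {
    let mut records: Vec<Record> = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if let Some(header) = line.strip_prefix('>') {
            // A header line starts a new record.
            records.push(Record {
                id: header.trim().to_string(),
                sequence: String::new(),
            });
        } else if let Some(current) = records.last_mut() {
            // Sequence lines are appended to the most recent record.
            current.sequence.push_str(&line.to_ascii_uppercase());
        }
        // Sequence lines that appear before any header are ignored here.
    }
    records
}

/// Classify an already-uppercased residue.
pub fn classify(residue: char) -> ResidueKind {
    match residue {
        'A' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'K' | 'L' | 'M' | 'N' | 'P'
        | 'Q' | 'R' | 'S' | 'T' | 'V' | 'W' | 'Y' => ResidueKind::Standard,
        'X' | 'B' | 'Z' | 'J' | 'U' | 'O' => ResidueKind::Ambiguous,
        _ => ResidueKind::Invalid,
    }
}
```

Keeping the ambiguous set separate from the invalid case lets a caller report the two categories independently, as the feature list describes.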
Quickstart
Inspect a protein sequence:
Tokenize a protein sequence:
Tokenize a multi-FASTA file:
Use the Rust library:
[]
= "0.0.1"
Checks
The check suite runs cargo fmt, cargo check, cargo test, and cargo clippy
with warnings denied.
Workspace
packages/
  rust/
    biors/       CLI
    biors-core/  FASTA parsing and tokenization library
examples/
  multi.fasta
  protein.fasta
Protein-20
A C D E F G H I K L M N P Q R S T V W Y
Token ids follow that order, starting at 0.
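As a rough illustration of that mapping, the sketch below derives each token id from a residue's position in the alphabet. The names `PROTEIN_20`, `token_id`, and `tokenize` are hypothetical and are not taken from biors-core. The error shape (offset and byte of the first unmapped residue) is likewise an assumption made for the example.

```rust
/// The protein-20 alphabet in token-id order (hypothetical constant name).
pub const PROTEIN_20: &[u8; 20] = b"ACDEFGHIKLMNPQRSTVWY";

/// Token id of a residue: its zero-based position in PROTEIN_20.
/// Lowercase input is uppercased first; residues outside the alphabet give None.
pub fn token_id(residue: u8) -> Option<u8> {
    let upper = residue.to_ascii_uppercase();
    PROTEIN_20.iter().position(|&r| r == upper).map(|i| i as u8)
}

/// Tokenize a whole sequence.
/// On failure, returns the offset and byte of the first residue with no token id.
pub fn tokenize(sequence: &str) -> Result<Vec<u8>, (usize, u8)> {
    sequence
        .bytes()
        .enumerate()
        .map(|(offset, byte)| token_id(byte).ok_or((offset, byte)))
        .collect()
}
```

With this layout, `A` maps to 0 and `Y` maps to 19, so ids stay stable as long as the alphabet order does not change.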
License
Dual licensed under MIT OR Apache-2.0.