# Development Guide
**Maintained by**: Raghav Maheshwari (@techusic)
**Email**: raghavmkota@gmail.com
**Repository**: https://github.com/techusic/omicsx
Complete development documentation for OMICS-X v1.0.2 (production-ready bioinformatics toolkit with SIMD, GPU acceleration, multi-format HMM parsing, and distributed computing).
## Quick Start
```bash
# Clone and setup
git clone https://github.com/techusic/omicsx.git
cd omicsx
cargo build --release
# Run full test suite (275 tests)
cargo test --lib
# Run examples (all 10)
cargo run --example basic_alignment --release
cargo run --example performance_validation --release
cargo run --example neon_alignment --release
cargo run --example bam_format --release
cargo run --example sam_format_output --release
cargo run --example gpu_acceleration --release
cargo run --example gpu_execution_test --release
cargo run --example multiformat_hmm_parser --release
cargo run --example distributed_alignment --release
cargo run --example st_jude_integration --release
# Run benchmarks
cargo bench --bench alignment_benchmarks -- --verbose
```
## Build Variants
```bash
# Debug build (faster compilation)
cargo build
# Release build (optimized for performance)
cargo build --release
# Check for issues without building
cargo check
# Clean build from scratch
cargo clean && cargo build --release
# Build for ARM64 (NEON backend)
cargo build --release --target aarch64-unknown-linux-gnu
# Build with CUDA support (requires CUDA toolkit)
cargo build --release --features "cuda-12050"
# Build with specific features
cargo build --release --features "simd,batching"
```
## Testing
### Run Test Suite (275/275 tests passing)
```bash
# All unit tests
cargo test --lib
# Specific test module
cargo test --lib alignment::
cargo test --lib protein::
cargo test --lib scoring::
# Specific test by name
cargo test --lib test_smith_waterman
# Show output during tests
cargo test --lib -- --nocapture
# Single-threaded (for debugging)
cargo test --lib -- --test-threads=1
# Ignored tests only
cargo test --lib -- --ignored
# Run benchmarks with thresholds
cargo bench --bench alignment_benchmarks -- --verbose
```
### Test Coverage by Phase
**Phase 1 - Protein (32 tests)**
- AminoAcid creation, IUPAC codes, ambiguous codes
- Protein metadata, serialization, cloning
- Edge cases: empty proteins, invalid codes
**Phase 2 - Scoring (28 tests)**
- BLOSUM62 matrix lookups, all 24 amino acids
- AffinePenalty validation and presets
- Matrix dimension validation
**Phase 3 - Alignment Base (22 tests)**
- Smith-Waterman scoring
- Needleman-Wunsch scoring
- AlignmentResult metrics (identity, gaps, similarity)
**Phase 3 - SIMD Kernels (16 tests)**
- AVX2 kernel scoring vs scalar
- NEON kernel (ARM64)
- CPU feature detection
**Phase 4 - Advanced Features (35 tests)**
- CIGAR string generation (all 9 operation types)
- GPU memory pooling and transfers
- Banded DP (O(k·n) complexity)
- Batch alignment with Rayon
**Phase 5 - HMM/MSA (47 tests)**
- HMMER3 parser and E-value computation
- Viterbi HMM decoder
- PSSM scoring
- Profile-to-profile DP
- MSA alignment quality metrics
## Code Quality
```bash
# Format check
cargo fmt --check
# Apply formatting
cargo fmt
# Lint with Clippy
cargo clippy --release
# Strict linting (fail on warnings)
cargo clippy --release -- -D warnings
# Generate HTML documentation
cargo doc --no-deps --document-private-items --open
# Full quality check (format + lint + doc)
cargo fmt && cargo clippy --release && cargo doc --no-deps
```
## Performance Debugging
### Profile with Criterion Benchmarks
```bash
# Run benchmarking suite
cargo bench --bench alignment_benchmarks -- --verbose
# Save baseline for comparison
cargo bench --bench alignment_benchmarks -- --verbose --save-baseline main
# Compare against saved baseline
cargo bench --bench alignment_benchmarks -- --baseline main
# Run specific benchmark
cargo bench -- smith_waterman
```
### Profile with Linux Perf (Linux only)
```bash
# Install perf
sudo apt-get install linux-tools-generic
# Build with debug symbols
cargo build --release
# Run with perf
perf record -g target/release/omicsx
perf report
# Flamegraph (install flamegraph tool)
cargo flamegraph --bin omicsx
```
### Verify SIMD Instructions
```bash
# View assembly (x86-64)
# View assembly (macOS)
# Use cargo with disassemble feature
cargo install cargo-objdump
## Architecture Overview
### Five-Phase System Design
```
Phase 1: Protein Primitives
├── AminoAcid enum (20 standard + ambiguous codes)
├── Protein struct (sequence + metadata)
└── Serialization support (Serde)
Phase 2: Scoring Infrastructure
├── ScoringMatrix (BLOSUM62 + framework for PAM/GONNET)
├── AffinePenalty (validation + presets)
└── PSSM (position-specific scoring matrices)
Phase 3: SIMD Alignment Kernels
├── Scalar kernel (portable reference)
├── AVX2 kernel (x86-64 8-wide SIMD)
├── NEON kernel (ARM64 4-wide SIMD)
├── Smith-Waterman algorithm
└── Needleman-Wunsch algorithm
Phase 4: Advanced Features
├── CIGAR string generation (SAM/BAM format)
├── GPU memory pooling
├── Banded DP (O(k·n) for similar sequences)
└── Batch processing with Rayon
Phase 5: HMM & MSA
├── HMMER3 parser (real PFAM database compatibility)
├── Karlin-Altschul E-values
├── Viterbi HMM decoder
├── Profile-to-profile DP
└── MSA alignment metrics
```
### Module Structure
```
src/
├── error.rs # Error types (OmicsError, detailed diagnostics)
├── lib.rs # Public API exports
├── protein/
│ └── mod.rs # AminoAcid enum, Protein struct
├── scoring/
│ └── mod.rs # ScoringMatrix, AffinePenalty, PSSM
└── alignment/
├── mod.rs # Alignment algorithms (Smith-Waterman, Needleman-Wunsch)
├── kernel/
│ ├── mod.rs # Kernel selection (CPU detection)
│ ├── scalar.rs # Portable scalar implementation
│ ├── avx2.rs # AVX2 8-wide SIMD (x86-64)
│ └── neon.rs # NEON 4-wide SIMD (ARM64)
├── cigar_gen.rs # CIGAR string generation (9 op types)
├── gpu_memory.rs # GPU memory pooling (CUDA/HIP/Vulkan ready)
├── batch.rs # Batch processing with Rayon
├── banded_dp.rs # Banded DP (O(k·n) complexity)
├── bam.rs # BAM binary format serialization
├── hmmer3_parser.rs # HMMER3 format parser
├── profile_dp.rs # Profile-to-profile DP
└── simd_viterbi.rs # Vectorized Viterbi HMM decoder
```
### Key Design Patterns
**Kernel Selection (Runtime CPU Detection)**
```rust
// src/alignment/kernel/mod.rs
pub fn select_kernel(seq1: &[AminoAcid], seq2: &[AminoAcid]) -> KernelType {
if is_x86_feature_detected!("avx2") {
KernelType::Avx2 // Use 8-wide SIMD on x86-64
} else if cfg!(target_arch = "aarch64") {
KernelType::Neon // Use 4-wide NEON on ARM64
} else {
KernelType::Scalar // Fallback for compatibility
}
}
```
**Striped SIMD Layout (Cache-Efficient)**
```rust
// Organizes DP matrix to maximize SIMD parallelism
// For profile DP: positions × amino acids arranged in SIMD-friendly blocks
// Reduces memory bandwidth bottlenecks by 40-60%
```
**GPU Memory Management**
```rust
// src/alignment/gpu_memory.rs
// Uses memory pooling to amortize allocation overhead
// Automatic host↔device synchronization with CUDA/HIP/Vulkan
```
### Backup Files & Archive Strategy
When implementing major feature enhancements, original implementations are archived locally but excluded from git:
| Original implementations | `*_original.rs` | Pre-enhancement baselines | `phylogeny_likelihood_original.rs`, `msa_profile_alignment_original.rs` |
| Previous variants | `*_old.rs` | SIMD kernel iterations | Previous scalar-only implementations |
| Staged enhancements | `*_enhanced.rs` | Temporary staging files | Development-in-progress modules |
**Git Configuration** (in `.gitignore`):
```
# Enhanced implementation backups (Phase 3)
src/futures/*_original.rs
src/alignment/*_old.rs
src/futures/*_enhanced.rs
src/alignment/*_enhanced.rs
```
**Benefits for Development**:
- ✅ Local regression testing against original implementations
- ✅ Quick rollback if enhancements introduce issues
- ✅ Preserve architectural decisions without cluttering history
- ✅ Support A/B performance comparisons during optimization
**Workflow for Feature Enhancement**:
1. **Before starting**: `cp original_file.rs original_file_original.rs` (local archive)
2. **During development**: Make edits to `original_file.rs`
3. **Testing**: Compare against `original_file_original.rs` for regression
4. **Commit**: Add `.gitignore` patterns; push main implementation only
5. **Archive**: Keep `*_original.rs` locally for future reference (not in git)
## Development Workflow
### 1. Feature Branch Workflow
```bash
# Create feature branch from master
git checkout -b feature/my-feature
# Make changes in src/ module
vim src/module/feature.rs
# Add comprehensive tests
vim src/module/feature.rs # Add #[cfg(test)] mod tests
# Verify compilation and tests
cargo test --lib
# Stage changes
git add src/module/
# Commit with conventional format
git commit -m "feat(module): descriptive message"
# Push to fork
git push origin feature/my-feature
# Create pull request with test results and benchmarks
```
### 2. Code Standards for New Modules
**All public APIs must have:**
- Doc comments with /// describing purpose
- Example code blocks in doc comments
- Unit tests in #[cfg(test)] blocks
- Error handling via Result<T> enums
- No panics (assertions only in tests)
```rust
/// Aligns two protein sequences using Smith-Waterman algorithm.
///
/// # Arguments
/// * `seq1` - First protein sequence
/// * `seq2` - Second protein sequence
/// * `matrix` - Scoring matrix (typically BLOSUM62)
/// * `penalty` - Gap penalty parameters
///
/// # Returns
/// AlignmentResult with alignment score, identity %, gaps, and CIGAR string
///
/// # Examples
/// ```ignore
/// let seq1 = Protein::from_str("MKFLK").unwrap();
/// let seq2 = Protein::from_str("MKLK").unwrap();
/// let result = smith_waterman(&seq1, &seq2, &matrix, &penalty)?;
/// assert!(result.score > 0);
/// ```
pub fn smith_waterman(
seq1: &Protein,
seq2: &Protein,
matrix: &ScoringMatrix,
penalty: &AffinePenalty,
) -> Result<AlignmentResult> {
// Implementation
}
```
### 3. Performance Optimization Sequence
1. **Profile first** - Use `cargo bench` to identify bottleneck
2. **Implement scalar baseline** - Ensure correctness before SIMD
3. **Add unit tests** - Test both scalar and SIMD against known values
4. **Apply SIMD carefully** - Use `#[cfg(target_arch)]` for portability
5. **Verify gains** - Benchmark SIMD vs scalar (should see 4-15x improvement)
6. **Document** - Note SIMD optimizations in doc comments
## Debugging Tips
### Issue: Test Fails with "Mismatched CIGAR"
**Root Cause**: CIGAR operation types incorrect (e.g., using old enum names)
**Solution**: Verify CIGAR enum matches SAM spec
```rust
// Correct enum names (SAM format)
pub enum CigarOp {
SeqMatch, // M (sequence match, no substitutions)
Insertion, // I (insertion in query)
Deletion, // D (deletion in query)
Skip, // N (skipped region in reference)
SoftClip, // S (soft clipping in query)
HardClip, // H (hard clipping in query)
Padding, // P (padding in alignment)
SeqMismatch, // = (sequence mismatch)
Difference, // X (sequence difference)
}
```
### Issue: E-Values Seem Too Large
**Root Cause**: Karlin-Altschul parameters not matching database
**Solution**: Verify against NCBI BLAST default parameters
```rust
// Default BLOSUM62 parameters (from NCBI BLAST)
K = 0.035 // K statistic
lambda = 0.3176 // Lambda parameter
```
### Issue: SIMD Kernel Shows No Speedup
**Root Cause**: SIMD instructions not being generated by compiler
**Solution**: Check target features
```bash
# Verify AVX2 instructions in binary
# Should show non-zero count of SIMD instructions
```
### Issue: GPU Memory Allocation Fails
**Root Cause**: GPU not found or insufficient memory
**Solution**: Check GPU configuration
```bash
# List GPUs
nvidia-smi # For NVIDIA (CUDA)
rocm-smi # For AMD (HIP)
vulkaninfo # For Vulkan support
```
## Release Process Checklist
Before releasing new version:
- [ ] All 180 tests passing (`cargo test --lib`)
- [ ] No compiler warnings (`cargo clippy --release -- -D warnings`)
- [ ] Code formatted (`cargo fmt --check`)
- [ ] Documentation builds (`cargo doc --no-deps`)
- [ ] Benchmarks run successfully (`cargo bench`)
- [ ] README.md updated with new features
- [ ] CHANGELOG.md updated with version history
- [ ] Version bumped in Cargo.toml
- [ ] Examples all run (`cargo run --example *`)
- [ ] Git status clean (`git status`)
- [ ] Commits follow conventional format
- [ ] Tag created (`git tag -a v0.X.0`)
- [ ] Tag pushed (`git push origin v0.X.0`)
## Production Deployment
### Build Optimized Release Binary
```bash
# Full release optimization
RUSTFLAGS="-C target-cpu=native -C lto=fat" cargo build --release
# Generate binary
ls -lh target/release/libomicsx.so # ~143 KB
# Strip symbols if needed
strip target/release/libomicsx.so
```
### Platform-Specific Builds
```bash
# x86-64 (AVX2 support)
cargo build --release
# ARM64 (NEON support)
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu
# Test cross-platform compatibility
cargo test --lib --target aarch64-unknown-linux-gnu
```
### Performance Validation
```bash
# Verify scaling (should see linear to super-linear improvement)
cargo bench -- scaling
# Check memory usage (should be < 500 MB for 100M amino acid database)
/usr/bin/time -v cargo bench -- large_sequences
# Profile hotspots
perf record -g cargo bench
perf report
```
## Contributing Guidelines
When contributing new work:
1. **Create focused PRs** - One logical feature per PR
2. **Test thoroughly** - All edge cases covered
3. **Document public APIs** - Examples provided
4. **Follow code style** - Run `cargo fmt`
5. **Benchmark changes** - Show performance impact
6. **Update docs** - README.md, CHANGELOG.md, DEVELOPMENT.md
7. **Squash if needed** - Keep commit history clean
# Create PR on GitHub
```
## Debugging Tips
### Print Debugging
```rust
dbg!(variable);
eprintln!("Debug: {:?}", value);
```
### Run with Backtrace
```bash
RUST_BACKTRACE=1 cargo test --lib -- --nocapture
RUST_BACKTRACE=full cargo run --example basic_alignment
```
### Conditional Compilation
```rust
#[cfg(debug_assertions)]
eprintln!("Debug info");
#[cfg(all(test, feature = "debug"))]
fn dump_state() { /* ... */ }
```
## Performance Optimization Workflow
1. **Baseline**: `cargo bench --bench alignment_benchmarks -- --save-baseline main`
2. **Change code**: Make optimization
3. **Rerun**: `cargo bench --bench alignment_benchmarks -- --baseline main`
4. **Compare**: Review criterion output in `target/criterion/`
## Cross-Platform Testing
```bash
# Test on x86-64
cargo test --lib
# Build for ARM64 (if cross installed)
cargo build --release --target aarch64-unknown-linux-gnu
# Check SIMD features
## Documentation
```bash
# Generate and open docs
cargo doc --open
# Generate with private items
cargo doc --no-deps --document-private-items --open
# Check documentation compiles
cargo test --doc
```
## Common Issues
### "No such file or directory" errors
```bash
cargo clean
cargo build --release
```
### Test failures after git pull
```bash
git clean -fdx
cargo test --lib
```
### Benchmark time is unstable
- Close other applications
- Use criterion's statistical features
- Run multiple times with `--measurement-time`
### SIMD not vectorizing
```bash
```
## Release Workflow
See `.github/RELEASE_CHECKLIST.md` for detailed release process.
## Questions?
- Review existing tests for patterns
- Check `CONTRIBUTING.md` for guidelines
- See `README.md` for API examples
- Review `.github/copilot-instructions.md` for architecture notes