orphos_core/lib.rs
//! # Orphos Gene Finder - Rust Implementation
//!
//! A high-performance Rust implementation of the Orphos prokaryotic gene finding algorithm.
//! This library provides both single genome and metagenomic gene prediction capabilities with
//! support for parallel processing.
//!
//! ## Overview
//!
//! Orphos (Prokaryotic Dynamic Programming Gene-finding Algorithm) is an unsupervised machine
//! learning method for finding genes in prokaryotic genomes. This Rust implementation maintains
//! compatibility with the original C version while offering improved performance and safety.
//!
//! ## Features
//!
//! - **Single Genome Mode**: Train on a complete genome for optimal gene prediction
//! - **Metagenomic Mode**: Predict genes in fragmented or mixed sequences
//! - **Multiple Output Formats**: Support for GenBank, GFF3, GCA, and SCO formats
//! - **Parallel Processing**: Multi-threaded execution using Rayon
//! - **Type Safety**: Compile-time guarantees for training states
//!
//! ## Quick Start
//!
//! ```rust,no_run
//! use orphos_core::{OrphosAnalyzer, config::OrphosConfig};
//!
//! // Create analyzer with default configuration
//! let mut analyzer = OrphosAnalyzer::new(OrphosConfig::default());
//!
//! // Analyze a genome sequence
//! let results = analyzer.analyze_sequence(
//!     "ATGCGATCGATCG...",
//!     Some("MyGenome".to_string())
//! )?;
//!
//! println!("Found {} genes", results.genes.len());
//! # Ok::<(), orphos_core::types::OrphosError>(())
//! ```
//!
//! ## Architecture
//!
//! The library uses a type-state pattern to ensure training is performed before gene prediction:
//!
//! ```rust,no_run
//! use orphos_core::engine::{UntrainedOrphos, Orphos, Untrained};
//! use orphos_core::config::OrphosConfig;
//! use orphos_core::sequence::encoded::EncodedSequence;
//!
//! // Create an untrained analyzer
//! let mut untrained = UntrainedOrphos::with_config(OrphosConfig::default())?;
//!
//! // Encode the sequence
//! let encoded = EncodedSequence::without_masking(b"ATGCGATCGATCG...");
//!
//! // Train on the genome (type changes to TrainedOrphos)
//! let trained = untrained.train_single_genome(&encoded)?;
//!
//! // Alternatively, use the higher-level API to find genes
//! use orphos_core::OrphosAnalyzer;
//! let mut analyzer = OrphosAnalyzer::new(OrphosConfig::default());
//! let results = analyzer.analyze_sequence("ATGCGATCGATCG...", None)?;
//! println!("Found {} genes", results.genes.len());
//! # Ok::<(), orphos_core::types::OrphosError>(())
//! ```
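//!
//! To illustrate the general type-state idea on its own, here is a minimal, self-contained
//! sketch. The `Model`, `Untrained`, and `Trained` names and the GC-content "training" step
//! are hypothetical stand-ins chosen for illustration, not this crate's actual types or
//! training logic. Training consumes the untrained value and returns a differently typed
//! one, so calling a post-training method too early is a compile error rather than a
//! runtime failure:
//!
//! ```rust
//! // Hypothetical state markers; the trained state carries the learned data.
//! struct Untrained;
//! struct Trained {
//!     gc_content: f64,
//! }
//!
//! // Hypothetical model whose training state is part of its type.
//! struct Model<S> {
//!     state: S,
//! }
//!
//! impl Model<Untrained> {
//!     fn new() -> Self {
//!         Model { state: Untrained }
//!     }
//!
//!     // Consumes the untrained model and returns a trained one.
//!     fn train(self, seq: &[u8]) -> Model<Trained> {
//!         let gc = seq.iter().filter(|&&b| b == b'G' || b == b'C').count();
//!         let gc_content = if seq.is_empty() {
//!             0.0
//!         } else {
//!             gc as f64 / seq.len() as f64
//!         };
//!         Model { state: Trained { gc_content } }
//!     }
//! }
//!
//! impl Model<Trained> {
//!     // Only defined for the trained state.
//!     fn gc_content(&self) -> f64 {
//!         self.state.gc_content
//!     }
//! }
//!
//! fn main() {
//!     let trained = Model::<Untrained>::new().train(b"ATGC");
//!     println!("GC content: {}", trained.gc_content());
//!     // Model::<Untrained>::new().gc_content(); // would not compile
//! }
//! ```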
//!
//! ## Module Organization
//!
//! - [`config`]: Configuration options for analysis
//! - [`engine`]: Main analysis engine and training logic
//! - [`types`]: Core data types and structures
//! - [`results`]: Gene prediction results
//! - [`sequence`]: Sequence encoding and manipulation
//! - [`training`]: Training algorithms for gene models
//! - [`node`]: Gene node management and scoring
//! - [`algorithms`]: Core gene-finding algorithms
//! - [`output`]: Output formatting for various file types
//! - [`bitmap`]: Efficient sequence encoding utilities
//!
//! ## Output Formats
//!
//! The library supports multiple output formats configured via [`config::OutputFormat`]:
//!
//! - **GenBank (GBK)**: Rich feature annotation format
//! - **GFF3**: General Feature Format version 3
//! - **GCA**: Gene coordinate annotation
//! - **SCO**: Simple coordinate output
//!
//! ## Error Handling
//!
//! All fallible operations return [`Result<T, OrphosError>`](types::OrphosError),
//! providing detailed error information for:
//!
//! - Invalid sequences (too short, invalid characters)
//! - I/O errors during file operations
//! - Training failures
//! - Configuration errors
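//!
//! The sketch below shows how a caller might branch on such error categories. It is
//! self-contained and uses a hypothetical `SketchError` enum and `validate` helper that
//! only mirror the categories listed above; the real variants of
//! [`types::OrphosError`] and the crate's actual validation rules may differ:
//!
//! ```rust
//! use std::fmt;
//!
//! // Hypothetical error type for illustration only (not `OrphosError`).
//! #[derive(Debug)]
//! enum SketchError {
//!     InvalidSequence(String),
//!     Training(String),
//!     Config(String),
//! }
//!
//! impl fmt::Display for SketchError {
//!     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
//!         match self {
//!             SketchError::InvalidSequence(msg) => write!(f, "invalid sequence: {}", msg),
//!             SketchError::Training(msg) => write!(f, "training failed: {}", msg),
//!             SketchError::Config(msg) => write!(f, "configuration error: {}", msg),
//!         }
//!     }
//! }
//!
//! impl std::error::Error for SketchError {}
//!
//! // Hypothetical check: reject input that is too short or contains
//! // characters other than A, C, G, T, or N (case-insensitive).
//! fn validate(seq: &str, min_len: usize) -> Result<(), SketchError> {
//!     if seq.len() < min_len {
//!         return Err(SketchError::InvalidSequence(format!(
//!             "got {} bases, need at least {}",
//!             seq.len(),
//!             min_len
//!         )));
//!     }
//!     if let Some(c) = seq.chars().find(|c| !"ACGTNacgtn".contains(*c)) {
//!         return Err(SketchError::InvalidSequence(format!(
//!             "unexpected character '{}'",
//!             c
//!         )));
//!     }
//!     Ok(())
//! }
//!
//! fn main() {
//!     match validate("ATGX", 4) {
//!         Ok(()) => println!("sequence accepted"),
//!         // Handle one category specifically...
//!         Err(SketchError::InvalidSequence(msg)) => eprintln!("bad input: {}", msg),
//!         // ...and fall back to the Display message for the rest.
//!         Err(other) => eprintln!("analysis failed: {}", other),
//!     }
//! }
//! ```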

pub mod algorithms;
pub mod bitmap;
pub mod config;
pub mod constants;
pub mod engine;
pub mod metagenomic;
pub mod node;
pub mod output;
pub mod results;
pub mod sequence;
pub mod training;
pub mod types;

pub use engine::OrphosAnalyzer;

use crate::{node::rbs_score, types::OrphosError};