//! *Rust data structures and parser for the [UniProtKB database(s)].*
//!
//! [UniProtKB database(s)]: https://www.uniprot.org/
//!
//! # 🔌 Usage
//!
//! All `parse` functions take a [`BufRead`] implementor as input.
//! Additionally, when compiling with the [`threading`] feature, they
//! require the input to be [`Send`] and `'static` as well. They use
//! the [`uniprot::Parser`], which is either [`SequentialParser`] or
//! [`ThreadedParser`] depending on the compilation features.
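//!
//! As a minimal sketch of what these bounds mean in practice, the snippet
//! below declares a hypothetical helper, `first_chunk_len` (not part of this
//! crate), with the same `BufRead + Send + 'static` bounds. It relies only on
//! the standard library and shows two owned readers that satisfy them.
//!
//! ```rust
//! use std::io::{BufRead, BufReader, Cursor};
//!
//! // Hypothetical helper, not part of `uniprot`: it accepts the same kind of
//! // input as the `parse` functions do when `threading` is enabled, and
//! // returns the number of bytes made available by the first buffer fill.
//! fn first_chunk_len<B: BufRead + Send + 'static>(mut reader: B) -> usize {
//!     reader.fill_buf().map(|buf| buf.len()).unwrap_or(0)
//! }
//!
//! // Both readers own their data, so they are `Send + 'static`.
//! fn demo() -> (usize, usize) {
//!     let xml = b"<uniprot></uniprot>".to_vec();
//!     let cursor = Cursor::new(xml.clone());
//!     let buffered = BufReader::new(Cursor::new(xml));
//!     (first_chunk_len(cursor), first_chunk_len(buffered))
//! }
//! ```
//!
//! A reader that borrows from a local, non-`'static` buffer would be rejected
//! by such bounds, so an owned source (a `File`, a `Vec<u8>`, a network
//! stream) is the safer choice when the `threading` feature is enabled.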
//!
//! ## 🗄️ Databases
//!
//! ### UniProt
//!
//! The [`uniprot::uniprot::parse`] function can be used to obtain an iterator
//! over the entries ([`uniprot::uniprot::Entry`]) of a UniProtKB database in
//! XML format (either [SwissProt] or [TrEMBL]).
//!
//! ```rust
//! extern crate uniprot;
//!
//! let f = std::fs::File::open("tests/uniprot.xml")
//!     .map(std::io::BufReader::new)
//!     .unwrap();
//!
//! for r in uniprot::uniprot::parse(f) {
//!     let entry = r.unwrap();
//!     // ... process the UniProt entry ...
//! }
//! ```
//!
//! The XML format is compatible with the results returned by the UniProt API,
//! so you can also use [`uniprot::uniprot::parse`] to parse search results:
//!
//! ```rust
//! extern crate ureq;
//! extern crate libflate;
//! extern crate uniprot;
//!
//! let query = "colicin";
//! let req = ureq::get("https://rest.uniprot.org/uniprotkb/search")
//!     .set("Accept", "application/xml")
//!     .query("query", &format!("reviewed:true AND {}", query))
//!     .query("format", "xml")
//!     .query("compress", "true");
//! let reader = libflate::gzip::Decoder::new(req.call().unwrap().into_reader()).unwrap();
//!
//! for r in uniprot::uniprot::parse(std::io::BufReader::new(reader)) {
//!     let entry = r.unwrap();
//!     // ... process the UniProt entry ...
//! }
//! ```
//!
//! ### UniRef
//!
//! The [`uniprot::uniref::parse`] function can be used to obtain an iterator
//! over the entries ([`uniprot::uniref::Entry`]) of a UniRef database in XML
//! format ([UniRef100], [UniRef90], or [UniRef50]).
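//!
//! By analogy with the UniProt example above, a minimal sketch could look
//! like the following. The file path is hypothetical, and the loop assumes
//! that the iterator yields `Result` items like [`uniprot::uniprot::parse`]
//! does.
//!
//! ```rust,no_run
//! extern crate uniprot;
//!
//! // Hypothetical path: replace it with the location of a UniRef XML file.
//! let f = std::fs::File::open("tests/uniref50.xml")
//!     .map(std::io::BufReader::new)
//!     .unwrap();
//!
//! for r in uniprot::uniref::parse(f) {
//!     let entry = r.unwrap();
//!     // ... process the UniRef entry ...
//! }
//! ```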
//!
//! ### UniParc
//!
//! The [`uniprot::uniparc::parse`] function can be used to obtain an iterator
//! over the entries ([`uniprot::uniparc::Entry`]) of a UniParc database in
//! XML format.
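//!
//! A minimal sketch, mirroring the UniProt example above, might read as
//! follows. The file path is hypothetical, and the loop assumes that the
//! iterator yields `Result` items like [`uniprot::uniprot::parse`] does.
//!
//! ```rust,no_run
//! extern crate uniprot;
//!
//! // Hypothetical path: replace it with the location of a UniParc XML file.
//! let f = std::fs::File::open("tests/uniparc.xml")
//!     .map(std::io::BufReader::new)
//!     .unwrap();
//!
//! for r in uniprot::uniparc::parse(f) {
//!     let entry = r.unwrap();
//!     // ... process the UniParc entry ...
//! }
//! ```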
//!
//! ## 📦 Decoding Gzip
//!
//! If parsing a Gzipped file, you can use [`flate2::read::GzDecoder`] or
//! [`libflate::gzip::Decoder`] to decode the input stream, and then simply
//! wrap it in a [`BufReader`][`BufferedReader`]. Note that [`flate2`] has
//! slightly better performance, but binds to C, while [`libflate`] is a
//! pure Rust implementation.
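//!
//! As a rough sketch of that wrapping, the snippet below decodes a local file
//! with `libflate` and buffers the decoded stream before parsing. The file
//! name is hypothetical; `flate2::read::GzDecoder::new(f)` could be swapped
//! in for the decoder, in which case no `unwrap` is needed on construction.
//!
//! ```rust,no_run
//! extern crate libflate;
//! extern crate uniprot;
//!
//! // Hypothetical path to a Gzip-compressed UniProtKB XML file.
//! let f = std::fs::File::open("uniprot_sprot.xml.gz").unwrap();
//! // Decompress on the fly, then buffer the decoded stream.
//! let decoder = libflate::gzip::Decoder::new(f).unwrap();
//! let reader = std::io::BufReader::new(decoder);
//!
//! for r in uniprot::uniprot::parse(reader) {
//!     let entry = r.unwrap();
//!     // ... process the UniProt entry ...
//! }
//! ```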
//!
//! ## 📧 Downloading from FTP
//!
//! UniProt is available from the following two locations: [ftp.ebi.ac.uk]
//! and [ftp.uniprot.org], the former being located in Europe while the
//! latter is in the United States. The `ftp` crate can be used to open
//! a connection and parse the databases on-the-fly: see the example in the
//! [`uniprot::uniprot::parse`] documentation for a code snippet.
//!
//! ## 📧 Downloading from HTTP
//!
//! If FTP is not available, note that the EBI FTP server can also be reached
//! using HTTP at [http://ftp.ebi.ac.uk]. This allows using HTTP libraries
//! instead of FTP ones to reach the release files.
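//!
//! A possible sketch with `ureq` and `libflate`, reusing the calls from the
//! search example above, is shown below. The release file path on the EBI
//! server is an assumption modelled on the [SwissProt] link, so check it
//! against the server listing before relying on it.
//!
//! ```rust,no_run
//! extern crate ureq;
//! extern crate libflate;
//! extern crate uniprot;
//!
//! // Assumed location of the SwissProt release file on the EBI server.
//! let url = "http://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.xml.gz";
//! let response = ureq::get(url).call().unwrap();
//! // Decode the Gzip stream as it is downloaded.
//! let reader = libflate::gzip::Decoder::new(response.into_reader()).unwrap();
//!
//! for r in uniprot::uniprot::parse(std::io::BufReader::new(reader)) {
//!     let entry = r.unwrap();
//!     // ... process the UniProt entry ...
//! }
//! ```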
//!
//!
//! # 📝 Features
//!
//! ## `threading` - _**enabled** by default_.
//!
//! The `threading` feature compiles the parser module in multi-threaded mode.
//! This feature greatly improves parsing speed and efficiency, but removes
//! any guarantee about the order the entries are yielded in.
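//!
//! If a stable entry order matters more than throughput, the feature can
//! presumably be switched off with Cargo's standard `default-features` key,
//! as sketched below. The version requirement is a placeholder, and turning
//! off default features also disables any other default feature the crate
//! may define.
//!
//! ```toml
//! [dependencies]
//! # Placeholder version requirement: pin it to the release you actually use.
//! uniprot = { version = "*", default-features = false }
//! ```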
//!
//! # 📋 Changelog
//!
//! This project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html)
//! and provides a [changelog](https://github.com/althonos/uniprot.rs/blob/master/CHANGELOG.md)
//! in the [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) format.
//!
//! # 📜 License
//!
//! This library is provided under the open-source
//! [MIT license](https://choosealicense.com/licenses/mit/).
//!
//! [http://ftp.ebi.ac.uk]: http://ftp.ebi.ac.uk
//! [ftp.ebi.ac.uk]: ftp://ftp.ebi.ac.uk
//! [ftp.uniprot.org]: ftp://ftp.uniprot.org
//! [`threading`]: #threading
//! [`flate2`]: https://docs.rs/flate2/
//! [`flate2::read::GzDecoder`]: https://docs.rs/flate2/latest/flate2/read/struct.GzDecoder.html
//! [`libflate`]: https://docs.rs/libflate/
//! [`libflate::gzip::Decoder`]: https://docs.rs/libflate/latest/libflate/gzip/struct.Decoder.html
//! [`BufRead`]: https://doc.rust-lang.org/std/io/trait.BufRead.html
//! [`BufferedReader`]: https://doc.rust-lang.org/std/io/struct.BufReader.html
//! [`Entry`]: ./model/struct.Entry.html
//! [`uniprot::uniprot::parse`]: ./uniprot/fn.parse.html
//! [`uniprot::uniref::parse`]: ./uniref/fn.parse.html
//! [`uniprot::uniparc::parse`]: ./uniparc/fn.parse.html
//! [`uniprot::uniprot::Entry`]: ./uniprot/struct.Entry.html
//! [`uniprot::uniref::Entry`]: ./uniref/struct.Entry.html
//! [`uniprot::uniparc::Entry`]: ./uniparc/struct.Entry.html
//! [`uniprot::Parser`]: ./type.Parser.html
//! [`SequentialParser`]: ./parser/struct.SequentialParser.html
//! [`ThreadedParser`]: ./parser/struct.ThreadedParser.html
//! [SwissProt]: https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.xml.gz
//! [TrEMBL]: https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.xml.gz
//! [UniRef100]: https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref100/uniref100.xml.gz
//! [UniRef90]: https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.xml.gz
//! [UniRef50]: https://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref50/uniref50.xml.gz
extern crate chrono;
extern crate crossbeam_channel;
extern crate lazy_static;
extern crate num_cpus;
extern crate quick_xml;
extern crate smartstring;
extern crate url;