//! Convenience library for reading and writing compressed files and streams.
//!
//! `compress_io` does not provide the compression/decompression itself but uses external utilities
//! such as [gzip], [bzip2] or [zstd] as read or write filters. The aim of `compress_io` is to make
//! it simple for an application to support multiple compression formats with minimal effort
//! from both the developer and the user (i.e., an application can accept uncompressed
//! or compressed input in a range of different formats, and neither the developer nor the user
//! has to specify which formats have been used).
//!
//! ## Overview
//!
//! The main way to work with `compress_io` is via [`CompressIo`] (or [`AsyncCompressIo`] in the
//! case of `async` code). A reader (implementing [`Read`]), buffered reader (implementing
//! [`BufRead`]), writer or buffered writer (both implementing [`Write`]) can be generated from
//! [`CompressIo`] (or [`AsyncCompressIo`]). By default readers and writers use `stdin` and
//! `stdout`, but a file path can also be specified with [`path`]. By default `compress_io` will
//! detect the compression format of compressed input files automatically based on the initial
//! contents of the file/stream and select an appropriate utility if one is available in the user's
//! `$PATH`; the format of output files is selected based on the file extension. These automatic
//! methods can be overridden by [`ctype`]. `compress_io` will make use of parallel versions of
//! compression utilities if available. By default the compression utilities will be run
//! with their default threading options, but this behaviour can be changed using [`cthreads`].
//!
//! ## Examples
//!
//! ```no_run
//! use std::io::{self, BufRead, Write};
//! use compress_io::compress::CompressIo;
//!
//! fn main() -> io::Result<()> {
//!     // Read from a (presumably) gzipped file `foo.gz` and write out to the file `foo.xz`, which
//!     // will be compressed using [xz] (assuming both [gzip] and [xz] are in the user's `$PATH`).
//!     // In this example both read and write streams are buffered.
//!     let mut reader = CompressIo::new().path("foo.gz").bufreader()?;
//!     let mut writer = CompressIo::new().path("foo.xz").bufwriter()?;
//!     for s in reader.lines().map(|l| l.expect("Read error")) {
//!         writeln!(writer, "{}", s)?;
//!     }
//!     Ok(())
//! }
//! ```
//!
//! Decompression utilities can be specified by the user, or can be selected automatically
//! based on an examination of the first few bytes of the input.
//!
//! ```no_run
//! # use std::io;
//! use compress_io::{
//!     compress::CompressIo,
//!     compress_type::CompressType,
//! };
//!
//! # fn main() -> io::Result<()> {
//! // Open a reader from `stdin`, using the first bytes from the file to determine whether the
//! // file is compressed or not
//! let mut rd1 = CompressIo::new().reader()?;
//! // Open a buffered reader from `foo.bz2` using [bzip2] to decompress
//! let mut rd2 = CompressIo::new().path("foo.bz2").ctype(CompressType::Bzip2).bufreader()?;
//! # Ok(())
//! # }
//! ```
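//!
//! The automatic detection of input formats can be pictured as a comparison of the leading
//! bytes of the stream against the well-known magic numbers of each format. The sketch below
//! is only an illustration and is not taken from the `compress_io` implementation: the
//! `sniff_format` function and the string labels it returns are hypothetical, and only four
//! formats are covered.
//!
//! ```rust
//! /// Hypothetical helper (not part of `compress_io`): guess the compression format
//! /// from the first bytes of a stream, or return `None` if no signature matches.
//! fn sniff_format(head: &[u8]) -> Option<&'static str> {
//!     // Well-known magic numbers found at the start of each format
//!     const SIGNATURES: [(&[u8], &str); 4] = [
//!         (&[0x1f, 0x8b], "gzip"),
//!         (b"BZh", "bzip2"),
//!         (&[0xfd, b'7', b'z', b'X', b'Z', 0x00], "xz"),
//!         (&[0x28, 0xb5, 0x2f, 0xfd], "zstd"),
//!     ];
//!     SIGNATURES
//!         .iter()
//!         .find(|(magic, _)| head.starts_with(magic))
//!         .map(|(_, name)| *name)
//! }
//!
//! fn main() {
//!     assert_eq!(sniff_format(&[0x1f, 0x8b, 0x08, 0x00]), Some("gzip"));
//!     assert_eq!(sniff_format(b"BZh91AY&SY"), Some("bzip2"));
//!     // Input that matches no signature is treated as uncompressed
//!     assert_eq!(sniff_format(b"plain text"), None);
//! }
//! ```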
//!
//! Compression utilities can also either be selected explicitly, or they can
//! be set automatically based on the file name (so a file called `test.zst` would be
//! compressed using the [zstd] utility). If the compression format is selected explicitly then
//! the appropriate extension will be added to the filename unless the extension is already
//! present or the [`fix_path`] option has been selected.
//!
//! ```no_run
//! # use std::io;
//! use compress_io::{
//!     compress::CompressIo,
//!     compress_type::CompressType,
//! };
//!
//! # fn main() -> io::Result<()> {
//! // Open a compressed writer to `stdout`, using [zstd] to compress the stream
//! let mut wrt1 = CompressIo::new().ctype(CompressType::Zstd).writer()?;
//! // Open a compressed buffered writer to the file `foo.lzma` using [lzma] to compress
//! let mut wrt2 = CompressIo::new().path("foo").ctype(CompressType::Lzma).bufwriter()?;
//! # Ok(())
//! # }
//! ```
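//!
//! The rule for adding an extension to the output filename can be sketched as follows. This
//! is an illustration under stated assumptions rather than the library's own code: the
//! `output_path` function and its parameters are hypothetical, and the sketch assumes that the
//! extension is appended to the full filename (so `foo.txt` would become `foo.txt.lzma`),
//! which may differ from what `compress_io` actually does.
//!
//! ```rust
//! use std::ffi::OsStr;
//! use std::path::{Path, PathBuf};
//!
//! /// Hypothetical helper (not part of `compress_io`): append `ext` to `path` unless
//! /// the path already has that extension or `fix_path` is set.
//! fn output_path(path: &Path, ext: &str, fix_path: bool) -> PathBuf {
//!     if fix_path || path.extension() == Some(OsStr::new(ext)) {
//!         return path.to_path_buf();
//!     }
//!     // Assumption: append rather than replace any existing extension
//!     let mut name = path.as_os_str().to_os_string();
//!     name.push(".");
//!     name.push(ext);
//!     PathBuf::from(name)
//! }
//!
//! fn main() {
//!     assert_eq!(output_path(Path::new("foo"), "lzma", false), PathBuf::from("foo.lzma"));
//!     assert_eq!(output_path(Path::new("foo.lzma"), "lzma", false), PathBuf::from("foo.lzma"));
//!     assert_eq!(output_path(Path::new("foo"), "lzma", true), PathBuf::from("foo"));
//! }
//! ```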
//!
//! Several of the possible compression formats can be
//! generated by multiple utilities, and this allows alternate utilities to be used if the
//! standard utility is not available.
//!
//! For example, the standard utility for *xz* compression
//! is the [xz] tool; however, [zstd] can also perform *xz* compression and will be substituted by
//! the library if [xz] is not available. Note that if *bgzip* compression is
//! requested then only the [bgzip] utility will be used: even though *bgzip* compression is
//! compatible with the *gzip* format and can be decoded by any decompressor that handles
//! *gzip*, extra information is added during compression by [bgzip] that other utilities
//! do not generate.
//!
//! For compression, some of the utilities are multi-threaded. If multiple utilities are
//! available to perform a given compression type, preference will be given to multi-threaded
//! versions. For example, if *gzip* compression is requested and the [pigz] utility is available
//! in the current `$PATH` then this will be used in favour of [gzip]. The user can
//! specify a preference for threading (where available) using [`cthreads`].
//!
//! ```no_run
//! # use std::io;
//! use compress_io::{
//!     compress::CompressIo,
//!     compress_type::{CompressType, CompressThreads},
//! };
//!
//! # fn main() -> io::Result<()> {
//! // Open a compressed buffered writer to `stdout` (no path is given), using [zstd] with
//! // 4 threads to compress the stream
//! let mut wrt = CompressIo::new().ctype(CompressType::Zstd)
//!     .cthreads(CompressThreads::Set(4)).bufwriter()?;
//! # Ok(())
//! # }
//! ```
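//!
//! The preference for multi-threaded utilities can be pictured as a search of `$PATH` for the
//! first available candidate in an ordered list. The sketch below is an illustration only and
//! does not reproduce the library's selection logic: the `first_available` function is
//! hypothetical, and it only checks that a file with the right name exists, not that it is
//! executable.
//!
//! ```rust
//! use std::env;
//! use std::ffi::OsStr;
//! use std::path::PathBuf;
//!
//! /// Hypothetical helper (not part of `compress_io`): return the first candidate utility,
//! /// in order of preference, that exists as a file in a directory of `search_path`.
//! fn first_available<'a>(
//!     candidates: &[&'a str],
//!     search_path: &OsStr,
//! ) -> Option<(&'a str, PathBuf)> {
//!     candidates.iter().find_map(|&name| {
//!         env::split_paths(search_path)
//!             .map(|dir| dir.join(name))
//!             // Simplification: a fuller check would also test that the file is executable
//!             .find(|p| p.is_file())
//!             .map(|p| (name, p))
//!     })
//! }
//!
//! fn main() {
//!     let path = env::var_os("PATH").unwrap_or_default();
//!     // Prefer the multi-threaded `pigz` over `gzip` for *gzip* compression
//!     match first_available(&["pigz", "gzip"], &path) {
//!         Some((name, exe)) => println!("using {} at {}", name, exe.display()),
//!         None => println!("no gzip-compatible utility found"),
//!     }
//! }
//! ```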
//!
//! ## Usage
//!
//! To use `compress_io` from crates.io with synchronous code only, add it as a dependency in
//! your `Cargo.toml`:
//!
//! ```toml
//! [dependencies]
//! compress_io = "0.2"
//! ```
//!
//! For use with asynchronous code, enable the `async` feature:
//!
//! ```toml
//! [dependencies]
//! compress_io = { version = "0.2", features = ["async"] }
//! ```
//!
//! [`CompressIo`]: crate::compress::CompressIo
//! [`AsyncCompressIo`]: crate::async::compress::AsyncCompressIo
//! [`path`]: crate::compress::CompressIo::path
//! [`ctype`]: crate::compress::CompressIo::ctype
//! [`cthreads`]: crate::compress::CompressIo::cthreads
//! [`fix_path`]: crate::compress::CompressIo::fix_path
//!
//! [`Read`]: std::io::Read
//! [`BufRead`]: std::io::BufRead
//! [`Write`]: std::io::Write
//!
//! [gzip]: http://www.gzip.org/
//! [bgzip]: https://www.htslib.org/doc/bgzip.html
//! [pigz]: https://www.zlib.net/pigz/
//! [bzip2]: https://sourceware.org/bzip2/
//! [zstd]: https://facebook.github.io/zstd/
//! [xz]: https://tukaani.org/xz/
//! [lzma]: https://tukaani.org/lzma/
extern crate lazy_static;