sfwtools/lib.rs
//! # Design
//!
//! In the spirit of Software Tools, the aim is to make components re-usable
//! in three ways:
//!
//! 1. Implement core features as functions, so they can be re-used within Rust.
//!    These functions should generally return a `Result` type, so the caller
//!    can decide how to deal with the error.
//! 2. Provide executable commands with a simple interface that typically act as thin
//!    wrappers around the library functions, or perhaps combine the library
//!    functions in interesting ways.
//! 3. Write well-designed code that can be copied and repurposed when necessary.
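//!
//! As a minimal sketch of the first two points, a library function can
//! return a `Result` while a thin command-style wrapper decides how to report
//! the error. The names `count_lines` and `run_count_lines` are hypothetical
//! and not part of this crate:
//!
//! ```rust
//! use std::io::{self, BufRead};
//!
//! /// Library function: counts lines and leaves error handling to the caller.
//! fn count_lines<R: BufRead>(reader: R) -> io::Result<usize> {
//!     reader
//!         .lines()
//!         .try_fold(0usize, |n, line| line.map(|_| n + 1))
//! }
//!
//! /// Thin wrapper: turns the `Result` into output plus an exit code.
//! fn run_count_lines<R: BufRead>(reader: R) -> i32 {
//!     match count_lines(reader) {
//!         Ok(n) => {
//!             println!("{}", n);
//!             0
//!         }
//!         Err(e) => {
//!             eprintln!("count_lines: {}", e);
//!             1
//!         }
//!     }
//! }
//! ```
//!
//! Keeping the decision in the wrapper means another caller of the same
//! library function could retry or ignore the error instead.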
//!
//! A fourth avenue may be explored, which is to adopt the
//! [nushell](https://github.com/nushell/nushell) approach to transferring
//! tabular data between commands.
//!
//! For a related project that also follows Software Tools in Rust, and
//! may serve as an interesting comparison, see
//! [Sweater](https://github.com/rjbs/Sweater).
//! A more feature-rich project is [uutils coreutils](https://github.com/uutils/coreutils),
//! which, as the name suggests, is a Rust implementation analogous to
//! GNU Coreutils.
//!
//! ## Functional facilities
//!
//! Higher-order functions (HOFs) are frequently used to reduce code
//! complexity, verbosity, and the risk of errors. Primary examples are
//! `map`, `for_each` (like `map` but effectful), and `fold`. As pointed
//! out in Software Tools, p. 21, *"The best programs are designed in
//! terms of loosely coupled functions that each does a simple task."*
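//!
//! A small sketch of these three HOFs on standard iterators. The functions
//! `longest_word_len` and `report` are hypothetical illustrations, not part
//! of this crate:
//!
//! ```rust
//! /// Length of the longest whitespace-separated word, or 0 for empty input.
//! fn longest_word_len(text: &str) -> usize {
//!     text.split_whitespace()
//!         .map(|word| word.chars().count()) // transform each element
//!         .fold(0usize, |longest, len| longest.max(len)) // combine into one value
//! }
//!
//! /// `for_each` is used only for its effect (printing one line per input).
//! fn report(texts: &[&str]) {
//!     texts
//!         .iter()
//!         .for_each(|t| println!("{} -> {}", t, longest_word_len(t)));
//! }
//! ```
//!
//! Neither function needs an index or a mutable accumulator.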
//!
//! Some other references that reflect functional programming values:
//! - page 36, a discussion on `break`: the suggestions also coincide largely
//!   with recursive functions.
//! - pages 44-45 discuss defensive programming by guarding control variables
//!   with safety checks. In functional programming, such control variables
//!   often do not appear, so safety checks are unnecessary because HOFs are
//!   safe by design. Page 45 also points out that non-voluminous
//!   code listings are easier to debug (which I agree with, and a functional
//!   style typically enables this), though we also want to warn against making
//!   code overly terse. Experience is the best guide in this case.
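//!
//! To illustrate the point about `break`, an early exit from a loop can be
//! written as a recursive function whose base cases do the exiting. The
//! function `first_tab` below is a hypothetical sketch, not code from this
//! crate:
//!
//! ```rust
//! /// Index of the first tab character in `bytes`, searching from `pos`.
//! fn first_tab(bytes: &[u8], pos: usize) -> Option<usize> {
//!     match bytes.get(pos) {
//!         None => None,                         // past the end: nothing found
//!         Some(&b'\t') => Some(pos),            // plays the role of `break`
//!         Some(_) => first_tab(bytes, pos + 1), // tail call: keep searching
//!     }
//! }
//! ```
//!
//! Using `get` instead of indexing also removes the need for a separate
//! bounds check on the control variable `pos`. Very long inputs would recurse
//! deeply unless the tail call is eliminated.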
//!
//! ## Currently Implemented Tools
//! - [x] `cp`
//! - [x] `wc`
//! - [x] `detab`
//! - [x] `entab`
//! - [x] `echo`
//! - [x] `compress`
//! - [ ] `expand`
//!
//! ## Dependencies
//!
//! Since the goal is to make the software both as self-contained and
//! as illustrative as possible, we've tried to rely on very few dependencies.
//! The following exceptions exist:
//!
58//!
59//! - [fp-core](https://docs.rs/fp-core)
60//! This is what one would typically find as part the standard library
61//! in a functional language, so we have included it here. Though Rust is functional
62//! in a sense — it has lambda functions (i.e. Rust closures) and the stand library
63//! has many higher-order functions (HOFs) — its standard library doesn't include
64//! traits that are commonly found to be helpful abstracts in functional languages.
65//! We will use a few of these where it is particularly illustrative or sensible,
66//! but will stick with idiomatic Rust where that is obviously simpler.
67//! An interesting note is that filters are the subject of chapter 2 and much of
68//! the rest of the book, which are just a particular class of HOFs.
69//! - [peeking_take_while](https://docs.rs/peeking_take_while/)
70//! A small library that provides the `peeking_take_while` function for
71//! `Peekable` iterators. This behaves more of how would would expect for
72//! a `take_while` function compared to the standard `take_while` implementation,
73//! which will "lose" the first element after a `take_while` streak ends.
74//! - [tailcall](https://docs.rs/tailcall)
75//! This is a macro that enables tailcall elimination for functions that are
76//! tail recursive. In other words, instead of writing loops, we can sometimes
77//! just write a function that calls itself. Without this macro, such functions
78//! would eventually cause the stack to blow up.
79//! - [seahorse](https://docs.rs/seahorse)
80//! Seahorse is a minimal argument parser. Judging by some results
81//! returned by Google, [clap](https://clap.rs) is far more popular, but
82//! has additional dependencies; we are striving for being as portable
83//! as possible, so the minimality seemed to line up with that
84//! goal. Additionally, Clap doesn't appear to allow passing in argument
85//! lists directly, which is useful for maintaining separate commands
86//! that build on each other. In any case, argument parsing is only used
87//! very late in the application logic, and most of the API could be used
88//! without worrying about it.
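//!
//! The "lost element" behavior of the standard `take_while`, mentioned under
//! `peeking_take_while` above, can be seen with the standard library alone.
//! The sketch below does not use the `peeking_take_while` crate; it contrasts
//! `take_while` with `Peekable::next_if`, a standard-library method that
//! likewise leaves the first non-matching element in the iterator. Both
//! function names are hypothetical:
//!
//! ```rust
//! /// Skips leading digits with `take_while`; the first non-digit is
//! /// consumed by the adaptor and therefore missing from the result.
//! fn rest_after_take_while(s: &str) -> String {
//!     let mut chars = s.chars();
//!     let _digits: String = chars
//!         .by_ref()
//!         .take_while(|c| c.is_ascii_digit())
//!         .collect();
//!     chars.collect()
//! }
//!
//! /// Skips leading digits by peeking, so the first non-digit is kept.
//! fn rest_after_peeking(s: &str) -> String {
//!     let mut chars = s.chars().peekable();
//!     while chars.next_if(|c| c.is_ascii_digit()).is_some() {}
//!     chars.collect()
//! }
//! ```
//!
//! For the input `"12ab"`, the first function returns `"b"` and the second
//! returns `"ab"`.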
//!
//! ### Currently unused
//!
//! - [byteorder](https://docs.rs/byteorder)
//!   Library for reading/writing numbers
//!   in big-endian and little-endian. This is a somewhat low-level library,
//!   but as this is an IO-heavy library of tools, it may make sense to rely
//!   on it.
//! - [im](https://docs.rs/im)
//!   Immutable data structures that implement structural sharing can be
//!   even more performant than `std`'s mutable structures for large
//!   data types, and while Rust makes mutation far safer than most languages,
//!   mutation can still result in confusion at times, so in the cases where
//!   clarity is more important than performance (or performance doesn't
//!   matter much, e.g. one-offs), it may be preferable to use immutable data
//!   structures.
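//!
//! As a point of comparison for the `byteorder` entry above, fixed-width
//! integers can already be converted to and from big-endian or little-endian
//! bytes with the standard library. This is only a sketch of that baseline;
//! the helper names are hypothetical and nothing here reflects how this crate
//! would use `byteorder`:
//!
//! ```rust
//! use std::convert::TryInto;
//!
//! /// Reads a big-endian `u32` from the first four bytes, if there are four.
//! fn read_u32_be(bytes: &[u8]) -> Option<u32> {
//!     let head: [u8; 4] = bytes.get(..4)?.try_into().ok()?;
//!     Some(u32::from_be_bytes(head))
//! }
//!
//! /// Encodes a `u32` as four little-endian bytes.
//! fn write_u32_le(n: u32) -> [u8; 4] {
//!     n.to_le_bytes()
//! }
//! ```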
//!
//!
//! ## Build
//!
//! ## Misc Notes
//!
//! ### Using todo!()
//!
//! Using `todo!()` from `std::todo` is a helpful way to incrementally
//! develop a feature while still getting feedback from the
//! compiler. [**TODO**: show example]
//!
//! A [caveat](https://github.com/rust-lang/rfcs/issues/3045) is that
//! currently you need code in the function after the `todo!()`, even
//! if it doesn't match the type. For instance, we can use a function
//! like:
//!
//! ```
//! pub fn some_num() -> i32 {
//!     todo!(); ();
//! }
//! ```
//!
//! Most beneficial is that `rustc` will warn you if a `todo!()` is
//! left in your code, since it would result in a panic if that execution
//! path were to occur.
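//!
//! A minimal sketch of the incremental workflow described in this section:
//! one branch is finished, another is stubbed with `todo!()`, and the whole
//! function still type-checks. The function `parse_flag` and its flags are
//! hypothetical:
//!
//! ```rust
//! /// Interprets a single command-line flag.
//! fn parse_flag(arg: &str) -> Option<bool> {
//!     match arg {
//!         "-n" => Some(true),
//!         // Not written yet: compiles, but panics if this branch runs.
//!         "-e" => todo!("escape handling"),
//!         _ => None,
//!     }
//! }
//! ```
//!
//! Calling `parse_flag("-n")` works already, while `parse_flag("-e")` would
//! panic with a "not yet implemented" message.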
//!
//! ### Rust on nix
//!
//! ```plain
//! nix-shell -p rustup cargo
//! ```
//!
//! ### Optimizing for size
//!
//! * https://github.com/johnthagen/min-sized-rust
//!
//! Currently, to generate small builds the following commands
//! are required.
//!
//! 1. (only once per environment) Make source code for the standard library available:
//!
//! ```plain
//! rustup component add rust-src --toolchain nightly
//! ```
//!
//! 2. Build in release mode on nightly, rebuilding the standard library with `-Z build-std`:
//!
//! ```plain
//! cargo +nightly build -Z build-std --target x86_64-unknown-linux-gnu --release
//! ```
//!
//! 3. (optional) `strip` binary - see links in notes
//!

//!
//! ## Project administration
//!
//! ### Git hooks
//!
//! #### Cargo-Husky
//!
//! We use [cargo-husky](https://github.com/rhysd/cargo-husky) to keep in
//! line; it enforces several checks with a `pre-push` hook. Sometimes it
//! is a bit restrictive, so if we need to push in-progress work to a
//! branch, we can use `git push --no-verify -u origin feature_branch`.
//! Cargo-husky expects certain files to be at the root of the repository,
//! thus the symlinks.
//!
//! #### pre-commit
//!
//! We include the following, less stringent checks for pre-commit.
//!
//! ```bash
//! #!/bin/bash
//! # (bash rather than sh, because the script uses the [[ ... ]] test syntax)
//!
//! # Put in your Rust repository's .git/hooks/pre-commit to ensure you never
//! # break rustfmt.
//! #
//! # WARNING: rustfmt is a fast moving target so ensure you have the version that
//! # all contributors have.
//!
//! for FILE in $(git diff --cached --name-only); do
//!     if [[ -f "$FILE" ]] && [[ $FILE == *.rs ]] \
//!         && ! rustup run nightly rustfmt --unstable-features \
//!             --skip-children "$FILE"; then
//!         echo "Commit rejected due to invalid formatting of \"$FILE\" file."
//!         exit 1
//!     fi
//! done
//!
//! cd Rust/sfw-tools && cargo readme > README.md && git add README.md
//! ```
//! As can be seen, this also generates the README from doc comments in `lib.rs`.
//!

#![deny(unused_must_use)]

use std::env;
use std::io::Error;

use seahorse::{App, Command, Context};

pub mod bytes_iter;
pub use bytes_iter::BytesIter;

pub mod constants;
pub use constants::*;

pub mod error;
pub use error::*;

pub mod iter_extra;
pub use iter_extra::*;

pub mod util;
pub use util::*;

// Following are re-exports for specific functionality //

pub mod copying;
pub use copying::*;

pub mod counting;
pub use counting::*;

pub mod tabs;
pub use tabs::*;

pub mod compression;
pub use compression::*;

/// Returns the command name (the first argument) together with the
/// remaining arguments.
pub fn get_args() -> Result<(String, Vec<String>), Error> {
    let mut args_in = env::args();
    let cmd = args_in.next().sfw_err("Impossible: no first arg!")?;
    let args_out: Vec<String> = args_in.collect();
    Ok((cmd, args_out))
}

/// This is a wrapper around the Seahorse `App.run` that emits
/// a nicer user error message if there are no arguments provided.
pub fn run_app(app: App, args: Vec<String>, arg_err: &str) {
    match args.len() {
        0 => user_exit(&format!("{}: Zero arguments in run_app", arg_err)),
        _ => app.run(args),
    }
}

/// Prints the arguments to standard output, separated by single spaces.
pub fn echo(args: &[String]) {
    println!("{}", args.join(" "))
}

/// Builds the Seahorse `App` for the `echo` tool.
pub fn echo_app() -> App {
    App::new("echo")
        .author("Brandon Elam Barker")
        .action(run_echo_seahorse_action)
        .command(run_echo_seahorse_cmd())
}

const ECHO_USAGE: &str = "echo [STRING]";

/// Builds the Seahorse `Command` for `echo`.
pub fn run_echo_seahorse_cmd() -> Command {
    Command::new("echo")
        .description("echo: prints input arguments separated by a space")
        .usage(ECHO_USAGE)
        .action(run_echo_seahorse_action)
}

/// Seahorse action that passes the context's arguments to `echo`.
pub fn run_echo_seahorse_action(ctxt: &Context) {
    echo(&ctxt.args);
}