//! Zet's overall flow is:
//! * Form a starting `ZetSet` from the lines of the first input file. Each line
//! in the set is represented by an `IndexMap` key. The `IndexMap` value
//! associated with each key is not part of the abstract set value but is
//! used for operational bookkeeping. The type of these bookkeeping values
//! depends on the operation being calculated and whether we're keeping track
//! of the number of times each line occurs or the number of files it occurs
//! in.
//! * Read the lines of each subsequent operand, updating the bookkeeping value
//! as needed in order to decide whether to insert lines into or delete lines
//! from the set.
//! * Output the lines of the resulting set, possibly annotated with a count of
//! the number of times the line appears in the input or the number of files
//! the line appears in.
//!
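//! The flow above can be sketched for an intersection using only the standard
//! library. This is an illustration under stated assumptions, not Zet's
//! implementation: `lines` and `intersect` are hypothetical helpers, and a
//! `HashMap` plus a `Vec` stands in for `IndexMap`. The bookkeeping value
//! assumed here is the number of the last file in which a line was seen.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical helper: split into lines, stripping `\n` or `\r\n`.
//! fn lines(bytes: &[u8]) -> Vec<&[u8]> {
//!     if bytes.is_empty() {
//!         return Vec::new();
//!     }
//!     // Drop one trailing newline so it does not produce an extra empty line.
//!     let body = bytes.strip_suffix(b"\n").unwrap_or(bytes);
//!     body.split(|&b| b == b'\n')
//!         .map(|line| line.strip_suffix(b"\r").unwrap_or(line))
//!         .collect()
//! }
//!
//! /// Hypothetical sketch: lines of `first` that occur in every file in `rest`.
//! fn intersect<'a>(first: &'a [u8], rest: &[&[u8]]) -> Vec<&'a [u8]> {
//!     // Step 1: keys are borrowed from the first file, in first-seen order.
//!     let mut order: Vec<&'a [u8]> = Vec::new();
//!     let mut last_seen: HashMap<&'a [u8], usize> = HashMap::new();
//!     for line in lines(first) {
//!         if last_seen.insert(line, 0).is_none() {
//!             order.push(line);
//!         }
//!     }
//!     // Step 2: each operand only updates bookkeeping; nothing is allocated.
//!     for (i, operand) in rest.iter().enumerate() {
//!         let file_number = i + 1;
//!         for line in lines(operand) {
//!             if let Some(seen) = last_seen.get_mut(line) {
//!                 *seen = file_number;
//!             }
//!         }
//!         // Delete the lines that this operand did not contain.
//!         order.retain(|line| last_seen.get(*line) == Some(&file_number));
//!         last_seen.retain(|_, seen| *seen == file_number);
//!     }
//!     // Step 3: the caller writes out the surviving lines.
//!     order
//! }
//! ```
//!
//! With no further operands, `intersect` simply returns the distinct lines of
//! the first file in their original order.
//!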
//! Zet's structure is due to the following design decisions:
//! * We read the entire contents of the first input file into memory, so we can
//! borrow the `IndexMap` key that represents each of its lines rather than
//! allocating a `Vec<u8>` for each of them. This saves both time and memory,
//! on the assumption that few lines in the first file are duplicates.
//! * We do *not* read the entire contents of subsequent files. This can cost us
//! time in key allocation, but often saves both time and memory: `Intersect`
//! and `Diff` never allocate, since they only remove lines from the set, while
//! the other operations won't do extensive allocation in the fairly common case
//! where the second and subsequent input files have few lines not already
//! present in the first file.
//! * We start output with a Unicode byte order mark if and only if the first
//! input file begins with a byte order mark.
//! * We strip the line terminator (either `\r\n` or `\n`) from the end of each
//! input line. On output, we use the line terminator found at the end of the
//! first line of the first input file.
//! * We process all input files before doing any output. (This is not
//! absolutely necessary for the `Union` operation — see the
//! [huniq](https://crates.io/crates/huniq) command. But it is for all other
//! Zet operations.)
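//!
//! The byte order mark and line terminator decisions can be illustrated with
//! a short standard-library sketch. The names `split_bom`, `first_terminator`
//! and `strip_terminator` are hypothetical, and falling back to `\n` when the
//! input contains no newline is an assumption of this sketch, not something
//! stated above.
//!
//! ```rust
//! const BOM: &[u8] = b"\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
//! const CRLF: &[u8] = b"\r\n";
//! const LF: &[u8] = b"\n";
//!
//! /// Hypothetical helper: report whether `bytes` starts with a BOM and
//! /// return the remaining bytes.
//! fn split_bom(bytes: &[u8]) -> (bool, &[u8]) {
//!     match bytes.strip_prefix(BOM) {
//!         Some(rest) => (true, rest),
//!         None => (false, bytes),
//!     }
//! }
//!
//! /// Hypothetical helper: the terminator that ends the first line.
//! /// Assumption: default to `\n` if there is no newline at all.
//! fn first_terminator(bytes: &[u8]) -> &'static [u8] {
//!     match bytes.iter().position(|&b| b == b'\n') {
//!         Some(i) if i > 0 && bytes[i - 1] == b'\r' => CRLF,
//!         _ => LF,
//!     }
//! }
//!
//! /// Hypothetical helper: strip a trailing `\n` or `\r\n` from one line.
//! fn strip_terminator(line: &[u8]) -> &[u8] {
//!     match line.strip_suffix(LF) {
//!         Some(rest) => rest.strip_suffix(b"\r").unwrap_or(rest),
//!         None => line,
//!     }
//! }
//! ```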
//!
//! The `set` module provides the `ZetSet` structure. The `ZetSet::new` function
//! takes a `&[u8]` slice and a bookkeeping item used by the calling operation.
//! The call `ZetSet::new(slice, item)` returns an initialized `ZetSet` with:
//! * An `IndexMap` whose keys (lines) are borrowed from `slice`. Each
//! bookkeeping value starts out equal to `item` and may be updated if its
//! line occurs more than once in `slice`.
//! * A field that indicates whether `slice` started with a byte order mark.
//! * A field that holds the line terminator to be used, taken from the first
//! line of `slice`.
//!
//! For a `ZetSet` `z`,
//! * `z.insert_or_update(operand, item)` uses `IndexMap`'s `entry` method to
//! insert `item` as the value for lines in `operand` that were not already
//! present in `z`, or to call `v.update_with(item)` on the bookkeeping item
//! of lines that were present. Inserted lines are allocated, not borrowed, so
//! `operand` need not outlive `z`.
//! * `z.update_if_present(operand, item)` calls `v.update_with(item)` on the
//! bookkeeping item of lines in `operand` that are present in `z`, ignoring
//! lines that are not already present.
//! * Finally, `z.retain(keep)` retains only the lines whose bookkeeping item
//! `item` satisfies `keep(item.retention_value())`.
//!
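//! The shape of these three methods can be shown with a toy model. Everything
//! in it is an assumption made for illustration: `ToySet`, `Bookkeeping`,
//! `LastFileSeen` and `toy_lines` are hypothetical names, a `Vec` with linear
//! search stands in for `IndexMap` and its `entry` method, and the `u32`
//! retention value is chosen for simplicity. It is not the `set` module's code.
//!
//! ```rust
//! use std::borrow::Cow;
//!
//! /// Hypothetical bookkeeping trait, modelled on the method names above.
//! trait Bookkeeping: Copy {
//!     fn update_with(&mut self, other: Self);
//!     fn retention_value(&self) -> u32;
//! }
//!
//! /// Example bookkeeping item: the number of the last file a line was seen in.
//! #[derive(Clone, Copy)]
//! struct LastFileSeen(u32);
//!
//! impl Bookkeeping for LastFileSeen {
//!     fn update_with(&mut self, other: Self) {
//!         self.0 = other.0;
//!     }
//!     fn retention_value(&self) -> u32 {
//!         self.0
//!     }
//! }
//!
//! /// Simplification: splits on `\n` only and skips blank lines.
//! fn toy_lines(bytes: &[u8]) -> impl Iterator<Item = &[u8]> {
//!     bytes.split(|&b| b == b'\n').filter(|line| !line.is_empty())
//! }
//!
//! struct ToySet<'a, B: Bookkeeping> {
//!     entries: Vec<(Cow<'a, [u8]>, B)>,
//! }
//!
//! impl<'a, B: Bookkeeping> ToySet<'a, B> {
//!     /// Keys from `slice` are borrowed, not copied.
//!     fn new(slice: &'a [u8], item: B) -> Self {
//!         let mut set = ToySet { entries: Vec::new() };
//!         for line in toy_lines(slice) {
//!             match set.position(line) {
//!                 Some(i) => set.entries[i].1.update_with(item),
//!                 None => set.entries.push((Cow::Borrowed(line), item)),
//!             }
//!         }
//!         set
//!     }
//!
//!     fn position(&self, line: &[u8]) -> Option<usize> {
//!         self.entries.iter().position(|(key, _)| &key[..] == line)
//!     }
//!
//!     /// New lines are copied, so `operand` need not outlive the set.
//!     fn insert_or_update(&mut self, operand: &[u8], item: B) {
//!         for line in toy_lines(operand) {
//!             match self.position(line) {
//!                 Some(i) => self.entries[i].1.update_with(item),
//!                 None => self.entries.push((Cow::Owned(line.to_vec()), item)),
//!             }
//!         }
//!     }
//!
//!     /// Lines absent from the set are ignored.
//!     fn update_if_present(&mut self, operand: &[u8], item: B) {
//!         for line in toy_lines(operand) {
//!             if let Some(i) = self.position(line) {
//!                 self.entries[i].1.update_with(item);
//!             }
//!         }
//!     }
//!
//!     fn retain(&mut self, keep: impl Fn(u32) -> bool) {
//!         self.entries.retain(|(_, item)| keep(item.retention_value()));
//!     }
//!
//!     fn lines(&self) -> Vec<&[u8]> {
//!         self.entries.iter().map(|(key, _)| &key[..]).collect()
//!     }
//! }
//! ```
//!
//! In this model an intersection step is `update_if_present` with the current
//! file number followed by `retain` on that number, while a union step is a
//! single `insert_or_update` call.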
#![deny(
    warnings,
    clippy::all,
    clippy::cargo,
    clippy::pedantic,
    trivial_casts,
    trivial_numeric_casts,
    unused_extern_crates,
    unused_import_braces,
    unused_qualifications,
    unused_must_use
)]
#![allow(clippy::cargo)] // FIXME
#![allow(
    clippy::items_after_statements,
    clippy::missing_errors_doc,
    clippy::semicolon_if_nothing_returned,
    clippy::struct_excessive_bools
)]
#![cfg_attr(debug_assertions, allow(dead_code, unused_imports, unused_variables))]
pub mod args;
pub mod help;
pub mod operands;
pub mod operations;
pub mod set;
pub mod styles;