# srgn - a code surgeon
A code **s**u**rg**eo**n** for precise text and code transplantation.
Born a Unicode-capable [descendant of `tr`](#comparison-with-tr), `srgn` adds useful
[*actions*](#actions), acting within precise, optionally language grammar-aware
[*scopes*](#scopes). It suits use cases where...
- regex [doesn't cut
it](https://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages) anymore,
- editor tools such as *Rename all* are too specific, and not automatable,
- precise manipulation, not just matching, is required, and lastly and optionally,
- Unicode-specific trickery is desired.
## Usage
For an "end-to-end" example, consider this Python snippet ([more languages are
supported](#showcases)):
```python gnu.py
"""GNU module."""
def GNU_says_moo():
    """The GNU -> say moo -> ✅"""
    GNU = """
    GNU
    """ # the GNU...
    print(GNU + " says moo") # ...says moo
```
which with an invocation of
```bash
```
can be manipulated to read
```python output-gnu.py
"""GNU 🐂 is not Unix module."""
def GNU_says_moo():
    """The GNU → say moo → ✅"""
    GNU = """
    GNU
    """ # the GNU...
    print(GNU + " says moo") # ...says moo
```
which demonstrates:
- language grammar-aware operation: only Python docstrings were manipulated; virtually
impossible to replicate in just regex
Skip ahead to **[more such showcases](#showcases) below**.
- advanced regex features such as, in this case, negative lookbehind are supported
- Unicode is natively handled
- features such as [ASCII symbol replacement](#symbols) are provided
Hence the concept of surgical operation: `srgn` allows you to be quite precise about the
scope of your actions, *combining* the power of both [regular
expressions](https://docs.rs/fancy-regex/latest/fancy_regex/index.html) and
[parsers](https://tree-sitter.github.io/tree-sitter/).
> [!NOTE]
>
> Without exception, all `bash` and `console` code snippets in this README are
> automatically [tested](tests/readme.rs) using the actual program binary, facilitated
> by a tiny bash interpreter. What is showcased here is guaranteed to work.
## Installation
### Prebuilt binaries
Download a prebuilt binary from the
[releases](https://github.com/alexpovel/srgn/releases/latest).
### cargo-binstall
This crate provides its binaries in a format
[compatible](https://github.com/cargo-bins/cargo-binstall/blob/9cfc0cd5f97300925ae60f67712b74970a380aca/SUPPORT.md#support-for-cargo-binstall)
with [`cargo-binstall`](https://github.com/cargo-bins/cargo-binstall):
1. Install the [Rust toolchain](https://www.rust-lang.org/tools/install)
2. Run `cargo install cargo-binstall` (might take a while)
3. Run `cargo binstall srgn` (couple seconds, as it downloads [prebuilt
binaries](#prebuilt-binaries) from GitHub)
These steps are guaranteed to work™, as they are [tested in
CI](./.github/workflows/main.yml). They also work if no prebuilt binaries are available
for your platform, as the tool will fall back to [compiling from
source](#cargo-compile-from-source).
### CI (GitHub Actions)
All [GitHub Actions runner
images](https://github.com/actions/runner-images/tree/main/images) come with `cargo`
preinstalled, and `cargo-binstall` provides a convenient [GitHub
Action](https://github.com/marketplace/actions/install-cargo-binstall):
```yaml
jobs:
  srgn:
    name: Install srgn in CI
    # All three major OSes work
    runs-on: ubuntu-latest
    steps:
      - uses: cargo-bins/cargo-binstall@main
      - name: Install binary
        run: >
          cargo binstall
          --no-confirm
          srgn
      - name: Use binary
        run: srgn --version
```
The above concludes in just [5 seconds
total](https://github.com/alexpovel/srgn/actions/runs/6605290729/job/17940329899), as no
compilation is required. For more context, see [`cargo-binstall`'s advice on
CI](https://github.com/cargo-bins/cargo-binstall#can-i-use-it-in-ci).
### Cargo (compile from source)
1. Install the [Rust toolchain](https://www.rust-lang.org/tools/install)
2. A C compiler is required:
   1. On Linux, `gcc` works (tested).
   2. On macOS, try `clang` (untested).
   3. On Windows, [MSVC](https://visualstudio.microsoft.com/downloads/) works (tested).
      Select "Desktop development with C++" on installation.
3. Run `cargo install srgn`
### Cargo (as a Rust library)
```text
cargo add srgn
```
See [here](#rust-library) for more.
## Walkthrough
The tool is designed around **scopes** and **actions**. Scopes narrow down the parts of
the input to process. Actions then perform the processing. Generally, both scopes and
actions are composable, so more than one of each may be passed. Both are optional (but
taking no action is pointless); specifying no scope implies the entire input is in
scope.
At the same time, there is considerable overlap with plain [`tr`][tr]: the tool is
designed to have close correspondence in the most common use cases, and only go beyond
when needed.
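The interplay of scopes and actions can be pictured with a short Python sketch. It is a mental model only, not how `srgn` is implemented: the names `apply_actions`, `scope` and `actions` are made up for illustration, and Python's `re` module stands in for the regex engine.

```python
import re
from typing import Callable, Iterable, Optional

# An action is modelled as any function from string to string (an assumption).
Action = Callable[[str], str]


def apply_actions(text: str, scope: Optional[str], actions: Iterable[Action]) -> str:
    """Run every action, in order, on each part of `text` matched by `scope`.

    With `scope=None`, the entire input is considered in scope.
    """
    actions = list(actions)

    def process(part: str) -> str:
        for action in actions:
            part = action(part)
        return part

    if scope is None:
        return process(text)
    # Only the matched parts are handed to the actions; the rest is untouched.
    return re.sub(scope, lambda m: process(m.group(0)), text)


if __name__ == "__main__":
    print(apply_actions("Hello, World!", None, [str.upper]))
    print(apply_actions("why ARE we", r"[A-Z]+", [str.lower]))
```

Anything outside the scope passes through unchanged, which is the property the rest of this walkthrough relies on.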
### Actions
The simplest action is replacement. It is specially accessed (as an argument, not an
option) for compatibility with [`tr`][tr], and general ergonomics. All other actions are
given as flags, or options should they take a value.
#### Replacement
For example, simple, single-character replacements work as in [`tr`][tr]:
```console
```
The first argument is the scope (literal `H` in this case). Anything matched by it is
subject to processing (replacement by `J`, the second argument, in this case). However,
there is **no direct concept of character classes** as in [`tr`][tr]. Instead, by
default, the scope is a regular expression pattern, so *its*
[classes](https://docs.rs/regex/1.9.5/regex/index.html#character-classes) can be used to
similar effect:
```console
```
capture group match* are treated *individually* for processing, allowing a replacement
to be repeated:
```console
```
performance guarantees](https://docs.rs/regex/latest/regex/#untrusted-input). If they
aren't used, [performance is not
impacted](https://docs.rs/fancy-regex/0.11.0/fancy_regex/index.html#).
The replacement is not limited to a single character. It can be any string, for example
to fix [this quote](http://regex.info/blog/2006-09-15/247):
```console
```
The tool is fully Unicode-aware, with useful support for [certain advanced
character
classes](https://github.com/rust-lang/regex/blob/061ee815ef2c44101dba7b0b124600fcb03c1912/UNICODE.md#rl12-properties):
```console
$ echo 'Mood: 🤮🤒🤧🦠 :(' | srgn '\p{Emoji_Presentation}' '😷'
Mood: 😷😷😷😷 :(
```
#### Beyond replacement
Seeing how the replacement is merely a static string, its usefulness is limited. This is
where [`tr`'s secret sauce](https://maizure.org/projects/decoded-gnu-coreutils/tr.html)
ordinarily comes into play: using its character classes, which are valid in the second
position as well, neatly translating from members of the first to the second. Here,
those classes are instead regexes, and only valid in first position (the scope). A
regular expression being a state machine, it is impossible to match onto a 'list of
characters', which in `tr` is the second (optional) argument. That concept is out the
window, and its flexibility lost.
Instead, the offered actions, all of them **fixed**, are used. A peek at [the most
common use cases for `tr`](#use-cases-and-equivalences) reveals that the provided set of
actions covers virtually all of them! Feel free to file an issue if your use case is not
covered.
Onto the next action.
#### Deletion
Removes whatever is found from the input. Same flag name as in `tr`.
```console
```
> [!NOTE]
> As the default scope is to match the entire input, it is an error to specify
> deletion without a scope.
#### Squeezing
Squeezes repeats of characters matching the scope into single occurrences. Same flag
name as in `tr`.
```console
```
If a character class is passed, all members of that class are squeezed into whatever
class member was encountered first:
```console
```
Greediness in matching is not modified, so take care:
```console
```
> [!NOTE]
> The pattern matched the *entire* run of suns, so there's nothing to squeeze. Summer
> prevails.
Invert greediness if the use case calls for it:
```console
```
> [!NOTE]
> Again, as with [deletion](#deletion), specifying squeezing without an *explicit* scope
> is an error. Otherwise, the entire input is squeezed.
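The squeezing behavior described above, including the greediness caveat, can be approximated in a few lines of Python. This is a sketch under assumptions: `squeeze` is a hypothetical helper, Python's `re` replaces the actual regex engine, and adjacency of matches is taken as the definition of a "repeat".

```python
import re


def squeeze(text: str, pattern: str) -> str:
    """Collapse each run of directly adjacent matches into the run's first match."""
    out = []
    pos = 0          # everything before `pos` has been handled already
    prev_end = None  # end offset of the previous match, to detect adjacency
    for m in re.finditer(pattern, text):
        if m.start() == m.end():
            continue  # empty matches carry nothing to squeeze
        if prev_end is not None and m.start() == prev_end:
            pos = m.end()  # repeat of the current run: drop it
        else:
            out.append(text[pos:m.end()])  # gap plus first match of a new run
            pos = m.end()
        prev_end = m.end()
    out.append(text[pos:])
    return "".join(out)


if __name__ == "__main__":
    print(squeeze("Lots   of  space !", r"\s"))
    print(squeeze("aaa", r"a+"))   # one greedy match: nothing to squeeze
    print(squeeze("aaa", r"a+?"))  # lazy matches: squeezed
```

A greedy `a+` consumes the whole run as a single match, so no repeats remain; the lazy `a+?` yields one match per character, which can then be squeezed.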
#### Character casing
A good chunk of `tr` usage [falls into this category](#changing-character-casing). It's
very straightforward.
```console
$ echo 'Hello, World!' | srgn --upper
HELLO, WORLD!
$ echo 'hello, world!' | srgn --titlecase
Hello, World!
```
#### Normalization
Decomposes input according to [Normalization Form
D](https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms), and then discards
code points of the [Mark
category](https://en.wikipedia.org/wiki/Unicode_character_property#General_Category)
(see [examples](https://www.compart.com/en/unicode/category/Mn)). That roughly means:
take fancy character, rip off dangly bits, throw those away.
```console
$ echo 'Naïve jalapeño ärgert mgła' | srgn --normalize # Normalize is smarter
Naive jalapeno argert mgła
```
Notice how `mgła` is out of scope for NFD, as it is not decomposable (at least that's
what ChatGPT whispers in my ear).
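For readers who want to see the two steps spelled out, the following Python sketch reproduces them with the standard library's `unicodedata` module. The function name `strip_marks` is an assumption for illustration; the tool's own implementation may differ in detail.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop all code points in a Mark category (Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.category(ch).startswith("M")
    )


if __name__ == "__main__":
    print(strip_marks("Naïve jalapeño ärgert mgła"))
```

`ł` (U+0142) has no canonical decomposition, so it passes through both steps unchanged.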
#### Symbols
This action replaces multi-character, ASCII symbols with appropriate single-code point,
native Unicode counterparts.
```console
```
Alternatively, if you're only interested in math, make use of scoping:
```console
```
As there is a [1:1 correspondence](https://en.wikipedia.org/wiki/Bijection) between an
ASCII symbol and its replacement, the effect is reversible:
```console
```
There is only a limited set of symbols supported as of right now, but more can be added.
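A one-to-one table makes both directions easy to picture. The Python sketch below is illustrative only: apart from `->` (seen in the introductory example), the entries in `SYMBOLS` are assumptions and not the tool's actual symbol set, and `to_unicode`/`to_ascii` are hypothetical names.

```python
import re

# Hypothetical table; only "->" is taken from the example at the top of this README.
SYMBOLS = {"->": "→", "<-": "←", "!=": "≠", "<=": "≤", ">=": "≥"}


def _translate(text: str, table: dict) -> str:
    # Longest keys first, so that a longer symbol is never split by a shorter one.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


def to_unicode(text: str) -> str:
    """Replace ASCII symbols with their single-code point counterparts."""
    return _translate(text, SYMBOLS)


def to_ascii(text: str) -> str:
    """Invert the table; possible because the mapping is one-to-one."""
    return _translate(text, {v: k for k, v in SYMBOLS.items()})


if __name__ == "__main__":
    converted = to_unicode("a -> b != c <= d")
    print(converted)
    print(to_ascii(converted))
```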
#### German
This action replaces alternative spellings of German special characters (ae, oe, ue, ss)
with their native versions (ä, ö, ü, ß).
```console
```
This action is based on a word list.
> [!NOTE]
>
> - empty scope and replacement: the entire input will be processed, and no replacement
> is performed
> - `Poeten` remained as-is, instead of being naively and mistakenly converted to
> `Pöten`
> - as a (compound) word, `Abenteuergrütze` is not going to be found in [any reasonable
> word list](https://www.duden.de/suchen/dudenonline/Stinkegr%C3%BCtze), but was
> handled properly nonetheless
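The idea of a word-list-backed replacement can be sketched in Python as follows. This is a deliberately simplified stand-in, not the tool's algorithm: `known` is a tiny hypothetical word list, all function names are made up, and compound words (which the tool handles) are not attempted here.

```python
import itertools
import re

REPLACEMENTS = {
    "ae": "ä", "oe": "ö", "ue": "ü",
    "Ae": "Ä", "Oe": "Ö", "Ue": "Ü",
    "ss": "ß",
}
PATTERN = re.compile("|".join(REPLACEMENTS))


def candidates(word: str):
    """Yield spellings of `word`, trying the most replacements first."""
    matches = list(PATTERN.finditer(word))
    for n in range(len(matches), 0, -1):
        for combo in itertools.combinations(matches, n):
            out, pos = [], 0
            for m in combo:
                out.append(word[pos:m.start()])
                out.append(REPLACEMENTS[m.group(0)])
                pos = m.end()
            out.append(word[pos:])
            yield "".join(out)


def fix_word(word: str, known: set) -> str:
    """Accept a candidate only if the word list knows it; otherwise keep the word."""
    for candidate in candidates(word):
        if candidate in known:
            return candidate
    return word


def fix_text(text: str, known: set) -> str:
    return re.sub(r"\w+", lambda m: fix_word(m.group(0), known), text)


if __name__ == "__main__":
    known = {"Fußgänger", "über"}  # hypothetical, minimal word list
    print(fix_text("Fussgaenger ueber Poeten Mauer", known))
```

Because `Pöten` and `Maür` are not in the list, `Poeten` and `Mauer` are left alone, mirroring the behavior described in the note above.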
On request, replacements may be forced, as is potentially useful for names:
```console
```
Through positive lookahead, nothing but the salutation was scoped and therefore changed.
`Mauer` correctly remained as-is, but `ueber` was not processed. A second pass fixes
this:
```console
```
> [!NOTE]
>
> Options and flags pertaining to some "parent" are prefixed with their parent's name,
> and will *imply* their parent when given, such that the latter does not need to be
> passed explicitly. That's why `--german-naive` is named as it is, and `--german`
> needn't be passed.
>
> This behavior might change once `clap` supports [subcommand
> chaining](https://github.com/clap-rs/clap/issues/2222).
Some branches are undecidable for this modest tool, as it operates without language
context. For example, both `Busse` (busses) and `Buße` (penance) are legal words. By
default, replacements are greedily performed if legal (that's the [whole
point](https://en.wikipedia.org/wiki/Principle_of_least_astonishment) of `srgn`,
after all), but there's a flag for toggling this behavior:
```console
$ echo 'Busse 🚌 und Fussgaenger 🚶♀️' | srgn --german-prefer-original
Busse 🚌 und Fußgänger 🚶♀️
```
### Combining Actions
Most actions are composable, unless it would be nonsensical to do so (like for
[deletion](#deletion)). Their order of application is fixed, so the *order* of the flags
given has no influence (piping multiple runs is an alternative, if needed). Replacements
always occur first. Generally, the CLI is designed to prevent misuse and surprises: it
prefers crashing to doing something unexpected. Note that lots of combinations *are*
technically possible, but might yield nonsensical results.
```console
```
A narrower scope can be specified, and will apply to *all* actions equally:
```console
```
The [word boundaries](https://www.regular-expressions.info/wordboundaries.html) are
required as otherwise `Koeffizienten` is matched as `Koeffizi` and `enten`. Note how the
trailing periods cannot be, for example, squeezed. The required scope of `\.` would
interfere with the given one. Regular piping solves this:
```console
```
The specially treated replacement action is also composable:
```console
```
Emojis are first all replaced, then squeezed. Notice how nothing else is squeezed.
### Scopes
Scopes are the second driving concept to `srgn`. In the default case, the main scope is
a regular expression. The [actions](#actions) section showcased this use case in some
detail, so it's not repeated here. It is given as a first positional argument.
#### Language grammar-aware scopes
`srgn` extends this through premade, language grammar-aware scopes, made possible
through the excellent [`tree-sitter`](https://tree-sitter.github.io/tree-sitter/)
library. It offers a
[queries](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax) feature,
which works much like pattern matching against a [tree data
structure](https://en.wikipedia.org/wiki/Parse_tree).
`srgn` comes bundled with a handful of the most useful of these queries. Through its
discoverable API (either [as a library](#rust-library) or via CLI, `srgn --help`), one
can learn of the supported languages and available, premade queries. Each supported
language comes with an escape hatch, allowing you to run your own, custom ad-hoc
queries. The hatch comes in the form of `--lang-query <S EXPRESSION>`, where `lang` is a
language such as `python`. It allows you to write small, ad-hoc linters, for example
to catch code such as
```python cond.py
if x:
    return left
else:
    return right
```
with an invocation of
```bash
```
(The code can be rewritten as `return left if x else right`)
The expression required for matching this case is a mouthful. A couple resources exist
for getting started with your own queries:
- the [official docs on
querying](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries)
- the great [official playground](https://tree-sitter.github.io/tree-sitter/playground)
for interactive use
- [*How to write a linter using tree-sitter in an
hour*](https://siraben.dev/2022/03/22/tree-sitter-linter.html), a great introduction
to the topic in general
- the official [`tree-sitter`
CLI](https://github.com/tree-sitter/tree-sitter/blob/master/cli/README.md)
- using `srgn` with high verbosity (`-vvvv`) is supposed to grant detailed insights into
what's happening to your input, including a [representation of the parsed
tree](https://docs.rs/tree-sitter/latest/tree_sitter/struct.Node.html#method.to_sexp)
> [!NOTE]
>
> Language scopes are applied *first*, so whatever regex aka main scope you pass, it
> operates on each matched language construct individually.
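The order of operations (language construct first, regex second) can be illustrated without `tree-sitter`. The Python sketch below swaps in the standard library's `tokenize` module as the "grammar-aware" part and only handles Python comments; `replace_in_comments` is a hypothetical helper, not part of `srgn`.

```python
import io
import re
import tokenize


def replace_in_comments(source: str, pattern: str, replacement: str) -> str:
    """Apply a regex replacement inside Python comments only."""
    lines = source.splitlines(keepends=True)
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    comments = [tok for tok in tokens if tok.type == tokenize.COMMENT]
    # Step 1 found the language constructs; step 2 runs the regex on each one
    # individually. Comments never span lines, so one line is edited per token.
    for tok in comments:
        row, col = tok.start
        line = lines[row - 1]
        end = col + len(tok.string)
        changed = re.sub(pattern, replacement, tok.string)
        lines[row - 1] = line[:col] + changed + line[end:]
    return "".join(lines)


if __name__ == "__main__":
    code = 'TODO = 1  # TODO: fix\nprint("TODO: no")  # done\n'
    print(replace_in_comments(code, r"TODO(?=:)", "TODO(@me)"), end="")
```

The `TODO:` inside the string literal and the `TODO` variable name are untouched, because the regex never sees anything outside of comments.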
##### Showcases
###### Assigning `TODO`s (TypeScript)
Perhaps you're using a system of `TODO` notes in comments:
```typescript todo.ts
class TODOApp {
    // TODO app for writing TODO lists
    addTodo(todo: TODO): void {
        // TODO: everything, actually 🤷♀️
    }
}
```
and *usually* assign people to each note. It's possible to automate assigning yourself
to every unassigned note (lucky you!) using
```bash
```
which in this case gives
```typescript output-todo.ts
class TODOApp {
    // TODO app for writing TODO lists
    addTodo(todo: TODO): void {
        // TODO(@poorguy): everything, actually 🤷♀️
    }
}
```
Notice the [positive lookahead](https://www.regular-expressions.info/lookaround.html) of
`(?=:)`, ensuring an actual `TODO` note is hit (`TODO:`). Otherwise, the other `TODO`s
mentioned around the comments would be matched as well.
###### Converting `print` calls to proper `logging` (Python)
Say there's code making liberal use of `print`:
```python money.py
def print_money():
    """Let's print money 💸."""
    amount = 32
    print("Got here.")
    print_more = lambda s: print(f"Printed {s}")
    print_more(23) # print the stuff
print_money()
print("Done.")
```
and a move to [`logging`](https://docs.python.org/3/library/logging.html) is desired.
That's fully automated by a call of
```bash
```
yielding
```python output-money.py
def print_money():
    """Let's print money 💸."""
    amount = 32
    logging.info("Got here.")
    print_more = lambda s: logging.info(f"Printed {s}")
    print_more(23) # print the stuff
print_money()
logging.info("Done.")
```
> [!NOTE]
> Note the [anchors](https://www.regular-expressions.info/anchors.html): `print_more` is
> a function call as well, but `^print$` ensures it's not matched.
>
> The regular expression applies *after* grammar scoping, so operates entirely within
> the already-scoped context.
###### Remove all comments (C#)
Overdone, comments can turn into [smells](https://refactoring.guru/smells/comments). If
not tended to, they might very well start lying:
```csharp UserService.cs
using System.Linq;
public class UserService
{
    private readonly AppDbContext _dbContext;
    /// <summary>
    /// Initializes a new instance of the <see cref="FileService"/> class.
    /// </summary>
    /// <param name="dbContext">The configuration for manipulating text.</param>
    public UserService(AppDbContext dbContext)
    {
        _dbContext /* the logging context */ = dbContext;
    }
    /// <summary>
    /// Uploads a file to the server.
    /// </summary>
    // Method to log users out of the system
    public void DoWork()
    {
        _dbContext.Database.EnsureCreated(); // Ensure the database schema is deleted
        _dbContext.Users.Add(new User /* the car */ { Name = "Alice" });
        /* Begin reading file */
        _dbContext.SaveChanges();
        var user = _dbContext.Users.Where(/* fetch products */ u => u.Name == "Alice").FirstOrDefault();
        /// Delete all records before proceeding
        if (user /* the product */ != null)
        {
            System.Console.WriteLine($"Found user with ID: {user.Id}");
        }
    }
}
```
So, should you count purging comments among your fetishes, more power to you:
```bash
```
[different](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/comments)
[sorts](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/xmldoc/) of
comments were identified and removed. The second pass removes all leftover dangling
lines (`[:blank:]` is [tabs and
spaces](https://docs.rs/regex/latest/regex/#ascii-character-classes)).
> [!NOTE]
> When deleting (`-d`), for reasons of safety and sanity, a scope is *required*.
#### Explicit failure for (mis)matches
After all scopes are applied, it might turn out no matches were found. The default
behavior is to silently succeed:
```console
```
The output matches the specification: all digits are removed. There just happened to be
none. No matter how many actions are applied, **the input is returned unprocessed** once
this situation is detected. Hence, no unnecessary work is done.
One might prefer receiving explicit feedback (exit code other than zero) on failure:
```bash
```
The inverse scenario is also supported: **failing if anything matched**. This is useful
for checks (for example, in CI) against "undesirable" content. This works much like a
custom, ad-hoc linter.
Take for example "old-style" Python code, where type hints are not yet [surfaced to the
syntax-level](https://docs.python.org/3/library/typing.html):
```python oldtyping.py
def square(a):
    """Squares a number.
    :param a: The number (type: int or float)
    """
    return a**2
```
This style can be checked against and "forbidden" using:
```bash
```
#### Literal scope
This causes whatever was passed as the regex scope to be interpreted literally. Useful
for scopes containing lots of special characters that otherwise would need to be
escaped.
## Rust library
While this tool is CLI-first, it is library-very-close-second, and library usage is
treated as a first-class citizen just the same. See the [library
documentation](https://docs.rs/srgn) for more, library-specific details.
### Status and stats
[docs.rs](https://docs.rs/srgn/)
[codecov](https://codecov.io/gh/alexpovel/srgn)
[crates.io](https://crates.io/crates/srgn)
[Hits-of-Code](https://hitsofcode.com/github/alexpovel/srgn/view?branch=main)
Note: these apply to the entire repository, including the [binary](./src/main.rs).
#### Code coverage icicle graph
The code is currently structured as (color indicates coverage):
[Code coverage icicle graph](https://codecov.io/gh/alexpovel/srgn/graphs/icicle.svg?token=IPU7L9BWMV)
Hover over the rectangles for file names.
## Contributing
To see how to build, refer to [compiling from source](#cargo-compile-from-source).
Otherwise, refer to the [guidelines](CONTRIBUTING.md).
## Similar tools
An unordered list of similar tools you might be interested in.
- [Semgrep](https://semgrep.dev/)
- [`sd`](https://github.com/chmln/sd)
- [`ripgrep`](https://github.com/BurntSushi/ripgrep)
- [`ripgrep-structured`](https://github.com/orf/ripgrep-structured)
- [tree-sitter CLI](https://github.com/tree-sitter/tree-sitter/blob/master/cli/README.md)
- [`gron`](https://github.com/tomnomnom/gron)
## Comparison with `tr`
`srgn` is inspired by `tr`, and in its simplest form behaves similarly, but not
identically. In theory, `tr` is quite flexible. In practice, it is commonly used mainly
across a couple specific tasks. Next to its two positional arguments ('arrays of
characters'), one finds four flags:
1. `-c`, `-C`, `--complement`: complement the first array
2. `-d`, `--delete`: delete characters in the first array
3. `-s`, `--squeeze-repeats`: squeeze repeats of characters in the first array
4. `-t`, `--truncate-set1`: truncate the first array to the length of the second
In `srgn`, these are implemented as follows:
1. is not available directly as an option; instead, negation of regular expression
classes can be used (e.g., `[^a-z]`), to much more potent, flexible and well-known
effect
2. available (via regex)
3. available (via regex)
4. not available: it's inapplicable to regular expressions, not commonly used and, if
used, often misused
To show how uses of `tr` found in the wild can translate to `srgn`, consider the
following section.
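As a rough illustration of the first two points, the Python sketch below expresses `tr`'s complement and delete in terms of regular expression character classes. The helper names are invented for this example, the `keep`/`chars` arguments are assumed to be valid character class bodies, and the multi-character second-array semantics of `tr` are deliberately ignored.

```python
import re


def complement_replace(text: str, keep: str, replacement: str) -> str:
    """Roughly `tr -c KEEP REPLACEMENT`: replace everything NOT in the class."""
    return re.sub(f"[^{keep}]", replacement, text)


def delete(text: str, chars: str) -> str:
    """Roughly `tr -d CHARS`: remove every character in the class."""
    return re.sub(f"[{chars}]", "", text)


if __name__ == "__main__":
    # The negated class takes over the role of tr's -c flag.
    print(complement_replace("some-variable? 🤔", r"A-Za-z0-9_\n", "_"))
    print(delete("Lots... of... punctuation, man.", r".,"))
```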
### Use cases and equivalences
The following sections are the approximate categories much of `tr` usage falls into.
They were found using [GitHub's code search](https://cs.github.com). The corresponding
queries are given. Results are from the first page of results at the time. The code
samples are links to their respective sources.
As the stdin isn't known (usually dynamic), some representative samples are used and the
tool is exercised on those.
#### Identifier Safety
Making inputs safe for use as identifiers, for example as variable names.
[Query](https://github.com/search?type=code&q=%22tr+-c%22)
1. [`tr -C '[:alnum:]_\n'
'_'`](https://github.com/grafana/grafana/blob/9328fda8ea8384e8cfcf1c78d1fe95d92bbad786/docs/make-docs#L234)
Translates to:
```console
$ echo 'some-variable? 🤔' | srgn '[^[:alnum:]_\n]' '_'
some_variable___
```
Similar examples are:
- [`tr -C "[:alnum:]"
'-'`](https://github.com/elastic/go-elasticsearch/blob/594de0c207ef5c4804615ebedd043a789ef3ce75/.buildkite/functions/imports.sh#L38)
- [`tr -C "A-Za-z0-9"
"_"`](https://github.com/Homebrew/brew/blob/b2cf50bbe10ab996a1e3365545fadabf36df777a/Library/Homebrew/cmd/update.sh#L104)
- [`tr -c '[a-zA-Z0-9-]'
'-'`](https://github.com/xamarin/xamarin-macios/blob/c14f9ff7c7693ab060c4b84c78075ff975ea7c64/Make.config#L69)
- [`tr -C 'a-zA-Z0-9@_'
'_'`](https://github.com/openzfsonwindows/openzfs/blob/61f4ce826122f19a0a0c734efb4c2469b2aa367b/autogen.sh#L22)
2. [`tr -c '[:alnum:]'
_`](https://github.com/freebsd/freebsd-src/blob/9dc0c983b0931f359c2ff10d47ad835ef74e929a/libexec/rc/rc.d/jail#L413)
Translates to:
```console
$ echo 'some  variablê' | srgn '[^[:alnum:]]' '_'
some__variabl_
```
3. [`tr -c -s '[:alnum:]'
'-'`](https://github.com/weaviate/weaviate/blob/169381df70852ef687528ebf81e27869b3017403/ci/push_docker.sh#L26)
Translates to:
```console
$ echo '🙂 hellö???' | srgn -s '[^[:alnum:]]' '-'
-hell-
```
#### Literal-to-literal translation
Translates a *single*, literal character to another, for example to clean newlines.
[Query](https://github.com/search?q=%22%7C+tr++%22+%28path%3A*.sh+OR+path%3A*.yml+OR+path%3A*.yaml%29&type=code&ref=advsearch)
1. [`tr " "
";"`](https://github.com/facebook/react-native/blob/d31d16b19cecb893a388fcb141602e8abad4aa76/packages/react-native/sdks/hermes-engine/utils/build-hermes-xcode.sh#L32)
Translates to:
```console
$ echo 'x86_64 arm64 i386' | srgn ' ' ';'
x86_64;arm64;i386
```
Similar examples are:
- [`tr ' '
'-'`](https://github.com/eyedol/tools/blob/e940fe4484b486aa8d42a76d9305a9227bea7552/backup.sh#L11)
- [`tr '-'
'_'`](https://github.com/rerun/rerun/blob/aa5ad6360780ddbbb11835654e8f49b3827f15cd/modules/stubbs/lib/functions.sh#L147)
2. [`tr '.'
"\n"`](https://github.com/SDA-SE/cluster-image-scanner/blob/a769be53eae423f57a7f34c429cfa3a2770a859e/images/scan/syft/build.sh#L16):
Translates to:
```console
$ echo '3.12.1' | srgn --literal-string '.' '\n' # Escape sequence works
3
12
1
$ echo '3.12.1' | srgn '\.' '\n' # Escape regex otherwise
3
12
1
```
3. [`tr '\n'
','`](https://github.com/gtoubassi/dqn-atari/blob/513b307039f4c28b5b517cd542ad625b41f0ef50/logstats.sh#L43)
Translates to:
```console
$ echo -ne 'Some\nMulti\nLine\nText' | srgn --literal-string '\n' ','
Some,Multi,Line,Text
```
If escape sequences remain uninterpreted (`echo -E`, the default), the scope's
escape sequence will need to be turned into a literal `\` and `n` as well, as it is
otherwise interpreted by the tool as a newline:
```console
$ echo -nE 'Some\nMulti\nLine\nText' | srgn --literal-string '\\n' ','
Some,Multi,Line,Text
```
Similar examples are:
- [`tr '\n' '
'`](https://github.com/mhassan2/splunk-n-box/blob/a721af8b8ae6103a7b274651206d4812d37db398/scripts/viz.sh#L427)
- [`tr "\n" "
"`](https://github.com/ministryofjustice/modernisation-platform-configuration-management/blob/6a1dd9f31a62d68d796ae304165eed1fcb1b822e/ansible/roles/nomis-weblogic/tasks/patch-weblogic.yml#L13)
#### Removing a character class
Very useful to remove whole categories in one fell swoop.
[Query](https://github.com/search?q=%22%7C+tr++%22+%28path%3A*.sh+OR+path%3A*.yml+OR+path%3A*.yaml%29&type=code&ref=advsearch)
1. [`tr -d
'[:punct:]'`](https://github.com/CNMAT/OpenSoundControl.org/blob/fb7b3b48ba9ac64eae030e3333f9a980f4f8fd59/build-implementations.sh#L98)
which they [describe
as](https://github.com/CNMAT/OpenSoundControl.org/blob/fb7b3b48ba9ac64eae030e3333f9a980f4f8fd59/build-implementations.sh#L94):
> Omit all punctuation characters
translates to:
```console
$ echo 'Lots... of... punctuation, man.' | srgn -d '[[:punct:]]'
Lots of punctuation man
```
Lots of use cases also call for **inverting**, then removing a character class.
[Query](https://github.com/search?type=code&q=%22tr+-c%22)
1. [`tr
-cd a-z`](https://github.com/git/git/blob/d6c51973e4a0e889d1a426da08f52b9203fa1df2/t/lib-credential.sh#L542)
Translates to:
```console
$ echo 'i RLY love LOWERCASING everything!' | srgn -d '[^[:lower:]]'
iloveeverything
```
2. [`tr -cd
'a-zA-Z0-9'`](https://github.com/gitlabhq/gitlabhq/blob/e74bf51e817ee50e85b1bbdc34f0443d1088fd68/doc/user/project/service_desk/configure.md?plain=1#L553)
Translates to:
```console
$ echo 'All0wed ??? 💥' | srgn -d '[^[:alnum:]]'
All0wed
```
3. [`tr -cd
'[[:digit:]]'`](https://github.com/coredns/coredns/blob/b5e6291115d1e60fed561c64d70341b354e69504/Makefile.release#L94)
Translates to:
```console
$ echo '{"id": 34987, "name": "Harold"}' | srgn -d '[^[:digit:]]'
34987
```
#### Remove literal character(s)
Identical to replacing them with the empty string.
[Query](https://github.com/search?q=%22%7C+tr+%22&type=code&ref=advsearch)
1. [`tr -d "."`](https://github.com/ohmyzsh/ohmyzsh/blob/079dbff2c4f22935a71101c511e2285327d8ab68/themes/gallois.zsh-theme#L82)
Translates to:
```console
$ echo '1632485561.123456' | srgn -d '\.' # Unix timestamp
1632485561123456
```
Similar examples are:
- [`tr -d '\`'`](https://github.com/jlevy/the-art-of-command-line/blob/6b50745d2e788add2e8f1ed29010e72659a9a074/README.md?plain=1#L22)
- [`tr -d ' '`](https://github.com/hfg-gmuend/openmoji/blob/9782be9d240513a3d609a4bd6f1176f2d7e1b804/helpers/lib/optimize-build.sh#L77)
- [`tr -d ' '`](https://github.com/PaNOSC-ViNYL/SimEx/blob/0ca295ec57864c0e468eba849d3f44f992c59634/Docker/simex_devel/simex_install.sh#L44)
2. [`tr -d '\r\n'`](https://github.com/kubernetes/kubernetes/blob/37cf2638c975080232990d2fc2c24d0f40c38074/cluster/gce/util.sh#L1690)
Translates to:
```console
$ echo -e 'DOS-Style\r\n\r\nLines' | srgn -d '\r\n'
DOS-StyleLines
```
Similar examples are:
- [`tr -d '\r'`](https://github.com/mhassan2/splunk-n-box/blob/a721af8b8ae6103a7b274651206d4812d37db398/scripts/viz.sh#L427)
- [`tr -d '\r'`](https://github.com/Cidaas/cidaas-shopware-connect-plugin/blob/519e21a9a385b26803ec442e2a8b59918a948f77/.gitlab-ci.yml#L43)
#### Squeeze whitespace
Remove repeated whitespace, as it often occurs when slicing and dicing text.
[Query](https://github.com/search?type=code&q=%22+tr+-s%22)
1. [`tr -s '[:space:]'`](https://github.com/ohmyzsh/ohmyzsh/blob/bbda81fe4b338f00bbf7c7f33e6d1b12d067dc05/plugins/alias-finder/alias-finder.plugin.zsh#L26)
Translates to:
```console
$ echo 'Lots of space !' | srgn -s '[[:space:]]' # Single space stays
Lots of space !
```
Similar examples are:
- [`tr -s " "`](https://github.com/facebookresearch/fastText/blob/166ce2c71a497ff81cb62ec151be5b569e1f1be6/.circleci/pull_data.sh#L18)
- [`tr -s
[:blank:]`](https://github.com/Atmosphere-NX/Atmosphere/blob/4fe9a89ab8ed958a3e080d7ee11767bef9cb2d57/atmosphere.mk#L11)
(`blank` is `\t` and space)
- [`tr
-s`](https://github.com/kovidgoyal/kitty/blob/863adb3e8d8e7229610c6b0e6bc8d48db9becda5/kitty/rc/get_colors.py#L28)
(no argument: this will error out; presumably space was meant)
- [`tr -s ' '`](https://github.com/google/jax/blob/bf40f75bd59501d66a0e500d255daca3f9f2895e/build/rocm/build_rocm.sh#L65)
- [`tr -s ' '`](https://github.com/chromium/chromium/blob/ffed2601d91f4413ca4672a6027c2c05d49df815/docs/linux/minidump_to_core.md?plain=1#L111)
- [`tr -s '[:space:]'`](https://github.com/ceph/ceph/blob/6c387554d8e104727b3e448a1def4f1991be1ff7/src/stop.sh#L188)
- [`tr -s ' '`](https://github.com/PyO3/pyo3/blob/8f4a26a66ecee4cfa473ada8cb27a57c4533d04f/.netlify/build.sh#L7)
2. [`tr -s ' '
'\n'`](https://github.com/doocs/leetcode/blob/4f89e08ed45f7d5e1047767071001073ffe4d32b/solution/0100-0199/0192.Word%20Frequency/README.md?plain=1#L54)
(squeeze, *then replace*)
Translates to:
```console
$ echo '1969-12-28 13:37:45Z' | srgn -s ' ' 'T' # ISO8601
1969-12-28T13:37:45Z
```
3. [`tr -s '[:blank:]' ':'`](https://github.com/cockroachdb/cockroach/blob/985662236d7bf273b93a7b5e32def8e2d1043640/docs/generated/http/BUILD.bazel#L76)
Translates to:
```console
$ echo -e '/usr/local/sbin \t /usr/local/bin' | srgn -s '[[:blank:]]' ':'
/usr/local/sbin:/usr/local/bin
```
#### Changing character casing
A straightforward use case. Upper- and lowercase are often used.
[Query](https://github.com/search?q=%22%7C+tr++%22+%28path%3A*.sh+OR+path%3A*.yml+OR+path%3A*.yaml%29&type=code&ref=advsearch)
1. [`tr A-Z
a-z`](https://github.com/golang/go/blob/a742ae493ff59a71131706500ce53f85477897f0/src/encoding/xml/xml.go#L1874)
(lowercasing)
Translates to:
```console
$ echo 'WHY ARE WE YELLING?' | srgn --lower
why are we yelling?
```
Notice the default scope. It can be refined to lowercase only short words, for
example:
```console
$ echo 'WHY ARE WE YELLING?' | srgn --lower '\b\w{,3}\b'
why are we YELLING?
```
Similar examples are:
- [`tr 'A-Z' 'a-z'`](https://github.com/nwchemgit/nwchem/blob/aad4ecd5657055b085b57115314e4d56271ad749/travis/guess_simd.sh#L13)
- [`tr '[A-Z]' '[a-z]'`](https://github.com/XIMDEX/xcms/blob/4dd3c055de5cb0eebed28f1e9da87ed731a44a99/bin/lib/util.sh#L47)
- [`tr '[A-Z]' '[a-z]'`](https://github.com/varunjampani/video_prop_networks/blob/4f4a39842bd9112932abe40bad746c174a242bf6/lib/davis/configure.sh#L30)
- [`tr '[:upper:]' '[:lower:]'`](https://github.com/PaNOSC-ViNYL/SimEx/blob/0ca295ec57864c0e468eba849d3f44f992c59634/Docker/simex_devel/simex_install.sh#L44)
- [`tr "[:upper:]" "[:lower:]"`](https://github.com/tst-labs/esocial/blob/b678f59bba883a63e91be79d7f2853a57156cf7b/src/esocial-esquemas/generate-java-from-xsd.sh#L11)
2. [`tr '[a-z]'
'[A-Z]'`](https://github.com/henrikpersson/potatis/blob/63feb9de28781e4e9c62bd091bd335b87b474cb1/nes-android/install.sh#L10)
(uppercasing)
Translates to:
```console
$ echo 'why are we not yelling?' | srgn --upper
WHY ARE WE NOT YELLING?
```
Similar examples are:
- [`tr '[a-z]' '[A-Z]'`](https://github.com/basho/riak-zabbix/blob/423e21c31821a345bf59ec4b2baba06d532a7f30/build_templates.sh#L40)
- [`tr "[:lower:]" "[:upper:]"`](https://github.com/Fivium/Oracle-Backup-and-Sync/blob/036aace4a8eb45ab7e6e226ddceb08f35c46b9f3/dbsync/scripts/dbsync.sh#L122)
- [`tr "[:lower:]" "[:upper:]"`](https://github.com/jorgeazevedo/xenomai-lab/blob/a2ce85a86f37fd9762905026ce4a1542684c714b/data/.xenomailab/blocks/template/rename.sh#L5)
[tr]: https://www.gnu.org/software/coreutils/manual/html_node/tr-invocation.html