Parsing and reading of [SquashFS](https://en.wikipedia.org/wiki/SquashFS) archives, on top of any
implementor of the [`tokio::io::AsyncRead`] and [`tokio::io::AsyncSeek`] traits.
More precisely, this crate provides:
- A [`SquashFs`] structure to read SquashFS archives on top of any asynchronous reader (sketched below).
- An implementation of [`fuser_async::Filesystem`] on [`SquashFs`], making it easy to build [FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace) filesystems backed by SquashFS archives.
- A `squashfuse-rs` binary for mounting SquashFS images via FUSE, with async IO and multithreaded decompression.
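A minimal sketch of how the pieces fit together is shown below. The constructor name `SquashFs::from_reader` is hypothetical (check the API documentation for the actual entry point); the backend can be any `AsyncRead + AsyncSeek` implementor, here a `tokio::fs::File`:

```rust
// Minimal sketch: open a SquashFS archive over a tokio file.
// NOTE: `SquashFs::from_reader` is a hypothetical constructor used for
// illustration; refer to the crate documentation for the real entry point.
use tokio::fs::File;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Any `AsyncRead + AsyncSeek` implementor can serve as the backend;
    // a local file is the simplest case.
    let file = File::open("image.squashfs").await?;

    // Hypothetical: parse the superblock and tables from the reader.
    let archive = squashfs_async::SquashFs::from_reader(file).await?;

    // The resulting `SquashFs` can be queried directly, or handed to a FUSE
    // session through its `fuser_async::Filesystem` implementation.
    let _ = archive;
    Ok(())
}
```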
## Motivation: multithreaded/async SquashFS reading
The main motivation was to provide a [`squashfuse`](https://github.com/vasi/squashfuse/pull/70#issuecomment-1249788158) implementation that could:
- Decompress blocks in parallel.
- Benefit from async I/O where relevant (mostly with networked backends in mind), integrating easily with [`tokio::io`].
To the author's understanding, `squashfuse` [uses a single-threaded FUSE loop](https://github.com/hpc/charliecloud/issues/1157), and while the kernel driver performs multithreaded decompression (when compiled with that option), it doesn't support parallel reads. Note that a [patch](https://github.com/vasi/squashfuse/pull/70) exists that adds multithreading to the low-level `squashfuse`; see [Benchmarks](#Benchmarks).
### Example: squashfs-on-S3
This crate has been used to implement a FUSE filesystem providing transparent access to SquashFS images served through an S3 API, using the S3 example in [`fuser_async`]. With a local [MinIO](https://github.com/minio/minio) server, throughputs of 365 MB/s (resp. 680 MB/s) are achieved for sequential (resp. parallel) access to zstd1-compressed images containing 20 MB files.
## `squashfuse-rs` binary
The `squashfuse-rs` binary is an example that implements an analogue of `squashfuse` on top of this crate, allowing SquashFS images to be mounted via FUSE; a usage example follows the help output below.
```console
$ squashfuse-rs --help
USAGE:
    squashfuse-rs [OPTIONS] <INPUT> <MOUNTPOINT>

ARGS:
    <INPUT>         Input squashfs image
    <MOUNTPOINT>    Mountpoint

OPTIONS:
        --backend <BACKEND>              [default: memmap] [possible values: tokio, async-fs, memmap]
        --cache-mb <CACHE_MB>            Cache size (MB) [default: 100]
    -d, --debug
        --direct-limit <DIRECT_LIMIT>    Limit (B) for fetching small files with direct access [default: 0]
    -h, --help                           Print help information
        --readers <READERS>              Number of readers [default: 4]
```
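For instance, a local image can be mounted with the memory-mapped backend and a 200 MB cache (paths are placeholders):

```console
$ squashfuse-rs --backend memmap --cache-mb 200 image.squashfs /mnt/squash
```

As with any FUSE mount, the image can be unmounted with `fusermount -u /mnt/squash` (or `umount`).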
## Benchmarks
The following benchmarks (see `tests/`) compute the mean and standard deviation of 10 runs, dropping caches after each run, with the following variations:
- Sequential or parallel (with 4 threads) read.
- Compressed archive (gzip and zstd1) or not.
- Different backends for reading the underlying file in `squashfuse-rs`.
- The archives are either:
  - Case A: Containing sixteen random files of 20 MB each, generated by `tests/testdata.rs` (since the files are random, zstd compression has minimal effect on the data blocks).
  - Case B: Containing three hundred 20 MB images (with a compression ratio of 1.1 with zstd1).
Entries are normalized per `(case, comp)` pair (i.e. per pair of rows) with respect to the duration of the sequential `squashfuse` run. Numbers smaller than 1 indicate faster results than this baseline. The last 3 columns are `squashfuse-rs` with different backends ([`MemMap`](`pools::LocalReadersPoolMemMap`) being the most performant).

| Case | Access | Comp. | `squashfuse` | `squashfuse` ([MT patch](https://github.com/vasi/squashfuse/pull/70)) | `squashfuse-rs` | `squashfuse-rs` | `squashfuse-rs` |
|---|---|---|---|---|---|---|---|
| A | Seq | -     | 1    | 1.16 | 1.01 | 1.93 | 1.56 |
|   | Par | -     | 1.8  | 0.5  | 0.54 | 0.8  | 0.76 |
|   | Seq | gzip  | 1    | 0.92 | 0.94 | 1.79 | 1.48 |
|   | Par | gzip  | 2.07 | 0.46 | 0.51 | 0.75 | 0.71 |
|   | Seq | zstd1 | 1    | 0.96 | 1.04 | 1.78 | 1.47 |
|   | Par | zstd1 | 2.35 | 0.48 | 0.51 | 0.76 | 0.71 |
| B | Seq | -     | 1    | 0.89 | 0.93 | 2.08 | 1.43 |
|   | Par | -     | 1.6  | 0.54 | 0.6  | 0.89 | 0.91 |
|   | Seq | zstd1 | 1    | 0.59 | 0.65 | 0.98 | 0.87 |
|   | Par | zstd1 | 1.07 | 0.3  | 0.35 | 0.3  | 0.54 |

<small>Smaller numbers are better; numbers smaller than 1 denote an improvement over the baseline</small>
> [!WARNING]
> These should be updated with the latest versions of the code and of `squashfuse`.
To execute the tests (case A), `cargo` needs to run with root privileges so that caches can be cleared between runs, e.g.
```console
$ N_RUNS=10 CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' cargo test -r --test main -- --nocapture
```
## Differences with similar crates
- [`squashfs`](https://crates.io/crates/squashfs) is a work in progress that only supports parsing some structures (superblock, fragment table, uid/gid table).
- [`backhand`](https://crates.io/crates/backhand) and this crate were implemented independently at roughly the same time. Some differences are (see also [Limitations](#Limitations) below):
  - The primary goal of this crate was to allow mounting squashfs images with FUSE, with async IO and multithreaded decompression. `backhand` uses a synchronous [`std::io::Read`]/[`std::io::Seek`] backend, while this crate uses a [`tokio::io::AsyncRead`]/[`tokio::io::AsyncSeek`] backend.
  - This crate provides caching for decompressed blocks.
  - `backhand` supports write operations, while `squashfs-async` doesn't.
- [`squashfs-ng-rs`](https://crates.io/crates/squashfs-ng) wraps the C API, while this crate is a pure Rust implementation.
## References
- <https://dr-emann.github.io/squashfs/>
- <https://dr-emann.github.io/squashfs/squashfs.html>
## Limitations/TODOs
- For now, only file and directory inodes are supported.
- The tables are loaded into memory on initial parsing for caching, rather than being accessed lazily.
- ...