audioscan
Decode an audio file once and report its format, EBU R128 loudness, and silence
windows as JSON. One fast native pass instead of two or three ffmpeg shellouts.
Why
I mix and master music, and a private catalog of mine needs two boring things
from every recording: how loud it is (so tracks sit at a consistent volume) and
where the silent gaps are (so it can split a long recording into songs). The
first version got both by running ffmpeg, the standard audio command-line tool,
and reading the numbers out of its status text. That approach fully decodes the file once
per measurement and is fragile. It already caused a real bug: the parser read ffmpeg's first
per-frame I: -70 line instead of the final Summary block and stored a loudness value off
by tens of decibels. audioscan decodes each file a single time with symphonia,
measures loudness with the real ebur128 library (the same math ffmpeg uses, on
the EBU R128 standard that streaming services use to keep volume consistent), finds
silence in the same pass, and prints structured JSON. Same numbers, fewer decodes,
nothing to scrape.
Install
Run cargo install audioscan to install the audioscan binary from crates.io. To install the latest from
source without cloning, use cargo install --git https://github.com/KiwiMaddog2020/audioscan.
Prebuilt macOS and Linux binaries are attached to each release.
Build
Use
- --pretty: pretty-printed JSON (default)
- --compact: one-line JSON
- --strict: fail instead of returning status: "partial" when decode is incomplete
- --timeout: per-file soft decode deadline in seconds (default: none / unbounded)
- --threshold: silence threshold in RMS dBFS (default -30)
- --min-gap: shortest silence to report, in seconds (default 5.0)
--timeout <secs> bounds how long a single file may spend decoding. It is a
cooperative soft deadline checked between packets, so a slow or wedged file
stops at the limit instead of running unbounded. A timed-out file is reported as
status: "partial" with a decode exceeded timeout of <N>s warning, or, under
--strict, an error. The default is no timeout, so legitimately long recordings
are never truncated unless you set one. In batch mode the deadline applies per
file and the batch continues past a timed-out file.
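To make "cooperative soft deadline checked between packets" concrete, here is a minimal sketch using only the standard library. It is not audioscan's actual decode loop: decode_with_deadline, DecodeStatus, and the generic packet iterator are hypothetical names for illustration, and printing the limit in whole seconds is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical status type mirroring the README's ok / partial distinction.
#[derive(Debug, PartialEq)]
enum DecodeStatus {
    Ok,
    Partial(String),
}

/// Feeds packets to `on_packet` until the source ends or the soft deadline
/// passes. Returns how many packets were processed plus the final status.
fn decode_with_deadline<I, F>(
    packets: I,
    timeout: Option<Duration>,
    mut on_packet: F,
) -> (usize, DecodeStatus)
where
    I: IntoIterator,
    F: FnMut(I::Item),
{
    let started = Instant::now();
    let mut processed = 0;
    for packet in packets {
        // Cooperative check: only between packets, never mid-packet, so a
        // single packet is always decoded in full or not at all.
        if let Some(limit) = timeout {
            if started.elapsed() >= limit {
                let warning = format!("decode exceeded timeout of {}s", limit.as_secs());
                return (processed, DecodeStatus::Partial(warning));
            }
        }
        on_packet(packet);
        processed += 1;
    }
    (processed, DecodeStatus::Ok)
}
```

With timeout set to None the loop never checks the clock, which matches the unbounded default. Because the check is cooperative, one packet that blocks forever inside on_packet would not be interrupted by a scheme like this.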
On success, single-file mode prints audioscan: analyzed <path> in <N.NN>s
to stderr, so the JSON on stdout stays clean and pipeable.
Batch
Batch mode recursively scans known audio extensions under <dir> and emits
compact JSON Lines, one row per file. Without --out, rows are written to
stdout. --jobs auto uses rayon's default worker count; --jobs <N> pins the
batch to a fixed positive worker count.
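The --jobs rule (auto, or a fixed positive count) can be sketched as a small parser. This is an illustration only: Jobs and parse_jobs are hypothetical names, the error text is made up, and how audioscan hands the value to rayon is not shown.

```rust
use std::num::NonZeroUsize;

/// Hypothetical representation of the --jobs argument.
#[derive(Debug, PartialEq)]
enum Jobs {
    /// Let the thread pool pick its default worker count.
    Auto,
    /// Pin the batch to exactly this many workers.
    Fixed(NonZeroUsize),
}

/// Accepts "auto" or a positive integer; "0", negatives, and junk are rejected.
fn parse_jobs(arg: &str) -> Result<Jobs, String> {
    if arg == "auto" {
        return Ok(Jobs::Auto);
    }
    // NonZeroUsize's FromStr already refuses zero, so "positive" is enforced
    // by the type rather than by a separate range check.
    arg.parse::<NonZeroUsize>()
        .map(Jobs::Fixed)
        .map_err(|_| format!("--jobs expects `auto` or a positive integer, got `{}`", arg))
}
```

A rejected value is a usage error, which would map to exit code 2 under the exit-code rules in this README.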
Each batch JSONL row, success or error, also includes "bytes": <input file size in bytes on disk>, a deterministic per-row field for sorting or spotting
large inputs. Successful rows contain the analysis object shown below plus
bytes. Per-file failures are written as
{"schema_version":1,"path":"...","error":"...","bytes":1234}. bytes is a
batch-row-only operational field; the single-file output object below does not
include it. Each file is isolated with panic capture, so a panic or decode
failure for one recording becomes an error row instead of aborting the batch.
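As a rough sketch of per-file isolation, the standard library's std::panic::catch_unwind can turn a panic into an ordinary error row. The names scan_isolated and json_escape are hypothetical, the "panic: " prefix is an assumption, and the analyze closure stands in for the real scan; audioscan's own implementation may differ.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Minimal JSON string escaping for this illustration only.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Runs `analyze` for one file and always returns exactly one JSONL row:
/// the analyzer's row on success, or an error row on failure or panic.
fn scan_isolated<F>(path: &str, bytes: u64, analyze: F) -> String
where
    F: FnOnce(&str) -> Result<String, String>,
{
    // AssertUnwindSafe is acceptable here because nothing the closure
    // touched is reused after a panic.
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| analyze(path)));
    let error = match outcome {
        Ok(Ok(row)) => return row,
        Ok(Err(message)) => message,
        Err(payload) => {
            // Panic payloads are usually &str or String; fall back otherwise.
            let detail = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            format!("panic: {}", detail)
        }
    };
    format!(
        r#"{{"schema_version":1,"path":"{}","error":"{}","bytes":{}}}"#,
        json_escape(path),
        json_escape(&error),
        bytes
    )
}
```

catch_unwind only works when the binary is built with unwinding panics; under panic = "abort" a panic still ends the whole process.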
Batch mode prints a live per-file progress line to stderr as each file completes, followed by the summary and slowest-file timing report:
audioscan: [3/2000] /archive/take_03.wav (1182ms)
audioscan: scanned 2000 file(s): 1996 ok, 3 partial, 1 failed in 41.7s
audioscan: slowest: big.flac 3201ms (118.0 MB), long.wav 1980ms (90.2 MB), take_03.wav 1182ms (44.1 MB)
Because progress lines stream as files finish rather than only at the end, a wedged or
slow file shows up live as the one with no completion line yet, so the run never
sits silent until the summary.
The slowest: line lists the slowest files with each file's elapsed time in
milliseconds and size.
Per-file wall-clock timing and progress live on stderr rather than in the JSONL
stream, so the JSON Lines on stdout stay byte-identical across --jobs counts.
Exit codes are 0 when the command completes and writes its requested output,
1 for fatal runtime failures such as unreadable output paths, no discovered
audio files, or a failed single-file scan, and 2 for usage or invalid-config
errors. Batch per-file error rows do not by themselves make the batch command
fail once the JSONL output has been written.
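The exit-code rules above can be written down as one small function. This is a sketch of the policy as described, not the tool's source; RunOutcome and exit_code are hypothetical names.

```rust
/// Hypothetical summary of how a run ended.
enum RunOutcome {
    /// Bad flags or invalid configuration.
    UsageError,
    /// A batch run; rows_failed counts per-file error rows.
    Batch { files_found: usize, rows_failed: usize, output_written: bool },
    /// A single-file run.
    Single { scan_ok: bool },
}

fn exit_code(outcome: &RunOutcome) -> u8 {
    match *outcome {
        RunOutcome::UsageError => 2,
        // No discovered audio files is a fatal runtime failure.
        RunOutcome::Batch { files_found: 0, .. } => 1,
        // So is failing to write the requested output.
        RunOutcome::Batch { output_written: false, .. } => 1,
        // rows_failed is deliberately ignored: per-file error rows alone
        // do not fail a batch whose JSONL was written.
        RunOutcome::Batch { .. } => 0,
        RunOutcome::Single { scan_ok } => if scan_ok { 0 } else { 1 },
    }
}
```

A caller that needs to know whether any file failed should therefore read the error rows (or the stderr summary) rather than rely on the exit status.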
Output
status is ok for a clean decode and partial when the scan completed after
skipping corrupt packets, detecting an incomplete stream, or exceeding a
configured timeout. warnings[] holds human-readable diagnostics for partial
output; it is empty for clean output.
With --strict, partial decodes become errors instead of JSON analysis rows.
container is the lowercased file extension from the input path, or "" for an
extensionless path.
integrated_lufs and loudness_range_lu are null together when the input is
too short or quiet to measure. true_peak_dbtp is null only for digital
silence, where there is no inter-sample peak to report. silences uses a simple
[start, end] seconds convention. Silence boundaries are quantized to the roughly 30 ms analysis window,
matching ffmpeg silencedetect.
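For intuition about the silences field, here is a simplified windowed detector using the documented defaults (RMS dBFS threshold, minimum gap in seconds, roughly 30 ms windows). It is a sketch under stated assumptions, not audioscan's algorithm: it takes mono f32 samples already in memory, uses non-overlapping windows of exactly 30 ms, and find_silences and push_if_long_enough are hypothetical names.

```rust
/// Returns [start, end] pairs in seconds for runs of quiet windows that
/// last at least `min_gap_s`.
fn find_silences(
    samples: &[f32],
    sample_rate: u32,
    threshold_dbfs: f64,
    min_gap_s: f64,
) -> Vec<[f64; 2]> {
    const WINDOW_S: f64 = 0.030; // assumed analysis window length
    let rate = f64::from(sample_rate);
    let window = ((rate * WINDOW_S).round() as usize).max(1);

    let mut silences = Vec::new();
    let mut run_start: Option<usize> = None; // sample index where the quiet run began
    let mut pos = 0usize;

    for chunk in samples.chunks(window) {
        let mean_square = chunk
            .iter()
            .map(|&s| f64::from(s) * f64::from(s))
            .sum::<f64>()
            / chunk.len() as f64;
        // 10*log10(mean square) is RMS level in dBFS; digital silence gives -inf.
        let level_dbfs = 10.0 * mean_square.log10();
        let silent = level_dbfs < threshold_dbfs;

        match (silent, run_start) {
            (true, None) => run_start = Some(pos),
            (false, Some(start)) => {
                push_if_long_enough(&mut silences, start, pos, rate, min_gap_s);
                run_start = None;
            }
            _ => {}
        }
        pos += chunk.len();
    }
    // A quiet run that reaches the end of the file still counts.
    if let Some(start) = run_start {
        push_if_long_enough(&mut silences, start, pos, rate, min_gap_s);
    }
    silences
}

fn push_if_long_enough(out: &mut Vec<[f64; 2]>, start: usize, end: usize, rate: f64, min_gap_s: f64) {
    let (start_s, end_s) = (start as f64 / rate, end as f64 / rate);
    if end_s - start_s >= min_gap_s {
        out.push([start_s, end_s]);
    }
}
```

Because every boundary falls on a window edge, reported times can be off from the true edge of a gap by up to one window, which is the quantization mentioned above.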
Validation
Checked against ffmpeg's ebur128 filter on generated signals:
| signal | metric | audioscan | ffmpeg |
|---|---|---|---|
| 1 kHz @ -3 dBFS + 6 s silence | integrated | -6.26 LUFS | -6.3 LUFS |
| | true peak | -3.0 dBTP | (-3 dBFS sine) |
| | silence | [6.0, 12.0] | (built at 6-12 s) |
| varied -6/-18/-3/-14/-9 dBFS | integrated | -9.46 LUFS | -9.5 LUFS |
| | loudness range | 11.0 LU | 11.0 LU |
Reproduce:
Note: LRA only agrees on signals with real loudness variation. On a degenerate two-level signal the percentile gating is unstable in both tools and they disagree, which is expected, not a bug.
Formats
Enabled: wav, flac, mp3, aac/m4a, ogg/vorbis, adpcm, mkv (symphonia defaults plus
mp3, aac, isomp4). Not yet enabled: aiff, alac, opus. Add the feature in
Cargo.toml when a recording needs it.
Status and next steps
Standalone by design, intentionally not yet wired into the catalog it was built
for. Swapping a production pipeline's measurement path for an audioscan
subprocess is a separate, careful change. Because the contract is "run a binary,
read JSON," that swap stays clean when I make it.
Candidate directions:
- a C interface so a Swift app can call the same core directly, with no subprocess
- bump symphonia to 0.6