libmagic-rs
A pure-Rust implementation of libmagic, the library that powers the file command for identifying file types. This project provides a memory-safe, efficient alternative to the C-based libmagic library.
[!NOTE] This is a clean-room implementation inspired by the original libmagic project. We respect and acknowledge the original work by Ian Darwin and the current maintainers led by Christos Zoulas.
Project Status
v0.5.0 -- The core file identification pipeline is functional. Common file types can be identified using text magic files today.
[!WARNING] Pre-1.0 API. libmagic-rs is a pre-1.0 crate and the public API may change between minor versions until v1.0.0 is cut. Pin an exact version in Cargo.toml if you need reproducible builds, and read CHANGELOG.md before upgrading. See issue #52 for the v1.0 stability roadmap.
- 1,200+ tests with >94% line coverage
- Zero unsafe code (unsafe_code = "forbid" enforced project-wide)
- Zero warnings with strict clippy linting
- Published on crates.io
Features
- Parse and evaluate text magic files (the stable, documented format)
- Identify files via CLI (rmagic) or as a library dependency
- Text and JSON output formats
- Built-in fallback rules for 10 common formats (ELF, PE, ZIP, TAR, GZIP, JPEG, PNG, GIF, BMP, PDF)
- Custom magic files via --magic-file
- Memory-mapped I/O with bounds checking
- Hierarchical rule evaluation with confidence scoring
- Stdin support (rmagic -)
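To give a feel for what a built-in fallback rule set might check, here is a minimal std-only sketch that matches the well-known leading bytes of the ten formats listed above. The `Signature` struct, the `SIGNATURES` table, and the `identify` function are hypothetical names for illustration; the crate's actual built-in rules may be organized quite differently.

```rust
/// One entry in a hypothetical built-in signature table.
struct Signature {
    offset: usize,
    magic: &'static [u8],
    name: &'static str,
}

/// Well-known magic bytes for the ten formats listed above.
/// Illustrative only: "MZ" is the DOS stub that precedes a PE header,
/// and real rules inspect more than the first few bytes.
const SIGNATURES: &[Signature] = &[
    Signature { offset: 0, magic: b"\x7fELF", name: "ELF" },
    Signature { offset: 0, magic: b"\x89PNG\r\n\x1a\n", name: "PNG" },
    Signature { offset: 0, magic: b"GIF8", name: "GIF" },
    Signature { offset: 0, magic: b"\xff\xd8\xff", name: "JPEG" },
    Signature { offset: 0, magic: b"%PDF-", name: "PDF" },
    Signature { offset: 0, magic: b"PK\x03\x04", name: "ZIP" },
    Signature { offset: 0, magic: b"\x1f\x8b", name: "GZIP" },
    Signature { offset: 257, magic: b"ustar", name: "TAR" },
    Signature { offset: 0, magic: b"BM", name: "BMP" },
    Signature { offset: 0, magic: b"MZ", name: "PE" },
];

/// Return the name of the first signature found in `buf`, if any.
fn identify(buf: &[u8]) -> Option<&'static str> {
    SIGNATURES.iter().find_map(|sig| {
        let end = sig.offset.checked_add(sig.magic.len())?;
        // `get` yields None when the buffer is too short, so no panic.
        let window = buf.get(sig.offset..end)?;
        (window == sig.magic).then_some(sig.name)
    })
}
```

Calling `identify(b"%PDF-1.7\n")` in this sketch yields `Some("PDF")`, and a buffer that matches no entry yields `None`.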
Supported Magic File Syntax
| Category | Supported |
|---|---|
| Types | byte, short, long, quad, float, double, string, pstring (with big/little-endian variants), unsigned variants (ubyte, ushort/ubeshort/uleshort, ulong/ubelong/ulelong, uquad/ubequad/ulequad), 32-bit dates (date/ldate/bedate/beldate/ledate/leldate), 64-bit dates (qdate/qldate/beqdate/beqldate/leqdate/leqldate), regex, and search/N |
| Regex | Binary-safe via regex::bytes::Regex. Flags: /c (case-insensitive), /s (match-start anchor advance), /l (line-based scan window). Counts: regex/N (N bytes), regex/Nl (N lines). All variants capped at 8192 bytes (FILE_REGEX_MAX). Compile size is clamped to 1 MiB (size_limit + dfa_size_limit) to bound compile-time DoS exposure from adversarial patterns. |
| Search | Bounded literal scan via memchr::memmem::find. search/N scans the first N bytes from the offset; the range is mandatory (NonZeroUsize). Match-end anchor advance for relative-offset children (matches GNU file semantics). |
| Operators | =, !=, <, >, <=, >=, & (bitwise AND with optional mask), ^ (bitwise XOR), ~ (bitwise NOT), x (any value) |
| Offsets | Absolute, from-end, indirect, and relative (all fully evaluated; magic-file &+N/&-N parsing for relative is pending) |
| Directives | !:strength (parsed; !:mime, !:ext, !:apple planned) |
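The bounded search described in the Search row can be sketched with the standard library alone. This version swaps `memchr::memmem::find` for a plain `windows` scan so it has no dependencies, and it assumes the match must lie entirely inside the N-byte window; `bounded_search` is a hypothetical helper, not the crate's API.

```rust
use std::num::NonZeroUsize;

/// Scan at most `range` bytes starting at `offset` for `needle`.
/// Returns (match_start, match_end) as absolute positions in `buf`.
fn bounded_search(
    buf: &[u8],
    offset: usize,
    range: NonZeroUsize,
    needle: &[u8],
) -> Option<(usize, usize)> {
    if needle.is_empty() {
        return None;
    }
    // Clamp the window to the buffer; an offset past the end yields None.
    let window_end = offset.checked_add(range.get())?.min(buf.len());
    let window = buf.get(offset..window_end)?;
    let pos = window.windows(needle.len()).position(|w| w == needle)?;
    let start = offset + pos;
    Some((start, start + needle.len()))
}
```

Taking `NonZeroUsize` makes a zero-length range unrepresentable, which mirrors the mandatory range noted above. The returned `match_end` is the anchor a relative-offset child rule would continue from.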
Quick Start
Installation
CLI Usage
# Basic file identification
# JSON output
# Use built-in rules (no external magic file needed)
# Custom magic file
# Multiple files
# Read from stdin
|
Library Usage
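use libmagic_rs::MagicDatabase;

// Load magic rules from a text magic file
let db = MagicDatabase::load_from_file("/usr/share/misc/magic")?;

// Identify file type ("example.bin" is a placeholder path)
let result = db.evaluate_file("example.bin")?;
// Field names `description` and `confidence` are assumed; check the crate docs
println!("File type: {}", result.description);
println!("Confidence: {:.0}%", result.confidence * 100.0);

// Or evaluate an in-memory buffer
let buffer = std::fs::read("example.bin")?;
let result = db.evaluate_buffer(&buffer)?;
if let Some(mime) = result.mime_type {
    println!("MIME type: {}", mime);
}

// Or use built-in rules (no external files needed)
let db = MagicDatabase::with_builtin_rules();
let result = db.evaluate_file("example.bin")?;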
Architecture
Magic File --> Parser --> AST --> Evaluator --> Match Results --> Output Formatter
|
Target File --> Memory Mapper --> File Buffer
| Module | Purpose |
|---|---|
| parser/ | Magic file DSL parsing into AST (nom-based) |
| evaluator/ | Rule evaluation with offset resolution, type interpretation, operator matching |
| output/ | Text (GNU file compatible) and JSON formatting |
| io/ | Memory-mapped file buffers with safe bounds checking |
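The three evaluator steps named in the table (offset resolution, type interpretation, operator matching) can be illustrated for a single rule as follows. This is a simplified sketch covering only a subset of offsets, types, and operators; every name in it (`Offset`, `Kind`, `Op`, `evaluate_rule`) is hypothetical and not taken from the crate. The `&` operator follows the magic(5) meaning of "all bits in the mask are set".

```rust
/// Where to read from (a subset of the supported offset kinds).
enum Offset {
    Absolute(usize),
    FromEnd(usize),
}

/// How to interpret the bytes (a subset of the supported types).
enum Kind {
    UByte,
    UBeShort,
    ULeShort,
    UBeLong,
    ULeLong,
}

/// Test applied to the value that was read.
enum Op {
    Equal(u64),
    NotEqual(u64),
    Less(u64),
    Greater(u64),
    AllBitsSet(u64), // `&`
    Any,             // `x`
}

/// Step 1: turn an offset into an absolute position.
fn resolve(offset: &Offset, len: usize) -> Option<usize> {
    match *offset {
        Offset::Absolute(n) => Some(n),
        Offset::FromEnd(n) => len.checked_sub(n),
    }
}

/// Step 2: read a bounds-checked value with the right width and byte order.
fn read(buf: &[u8], at: usize, kind: &Kind) -> Option<u64> {
    let width = match kind {
        Kind::UByte => 1,
        Kind::UBeShort | Kind::ULeShort => 2,
        Kind::UBeLong | Kind::ULeLong => 4,
    };
    // The slice has exactly `width` bytes, so the indexing below cannot panic.
    let b = buf.get(at..at.checked_add(width)?)?;
    Some(match kind {
        Kind::UByte => u64::from(b[0]),
        Kind::UBeShort => u64::from(u16::from_be_bytes([b[0], b[1]])),
        Kind::ULeShort => u64::from(u16::from_le_bytes([b[0], b[1]])),
        Kind::UBeLong => u64::from(u32::from_be_bytes([b[0], b[1], b[2], b[3]])),
        Kind::ULeLong => u64::from(u32::from_le_bytes([b[0], b[1], b[2], b[3]])),
    })
}

/// Step 3: apply the operator.
fn compare(value: u64, op: &Op) -> bool {
    match *op {
        Op::Equal(t) => value == t,
        Op::NotEqual(t) => value != t,
        Op::Less(t) => value < t,
        Op::Greater(t) => value > t,
        Op::AllBitsSet(mask) => (value & mask) == mask,
        Op::Any => true,
    }
}

/// A rule matches only if every step succeeds.
fn evaluate_rule(buf: &[u8], offset: &Offset, kind: &Kind, op: &Op) -> bool {
    resolve(offset, buf.len())
        .and_then(|at| read(buf, at, kind))
        .map_or(false, |v| compare(v, op))
}
```

Returning `Option` from the first two steps means an out-of-range read simply fails the rule rather than panicking, even for `Op::Any`.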
Key Types
Compatibility
libmagic-rs follows the OpenBSD approach: parse text magic files directly, prioritizing simplicity and correctness. Text magic files are stable across libmagic versions and work unchanged from system installations (/usr/share/misc/magic).
Compatibility is validated against the original file project test suite.
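Text magic files are line-oriented: each rule line has an offset, a type, a test value, and a message, with leading `>` characters marking nesting depth. The sketch below splits one line into those raw columns using only the standard library. It is a simplification (it ignores escaped spaces inside the test value, for instance) and is unrelated to the crate's nom-based parser; `RawRule`, `split_rule`, and `next_field` are hypothetical names.

```rust
/// The columns of one magic(5) rule line, kept as raw text.
#[derive(Debug, PartialEq)]
struct RawRule<'a> {
    level: usize,
    offset: &'a str,
    kind: &'a str,
    test: &'a str,
    message: &'a str,
}

/// Take one whitespace-delimited field and return it with the remainder.
fn next_field(s: &str) -> Option<(&str, &str)> {
    let s = s.trim_start();
    if s.is_empty() {
        return None;
    }
    let end = s.find(char::is_whitespace).unwrap_or(s.len());
    Some((&s[..end], &s[end..]))
}

/// Split one line into its columns. Returns None for blank lines,
/// comments (`#`), directives (`!:`), and lines with fewer than three fields.
fn split_rule<'a>(line: &'a str) -> Option<RawRule<'a>> {
    let line = line.trim_end();
    if line.is_empty() || line.starts_with('#') || line.starts_with("!:") {
        return None;
    }
    // Each leading '>' is one level of nesting under the previous rule.
    let level = line.chars().take_while(|&c| c == '>').count();
    let rest = &line[level..];
    let (offset, rest) = next_field(rest)?;
    let (kind, rest) = next_field(rest)?;
    let (test, rest) = next_field(rest)?;
    Some(RawRule { level, offset, kind, test, message: rest.trim_start() })
}
```

For example, `split_rule(">>5\tbyte\tx\tversion %c")` yields a rule at level 2 with offset `5`, type `byte`, test `x`, and message `version %c`.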
Security
- Memory Safety: unsafe_code = "forbid" enforced project-wide
- Bounds Checking: All buffer access protected
- Resource Limits: Configurable recursion depth, string length, and per-file timeout
- Fuzzing: Robustness testing with malformed inputs
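As an illustration of how bounds checking and a recursion limit can work together during hierarchical evaluation, consider this minimal sketch. The `Rule` struct, the `DepthExceeded` error, and `evaluate_tree` are hypothetical names, and the crate's real limits and error types are likely richer than this.

```rust
/// A hypothetical rule node: match `magic` at `offset`, then try children.
struct Rule {
    offset: usize,
    magic: &'static [u8],
    message: &'static str,
    children: Vec<Rule>,
}

/// Returned when rule nesting goes deeper than the configured limit.
#[derive(Debug, PartialEq)]
struct DepthExceeded;

/// Walk the rule tree, collecting messages from every matching rule.
fn evaluate_tree(
    rules: &[Rule],
    buf: &[u8],
    depth: usize,
    max_depth: usize,
    out: &mut Vec<&'static str>,
) -> Result<(), DepthExceeded> {
    if rules.is_empty() {
        return Ok(());
    }
    if depth > max_depth {
        return Err(DepthExceeded);
    }
    for rule in rules {
        // checked_add guards against offset arithmetic overflow.
        let end = match rule.offset.checked_add(rule.magic.len()) {
            Some(end) => end,
            None => continue,
        };
        // `get` returns None instead of panicking on an out-of-bounds range.
        if buf.get(rule.offset..end) == Some(rule.magic) {
            out.push(rule.message);
            evaluate_tree(&rule.children, buf, depth + 1, max_depth, out)?;
        }
    }
    Ok(())
}
```

Children are only visited when their parent matched, and a malicious or malformed rule set cannot drive the evaluator into unbounded recursion because the depth check fails first.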
Verifying Releases
All release artifacts are signed via Sigstore using GitHub Attestations.
See the release verification guide for details.
Roadmap
See ROADMAP.md for the full roadmap, or GitHub Milestones for issue tracking.
| Milestone | Status | Focus |
|---|---|---|
| v0.2.0 | shipped | Comparison operators, bitwise XOR/NOT, indirect/relative offsets, 64-bit integers |
| v0.3.0 | shipped | Regex, float/double, date/timestamp, pascal strings, meta-types |
| v0.4.0 | shipped | Evaluator submodule split, JSON metadata, parse warnings, improved errors |
| v0.5.x (current) | in flight | TOCTOU/search-path hardening, regex compile cache, validated constructors |
| v0.6.0 | planned | Value pattern refactor, MagicDatabase builder, Directive extension point |
| v1.0.0 | planned | 95%+ GNU file compatibility, stable API, fuzzing harness, full non_exhaustive |
Contributing
See CONTRIBUTING.md for development setup, coding guidelines, and submission process.
License
Licensed under the Apache License 2.0 - see LICENSE for details.
Support
Acknowledgments
- Ian Darwin for the original file command and libmagic
- Christos Zoulas and the current libmagic maintainers
- The Rust community for excellent tooling and ecosystem