Crate magic

About

This crate provides bindings for the libmagic C library, which recognizes the type of data contained in a file (or buffer).

You might be familiar with libmagic’s command-line interface, file:

$ file data/tests/rust-logo-128x128-blk.png
data/tests/rust-logo-128x128-blk.png: PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced

libmagic

Understanding how the libmagic C library, and thus this Rust crate, works requires some terminology.

libmagic at its core can analyze a file or buffer and return a mostly unstructured text that describes the analysis result. There are built-in tests for special cases such as symlinks and compressed files and there are magic databases with signatures which can be supplied by the user for the generic cases (“if those bytes look like this, it’s a PNG image file”).
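The signature idea can be sketched in a few lines of Rust. This is only an illustration of prefix matching, not how libmagic is implemented; the `describe` helper and its two-entry table are hypothetical, and real magic databases contain far richer rules.

```rust
/// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Hypothetical helper: returns a description if `data` starts with a known signature.
fn describe(data: &[u8]) -> Option<&'static str> {
    // Each entry pairs a byte prefix with a human-readable description.
    let signatures: [(&[u8], &'static str); 2] = [
        (&PNG_SIGNATURE[..], "PNG image data"),
        (&b"%PDF-"[..], "PDF document"),
    ];
    for (prefix, description) in signatures.iter() {
        if data.starts_with(prefix) {
            return Some(*description);
        }
    }
    // No signature matched; the content stays unidentified.
    None
}
```

A buffer that merely starts with the PNG signature is reported as "PNG image data" here, whether or not the remaining bytes form a valid image.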

The analysis behaviour can be influenced by so-called flags and parameters. Flags are either set or unset and do not have a value; parameters have a value.
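The difference between flags and parameters can be pictured with a toy configuration type. Everything in this sketch is hypothetical (the `Config` type, the flag bit values and the `max_recursion` parameter name); it does not mirror the crate's or libmagic's actual definitions.

```rust
use std::collections::HashMap;

// Hypothetical flag bits for illustration; not libmagic's actual values.
const FLAG_MIME_TYPE: u32 = 1 << 0;
const FLAG_MIME_ENCODING: u32 = 1 << 1;

/// Toy configuration: flags are bits that are set or unset,
/// parameters are named settings that carry a value.
#[derive(Default)]
struct Config {
    flags: u32,
    params: HashMap<&'static str, usize>,
}

impl Config {
    /// Sets a flag bit; a flag has no value beyond being set.
    fn set_flag(&mut self, flag: u32) {
        self.flags |= flag;
    }

    /// Reports whether a flag bit is set.
    fn has_flag(&self, flag: u32) -> bool {
        (self.flags & flag) != 0
    }

    /// Stores a value for a named parameter.
    fn set_param(&mut self, name: &'static str, value: usize) {
        self.params.insert(name, value);
    }

    /// Looks up a parameter's value, if one was stored.
    fn param(&self, name: &str) -> Option<usize> {
        self.params.get(name).copied()
    }
}
```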

Databases can be in text form or compiled binary form for faster access. They can be loaded from files on disk or from in-memory buffers. A regular libmagic / file installation contains a default database file that includes a plethora of file formats.

Most libmagic functionality requires a configured instance which is called a “magic cookie”. Creating a cookie instance requires initial flags; it usually also needs databases to be loaded.

Usage example

// open a new configuration with flags
let cookie = magic::Cookie::open(magic::cookie::Flags::ERROR)?;

// load a specific database
// (so exact test text assertion below works regardless of the system's default database version)
let database = ["data/tests/db-images-png"].try_into()?;
// you can instead load the default database
//let database = Default::default();

let cookie = cookie.load(&database)?;

// analyze a test file
let file_to_analyze = "data/tests/rust-logo-128x128-blk.png";
let expected_analysis_result = "PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced";
assert_eq!(cookie.file(file_to_analyze)?, expected_analysis_result);

See further examples in examples/.

MIME example

Return a MIME type with “charset” encoding parameter:

$ file --mime data/tests/rust-logo-128x128-blk.png
data/tests/rust-logo-128x128-blk.png: image/png; charset=binary

// open a new configuration with flags for mime type and encoding
let flags = magic::cookie::Flags::MIME_TYPE | magic::cookie::Flags::MIME_ENCODING;
let cookie = magic::Cookie::open(flags)?;

// load a specific database
let database = ["data/tests/db-images-png"].try_into()?;
let cookie = cookie.load(&database)?;

// analyze a test file
let file_to_analyze = "data/tests/rust-logo-128x128-blk.png";
let expected_analysis_result = "image/png; charset=binary";
assert_eq!(cookie.file(file_to_analyze)?, expected_analysis_result);

See magic::cookie::Flags::MIME.

Filename extensions example

Return slash-separated filename extensions (the “.png” in “example.png”) from file contents (the input filename is not used for detection):

$ file --extension data/tests/rust-logo-128x128-blk.png
data/tests/rust-logo-128x128-blk.png: png

// open a new configuration with flags for filename extension
let flags = magic::cookie::Flags::EXTENSION;
let cookie = magic::Cookie::open(flags)?;

// load a specific database
let database = ["data/tests/db-images-png"].try_into()?;
let cookie = cookie.load(&database)?;

// analyze a test file
let file_to_analyze = "data/tests/rust-logo-128x128-blk.png";
let expected_analysis_result = "png";
assert_eq!(cookie.file(file_to_analyze)?, expected_analysis_result);

See magic::cookie::Flags::EXTENSION.
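Because several extensions may be returned in one slash-separated string, callers typically split the text themselves. A minimal sketch, assuming the text contains only extensions separated by `/`; the `split_extensions` helper is hypothetical and not part of this crate:

```rust
/// Hypothetical helper: splits slash-separated extension text into a list.
fn split_extensions(analysis: &str) -> Vec<&str> {
    analysis
        .split('/')
        .map(str::trim)
        // Drop empty segments, e.g. when the input itself is empty.
        .filter(|extension| !extension.is_empty())
        .collect()
}
```

For example, `split_extensions("jpeg/jpg/jpe/jfif")` yields the four extensions as separate string slices.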

Further reading

Note that while some libmagic functions return somewhat structured text, e.g. MIME types and file extensions, the magic crate does not attempt to parse them into Rust data types since the format is not guaranteed by the C FFI API.
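If you need structure anyway, you have to split the text yourself and treat the result as best effort. The following sketch assumes text shaped like the MIME example above (a type, optionally followed by `; charset=…`); the `split_mime` helper is hypothetical and not part of this crate.

```rust
/// Hypothetical helper: splits text such as "image/png; charset=binary"
/// into the MIME type and an optional charset. Best effort only, since
/// the text format is not guaranteed.
fn split_mime(analysis: &str) -> (&str, Option<&str>) {
    let mut parts = analysis.split(';').map(str::trim);
    // `split` always yields at least one item, so the fallback is only a safeguard.
    let mime_type = parts.next().unwrap_or("");
    // Look for a "charset=" parameter among the remaining parts.
    let charset = parts.find_map(|part| part.strip_prefix("charset="));
    (mime_type, charset)
}
```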

Check the crate README for required dependencies and MSRV.

Safety

This crate is a binding to the libmagic C library and as such is subject to its security problems. Please note that libmagic has several CVEs, listed on e.g. Repology. Make sure that you are using an up-to-date version of libmagic, ideally add additional security layers such as sandboxing (which this crate does not provide), and do not use it on untrusted input, e.g. from users on the internet!

The Rust code of this crate needs to use some unsafe code for interacting with the libmagic C FFI.

This crate has not been audited, nor is it ready for production use.

This Rust project / crate is not affiliated with the original file / libmagic C project.

Use cases

libmagic can help to identify unknown content. It does this by looking at byte patterns, among other things. This does not guarantee that e.g. a file which is detected as a PNG image is indeed a valid PNG image.

Maybe you just want a mapping from file name extensions to MIME types instead, e.g. “.png” ↔ “image/png”? In this case you do not even need to look at file contents and could use e.g. the mime_guess crate.
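To illustrate why no file contents are needed for that use case, here is a deliberately tiny lookup that only inspects the file name. The `mime_from_extension` helper and its three-entry table are hypothetical; a dedicated crate covers far more formats.

```rust
use std::path::Path;

/// Hypothetical, deliberately tiny lookup from filename extension to MIME type.
/// Only the name is inspected; the file is never opened.
fn mime_from_extension(path: &str) -> Option<&'static str> {
    // Take the extension, if any, and compare it case-insensitively.
    let extension = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match extension.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "txt" => Some("text/plain"),
        _ => None,
    }
}
```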

Maybe you want to be certain that a file is valid for a known format, e.g. a PNG image? In this case you should use a parser for that format specifically, e.g. the image crate.

Maybe you want to know if a file contains other, malicious content? In this case you should use antivirus software, e.g. ClamAV or VirusTotal.

Re-exports

  • pub use crate::cookie::Cookie;

Modules

Functions

  • Returns the version of the libmagic C library as reported by itself.