Magic file parser module
This module handles parsing of magic files into an Abstract Syntax Tree (AST) that can be evaluated against file buffers for type identification.
§Overview
The parser implements a complete pipeline for transforming magic file text into a hierarchical rule structure suitable for evaluation. The pipeline consists of:
- Preprocessing: Line handling, comment removal, continuation processing
- Parsing: Individual magic rule parsing using nom combinators
- Hierarchy Building: Constructing parent-child relationships based on indentation
- Validation: Type checking and offset resolution
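The hierarchy-building stage can be illustrated with a minimal sketch. It assumes that nesting depth is given by the number of leading `>` characters on a rule line and that `#` starts a comment line. `SketchRule`, `split_level`, `attach`, and `build_hierarchy` are hypothetical names used only for illustration; they are not part of this crate, whose parser uses nom and a richer `MagicRule` type.

```rust
/// Hypothetical stand-in for a parsed rule; the crate's real `MagicRule` carries more fields.
#[derive(Debug, PartialEq)]
struct SketchRule {
    text: String,
    children: Vec<SketchRule>,
}

/// Splits a line into (nesting level, remainder) by counting leading `>` markers.
fn split_level(line: &str) -> (usize, &str) {
    let level = line.chars().take_while(|&c| c == '>').count();
    // `>` is a single byte in UTF-8, so the char count is also a valid byte index.
    (level, &line[level..])
}

/// Attaches `rule` at the given depth, under the most recently added rule at each level.
fn attach(siblings: &mut Vec<SketchRule>, level: usize, rule: SketchRule) {
    if level == 0 {
        siblings.push(rule);
        return;
    }
    match siblings.last_mut() {
        Some(parent) => attach(&mut parent.children, level - 1, rule),
        // Assumption: a child line with no parent is kept at the current level.
        None => siblings.push(rule),
    }
}

/// Skips blank lines and `#` comments, then nests the remaining rules by `>` depth.
fn build_hierarchy(input: &str) -> Vec<SketchRule> {
    let mut roots = Vec::new();
    for line in input.lines() {
        let trimmed = line.trim();
        if trimmed.is_empty() || trimmed.starts_with('#') {
            continue;
        }
        let (level, rest) = split_level(trimmed);
        let rule = SketchRule {
            text: rest.to_string(),
            children: Vec::new(),
        };
        attach(&mut roots, level, rule);
    }
    roots
}
```

Calling `build_hierarchy` on a top-level rule followed by two `>`-prefixed lines yields one root with two children. The sketch does not handle line continuations or validation, which the real pipeline performs in separate stages.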
§Format Detection and Loading
The module automatically detects and handles three types of magic file formats:
- Text files: Human-readable magic rule definitions
- Directories: Collections of magic files (Magdir pattern)
- Binary files: Compiled .mgc files (currently unsupported)
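One plausible way to tell the three formats apart is sketched below. It treats any directory as a Magdir collection and recognizes a compiled database by its leading magic number, which in the reference `file` implementation is `0xF11E041C` written in the host's byte order. Whether `detect_format()` uses these exact criteria is an assumption; `SketchFormat`, `classify`, and `detect` are hypothetical names, not this crate's API.

```rust
use std::fs;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical stand-in for the crate's format enum.
#[derive(Debug, PartialEq)]
enum SketchFormat {
    Text,
    Directory,
    Binary,
}

// 0xF11E041C as it appears on disk for little-endian and big-endian writers.
const MGC_MAGIC_LE: [u8; 4] = [0x1C, 0x04, 0x1E, 0xF1];
const MGC_MAGIC_BE: [u8; 4] = [0xF1, 0x1E, 0x04, 0x1C];

/// Pure classification from "is it a directory" plus the first bytes of the file.
fn classify(is_dir: bool, head: &[u8]) -> SketchFormat {
    if is_dir {
        SketchFormat::Directory
    } else if head.starts_with(&MGC_MAGIC_LE) || head.starts_with(&MGC_MAGIC_BE) {
        SketchFormat::Binary
    } else {
        // Assumption: anything that is not a directory or a compiled database is text.
        SketchFormat::Text
    }
}

/// Inspects the path on disk and classifies it; I/O failures are returned to the caller.
fn detect(path: &Path) -> io::Result<SketchFormat> {
    if fs::metadata(path)?.is_dir() {
        return Ok(classify(true, &[]));
    }
    // Read at most four bytes, enough to compare against the magic number.
    let mut head = Vec::with_capacity(4);
    fs::File::open(path)?.take(4).read_to_end(&mut head)?;
    Ok(classify(false, &head))
}
```

Keeping `classify` free of I/O makes the decision logic easy to test without touching the filesystem.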
§Unified Loading API
The recommended entry point for loading magic files is load_magic_file(), which
automatically detects the format and dispatches to the appropriate handler:
use libmagic_rs::parser::load_magic_file;
use std::path::Path;
// Works with text files
let rules = load_magic_file(Path::new("/usr/share/misc/magic"))?;
// Also works with directories
let rules = load_magic_file(Path::new("/usr/share/misc/magic.d"))?;
// Binary .mgc files return an error with guidance
match load_magic_file(Path::new("/usr/share/misc/magic.mgc")) {
    Ok(rules) => { /* ... */ },
    Err(e) => eprintln!("Use --use-builtin for binary files: {}", e),
}
§Three-Tier Loading Strategy
The loading process works as follows:
- Format Detection: detect_format() examines the path to determine the file type
- Dispatch to Handler:
  - Text files -> parse_text_magic_file() after reading contents
  - Directories -> load_magic_directory() to load and merge all files
  - Binary files -> Returns error suggesting --use-builtin option
- Return Merged Rules: All rules are returned in a single Vec<MagicRule>
§Examples
§Loading Magic Files (Recommended)
Use the unified load_magic_file() API for automatic format detection:
use libmagic_rs::parser::load_magic_file;
use std::path::Path;
let rules = load_magic_file(Path::new("/usr/share/misc/magic"))?;
println!("Loaded {} magic rules", rules.len());
§Parsing Text Content Directly
For parsing magic rule text that’s already in memory:
use libmagic_rs::parser::parse_text_magic_file;
let magic_content = r#"
0 string \x7fELF ELF executable
>4 byte 1 32-bit
>4 byte 2 64-bit
"#;
let rules = parse_text_magic_file(magic_content)?;
assert_eq!(rules.len(), 1);
assert_eq!(rules[0].children.len(), 2);
§Loading a Directory Explicitly
For Magdir-style directories containing multiple magic files:
use libmagic_rs::parser::load_magic_directory;
use std::path::Path;
// Directory structure:
// /usr/share/file/magic.d/
// ├── elf
// ├── archive
// └── text
let rules = load_magic_directory(Path::new("/usr/share/file/magic.d"))?;
// Rules from all files are merged in alphabetical order by filename
§Migration Note
For users upgrading from direct function calls:
- Old approach: Call detect_format() then dispatch manually
- New approach: Use load_magic_file() for automatic dispatching
The individual functions (parse_text_magic_file(), load_magic_directory())
remain available for advanced use cases where you need direct control.
Key differences:
- load_magic_file(): Unified API with automatic format detection (recommended)
- parse_text_magic_file(): Parses a single text string containing magic rules
- load_magic_directory(): Loads and merges all magic files from a directory
- detect_format(): Low-level format detection (now called internally by load_magic_file())
Error handling in load_magic_directory():
- Critical errors (I/O failures, invalid UTF-8): Returns ParseError immediately
- Non-critical errors (parse failures in individual files): Logs warning to stderr and continues
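The two error classes and the filename-ordered merge can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `parse_sketch` and `load_directory_sketch` are hypothetical names, rules are represented as plain strings, `io::Error` stands in for `ParseError`, and the toy parser rejects any line that does not start with a digit or `>` purely so that the sketch has a failure mode.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical per-file parser: one "rule" per non-empty, non-comment line.
fn parse_sketch(text: &str) -> Result<Vec<String>, String> {
    let mut rules = Vec::new();
    for line in text.lines().map(str::trim) {
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Toy validity check so that some inputs fail to parse.
        if !line.starts_with(|c: char| c.is_ascii_digit() || c == '>') {
            return Err(format!("unrecognized rule: {}", line));
        }
        rules.push(line.to_string());
    }
    Ok(rules)
}

/// Loads every regular file in `dir` and merges the parsed rules in filename order.
fn load_directory_sketch(dir: &Path) -> io::Result<Vec<String>> {
    // Failing to list the directory is treated as critical.
    let mut paths: Vec<PathBuf> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() {
            paths.push(path);
        }
    }
    // Sort so the merge order does not depend on the filesystem's listing order.
    // This is a byte-wise comparison, which matches alphabetical order for ASCII names.
    paths.sort();

    let mut merged = Vec::new();
    for path in &paths {
        // Critical: read failures and invalid UTF-8 both surface as `io::Error` here.
        let text = fs::read_to_string(path)?;
        match parse_sketch(&text) {
            Ok(rules) => merged.extend(rules),
            // Non-critical: report the file on stderr and continue with the rest.
            Err(err) => eprintln!("warning: skipping {}: {}", path.display(), err),
        }
    }
    Ok(merged)
}
```

Under this split, one malformed file in a large Magdir collection does not prevent the remaining files from loading, while an unreadable file still stops the load with an error.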
Re-exports§
pub use ast::Endianness;
pub use ast::MagicRule;
pub use ast::OffsetSpec;
pub use ast::Operator;
pub use ast::StrengthModifier;
pub use ast::TypeKind;
pub use ast::Value;
pub use grammar::parse_number;
pub use grammar::parse_offset;
Modules§
- ast
- Abstract Syntax Tree definitions for magic rules
- grammar
- Grammar parsing for magic files using nom parser combinators
- types
- Type keyword parsing for magic file types
Enums§
- MagicFileFormat
- Represents the format of a magic file or directory
Functions§
- detect_format
- Detect the format of a magic file or directory
- load_magic_directory
- Loads and parses all magic files from a directory, merging them into a single rule set.
- load_magic_file
- Loads magic rules from a file or directory, automatically detecting the format.
- parse_text_magic_file
- Parses a complete magic file from raw text input.