§file-identify
A Rust library for identifying file types based on extensions, content, and shebangs.
This library provides a comprehensive way to identify files by analyzing:
- File extensions and special filenames
- File content (binary vs text detection)
- Shebang lines for executable scripts
- File system metadata (permissions, file type)
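To make the text versus binary distinction concrete, here is a minimal sketch of one common heuristic: sample a bounded prefix of the data and treat it as text only if every byte falls in a "printable or usual control character" set. This is an illustration under stated assumptions, not this crate's implementation. The helper names `looks_like_text` and `is_text_byte`, the 1024-byte sample size, and the exact byte set are all assumptions chosen for the example.

```rust
use std::io::{self, Read};

/// Hypothetical byte classifier (an assumption, not the crate's rule):
/// accepts common control characters (bell through carriage return, escape),
/// printable ASCII, and all high bytes so that UTF-8 and Latin-1 text pass.
fn is_text_byte(b: u8) -> bool {
    matches!(b, 7..=13 | 27 | 0x20..=0x7E | 0x80..=0xFF)
}

/// Hypothetical helper: reads at most 1024 bytes from `reader` and reports
/// whether all of them look like text. Empty input counts as text.
fn looks_like_text<R: Read>(reader: R) -> io::Result<bool> {
    let mut sample = Vec::new();
    // Only inspect a bounded prefix so large files stay cheap to classify.
    reader.take(1024).read_to_end(&mut sample)?;
    Ok(sample.iter().all(|&b| is_text_byte(b)))
}
```

Because the helper accepts any `Read`, it can be exercised with an in-memory byte slice as well as an open file. A NUL byte or a DEL byte in the sample is enough to classify the data as binary under this heuristic.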
§Quick Start
use file_identify::{tags_from_path, tags_from_filename, FileIdentifier};

// Simple filename identification
let tags = tags_from_filename("script.py");
assert!(tags.contains("python"));
assert!(tags.contains("text"));

// Full file identification from filesystem path.
// `file_path` is assumed to point at an existing Python script on disk.
let tags = tags_from_path(&file_path).unwrap();
assert!(tags.contains("file"));
assert!(tags.contains("python"));

// Customized identification with builder pattern
let identifier = FileIdentifier::new()
    .skip_content_analysis() // Skip text vs binary detection
    .skip_shebang_analysis(); // Skip shebang parsing
let tags = identifier.identify(&file_path).unwrap();
assert!(tags.contains("file"));
assert!(tags.contains("python"));
§Tag System
Files are identified using a set of standardized tags:
- Type tags: file, directory, symlink, socket
- Mode tags: executable, non-executable
- Encoding tags: text, binary
- Language/format tags: python, javascript, json, xml, etc.
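The tag categories above lend themselves to simple set queries. The sketch below shows one way a caller might act on a tag set. It assumes the tags can be viewed as a `HashSet<&str>`, which is an assumption about the shape of the data rather than the crate's documented return type. The helper names `language_tags` and `is_lintable_python` are hypothetical.

```rust
use std::collections::HashSet;

// Tag names taken from the categories listed above.
const TYPE_TAGS: [&str; 4] = ["file", "directory", "symlink", "socket"];
const MODE_TAGS: [&str; 2] = ["executable", "non-executable"];
const ENCODING_TAGS: [&str; 2] = ["text", "binary"];

/// Hypothetical helper: returns the tags that are not type, mode, or
/// encoding tags, sorted for stable output.
fn language_tags<'a>(tags: &HashSet<&'a str>) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = tags
        .iter()
        .copied()
        .filter(|t| {
            !TYPE_TAGS.contains(t) && !MODE_TAGS.contains(t) && !ENCODING_TAGS.contains(t)
        })
        .collect();
    out.sort_unstable();
    out
}

/// Hypothetical policy: only regular text files tagged `python` are
/// worth handing to a Python linter.
fn is_lintable_python(tags: &HashSet<&str>) -> bool {
    tags.contains("file") && tags.contains("text") && tags.contains("python")
}
```

Checking several tags together, as `is_lintable_python` does, avoids acting on directories or binary files that happen to carry a language-looking name.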
§Error Handling
Functions that access the filesystem return Result types. The main error conditions are:
- IdentifyError::PathNotFound - when the specified path doesn’t exist
- IdentifyError::IoError - for other I/O related errors
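A caller will usually want to treat a missing path differently from other I/O failures. The sketch below illustrates that pattern with a local stand-in enum so it compiles on its own. Only the two variant names come from the documentation above. The type name `IdentifyErrorSketch`, the payload of each variant, the `Vec<String>` tag type, and the helper `tags_or_empty` are assumptions, and the real `IdentifyError` may be shaped differently.

```rust
use std::io;

/// Stand-in for the crate's error type (NOT the real definition).
/// The variant payloads here are assumptions made for illustration.
#[derive(Debug)]
enum IdentifyErrorSketch {
    PathNotFound { path: String },
    IoError(io::Error),
}

/// Hypothetical caller policy: a missing path yields an empty tag list,
/// while any other error is passed back to the caller unchanged.
fn tags_or_empty(
    result: Result<Vec<String>, IdentifyErrorSketch>,
) -> Result<Vec<String>, IdentifyErrorSketch> {
    match result {
        Ok(tags) => Ok(tags),
        // The path vanished or never existed: treat as "nothing to tag".
        Err(IdentifyErrorSketch::PathNotFound { .. }) => Ok(Vec::new()),
        // Permission problems, read failures, and similar stay errors.
        Err(other) => Err(other),
    }
}
```

With the real crate, the same `match` would be written against `IdentifyError` and the result of a call such as `tags_from_path`, after checking the actual variant definitions.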
Modules§
Structs§
- FileIdentifier: Configuration for file identification behavior.
- ShebangTuple: A tuple-like immutable container for shebang components that matches Python’s tuple behavior.
Enums§
- IdentifyError: Errors that can occur during file identification.
Functions§
- file_is_text: Determine if a file contains text or binary data.
- is_text: Determine if data from a reader contains text or binary content.
- parse_shebang: Parse a shebang line from a reader and return raw shebang components.
- parse_shebang_from_file: Parse shebang line from an executable file and return raw shebang components.
- tags_from_filename: Identify a file based only on its filename.
- tags_from_interpreter: Identify tags based on a shebang interpreter.
- tags_from_path: Identify a file from its filesystem path.
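Several of the functions above work with shebang lines, so a small illustration of what "shebang components" can mean may help. The sketch below reads the first line from a reader, checks for the `#!` prefix, and splits the rest on whitespace. It is a simplified illustration, not the crate's `parse_shebang`. The helper names `shebang_components` and `interpreter_name` are hypothetical, and dropping a leading `/usr/bin/env` is a design choice made for this example only.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: returns the whitespace-separated parts of a `#!`
/// line, or an empty vector when the first line is not a shebang.
fn shebang_components<R: BufRead>(mut reader: R) -> io::Result<Vec<String>> {
    let mut first_line = Vec::new();
    reader.read_until(b'\n', &mut first_line)?;
    // A first line that is not valid UTF-8 is treated as "no shebang".
    let line = match std::str::from_utf8(&first_line) {
        Ok(s) => s,
        Err(_) => return Ok(Vec::new()),
    };
    let rest = match line.strip_prefix("#!") {
        Some(rest) => rest,
        None => return Ok(Vec::new()),
    };
    let mut parts: Vec<String> = rest.split_whitespace().map(str::to_owned).collect();
    // With `#!/usr/bin/env python3`, the interpreter is the second word.
    if parts.first().map(String::as_str) == Some("/usr/bin/env") {
        parts.remove(0);
    }
    Ok(parts)
}

/// Hypothetical helper: the final path segment of the first component,
/// e.g. `bash` for `/bin/bash`.
fn interpreter_name(parts: &[String]) -> Option<&str> {
    let first = parts.first()?;
    first.rsplit('/').next()
}
```

An interpreter name obtained this way is the kind of value one would expect to pass on to a lookup such as `tags_from_interpreter`, though the exact argument that function expects should be checked against its own documentation.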
Type Aliases§
- Result: Result type for file identification operations.