Crate file_identify

§file-identify

A Rust library for identifying file types based on extensions, content, and shebangs.

This library provides a comprehensive way to identify files by analyzing:

  • File extensions and special filenames
  • File content (binary vs text detection)
  • Shebang lines for executable scripts
  • File system metadata (permissions, file type)
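
As a rough illustration of binary vs text detection, the sketch below samples the start of a stream and rejects it as binary if it contains control bytes outside a small allow-list. The helper names `looks_like_text` and `is_text_byte`, the 1024-byte sample size, and the allow-list are assumptions for this example; this is one common heuristic, not necessarily what the crate's `is_text` does.

```rust
use std::io::{self, Read};

/// Hypothetical helper (not part of file_identify): reports whether the
/// first 1024 bytes of `reader` look like text.
fn looks_like_text<R: Read>(reader: R) -> io::Result<bool> {
    let mut sample = Vec::with_capacity(1024);
    // `take` caps how much is read, so large inputs are only sampled.
    reader.take(1024).read_to_end(&mut sample)?;
    // Empty input is treated as text because no byte disqualifies it.
    Ok(sample.iter().all(|&byte| is_text_byte(byte)))
}

/// Assumed allow-list: BEL through CR, ESC, printable ASCII, and all
/// high bytes (so UTF-8 and Latin-1 content passes).
fn is_text_byte(byte: u8) -> bool {
    matches!(byte, 7..=13 | 27) || (0x20..0x7f).contains(&byte) || byte >= 0x80
}
```

Because a byte slice implements `Read`, a call such as `looks_like_text(&b"hello\n"[..])` works without touching the filesystem.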

§Quick Start

use file_identify::{tags_from_path, tags_from_filename};

// Identify a Python file
let tags = tags_from_filename("script.py");
assert!(tags.contains("python"));
assert!(tags.contains("text"));

// Identify from a filesystem path
// (`file_path` is assumed to point to an existing Python file)
let tags = tags_from_path(&file_path).unwrap();
assert!(tags.contains("file"));
assert!(tags.contains("python"));

§Tag System

Files are identified using a set of standardized tags:

  • Type tags: file, directory, symlink, socket
  • Mode tags: executable, non-executable
  • Encoding tags: text, binary
  • Language/format tags: python, javascript, json, xml, etc.
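
To make the type and mode tags concrete, here is a minimal sketch of how such tags could be derived with the standard library alone. The helpers `type_tag` and `mode_tag` are hypothetical and are not the crate's API. Socket detection needs platform-specific APIs and is left out, and the executable check assumes Unix-style permission bits.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: maps a path to a type tag, or `None` for kinds
/// this sketch does not cover (sockets, FIFOs, devices).
fn type_tag(path: &Path) -> io::Result<Option<&'static str>> {
    // symlink_metadata does not follow the final symlink, so links are
    // reported as links rather than as their targets.
    let file_type = fs::symlink_metadata(path)?.file_type();
    Ok(if file_type.is_symlink() {
        Some("symlink")
    } else if file_type.is_dir() {
        Some("directory")
    } else if file_type.is_file() {
        Some("file")
    } else {
        None
    })
}

/// Hypothetical helper: derives the mode tag from Unix permission bits.
/// Any execute bit (owner, group, or other) counts as executable.
fn mode_tag(mode: u32) -> &'static str {
    if mode & 0o111 != 0 {
        "executable"
    } else {
        "non-executable"
    }
}
```

Under these assumptions, `mode_tag(0o755)` yields `executable` and `mode_tag(0o644)` yields `non-executable`.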

§Error Handling

Functions that access the filesystem return Result types, with failures reported through the IdentifyError enum listed below.
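
The general pattern of returning a crate-specific Result from a filesystem call can be sketched as follows. `SketchError`, `SketchResult`, and `is_directory` are all hypothetical names invented for this example. The variants shown are assumptions and should not be read as the actual variants of `IdentifyError`.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for illustration only; the variants are
/// assumptions, not the crate's IdentifyError.
#[derive(Debug)]
enum SketchError {
    NotFound(PathBuf),
    Io(io::Error),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::NotFound(path) => write!(f, "path not found: {}", path.display()),
            SketchError::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

/// Hypothetical alias mirroring the idea of a crate-level Result type.
type SketchResult<T> = Result<T, SketchError>;

/// Hypothetical filesystem probe: separates a missing path from other
/// I/O failures so that callers can react to each case.
fn is_directory(path: &Path) -> SketchResult<bool> {
    match fs::symlink_metadata(path) {
        Ok(meta) => Ok(meta.is_dir()),
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            Err(SketchError::NotFound(path.to_path_buf()))
        }
        Err(err) => Err(SketchError::Io(err)),
    }
}
```

A caller can then `match` on the returned value and, for example, skip missing paths while still surfacing permission or hardware errors.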

Modules§

extensions
interpreters
tags

Enums§

IdentifyError
Errors that can occur during file identification.

Functions§

file_is_text
Determine if a file contains text or binary data.
is_text
Determine if data from a reader contains text or binary content.
parse_shebang
Parse a shebang line from a reader and return interpreter tags.
parse_shebang_from_file
Parse shebang line from an executable file and return interpreter tags.
tags_from_filename
Identify a file based only on its filename.
tags_from_interpreter
Identify tags based on a shebang interpreter.
tags_from_path
Identify a file from its filesystem path.
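
To illustrate the idea behind the shebang functions above, the following sketch extracts an interpreter name from the first line of a reader. `shebang_interpreter` is a hypothetical helper, not the crate's `parse_shebang`. Its handling of `env` and its skipping of flags are simplifying assumptions, and it returns a bare name rather than tags.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: returns the interpreter named by a shebang line,
/// or `None` when the input does not start with `#!`.
fn shebang_interpreter<R: BufRead>(mut reader: R) -> io::Result<Option<String>> {
    let mut first_line = Vec::new();
    // Read raw bytes so that binary input does not fail UTF-8 validation.
    reader.read_until(b'\n', &mut first_line)?;
    let line = String::from_utf8_lossy(&first_line);

    let Some(rest) = line.trim_end().strip_prefix("#!") else {
        return Ok(None);
    };
    let mut parts = rest.split_whitespace();
    let Some(mut program) = parts.next() else {
        return Ok(None);
    };
    // `#!/usr/bin/env python3` names the real interpreter as an argument;
    // flags such as `-S` are skipped (a simplification).
    if program == "env" || program.ends_with("/env") {
        match parts.find(|part| !part.starts_with('-')) {
            Some(next) => program = next,
            None => return Ok(None),
        }
    }
    // Keep only the final path component, e.g. `/bin/bash` becomes `bash`.
    let name = program.rsplit('/').next().unwrap_or(program);
    Ok(Some(name.to_string()))
}
```

A name obtained this way could then be mapped to language tags, which is the role the listing attributes to `tags_from_interpreter`.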

Type Aliases§

Result
Result type for file identification operations.