auto_encoder
auto_encoder is a Rust library for automatically detecting and encoding text and binary file formats, including language-specific encodings.
Features
- Automatic Encoding Detection: Detects text encoding based on locale or content.
- Binary Format Detection: Checks if a given file is a known binary format by inspecting its initial bytes.
- HTML Language Detection: Extracts and detects the language of an HTML document from its content.
Installation
Add this to your Cargo.toml:
[dependencies]
auto_encoder = "0.1"
Usage
Encoding Detection
Automatically detect the encoding for a given locale:
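use auto_encoder::encoding_for_locale;
// "ja-jp" is an example locale; unwrap() panics if the locale is not found.
let encoding = encoding_for_locale("ja-jp").unwrap();
println!("Encoding for locale: {:?}", encoding);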
Encode bytes from a given HTML content and language:
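use auto_encoder::encode_bytes_from_language;
// Byte string literals cannot hold non-ASCII characters, so borrow the bytes of a &str.
let html_content = "こんにちは、世界!".as_bytes();
let encoded = encode_bytes_from_language(html_content, "ja");
println!("Encoded content: {:?}", encoded);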
Binary Format Detection
Check if a given file content is a known binary format:
use auto_encoder::is_binary_file;
let file_content: &[u8] = &[0xFF, 0xD8, 0xFF]; // JPEG file signature
let is_binary = is_binary_file(file_content);
println!("Is the file binary? {}", is_binary);
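To make the idea of inspecting initial bytes concrete, here is a minimal standalone sketch of a signature ("magic number") check. The names `SIGNATURES`, `sniff_format`, and `looks_binary` are hypothetical and the table is a small illustrative subset; this is not auto_encoder's implementation or its list of formats.

```rust
/// Hypothetical table of leading-byte signatures ("magic numbers").
/// Illustrative subset only; not the list used by auto_encoder.
const SIGNATURES: &[(&str, &[u8])] = &[
    ("jpeg", &[0xFF, 0xD8, 0xFF]),
    ("png", &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]),
    ("gif", b"GIF8"),
    ("pdf", b"%PDF"),
    ("zip", &[b'P', b'K', 0x03, 0x04]),
];

/// Return the name of the first signature that prefixes `content`, if any.
fn sniff_format(content: &[u8]) -> Option<&'static str> {
    SIGNATURES
        .iter()
        .find(|(_, magic)| content.starts_with(magic))
        .map(|(name, _)| *name)
}

/// Hypothetical stand-in for a binary check built on the table above.
fn looks_binary(content: &[u8]) -> bool {
    sniff_format(content).is_some()
}
```

Only the first few bytes are compared, so a check like this stays cheap even for large files, but it can only recognize formats that appear in the table.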
HTML Language Detection
Detect the language attribute from an HTML document:
use auto_encoder::detect_language;
let html_content = br#"<html lang="en"><head><title>Test</title></head><body></body></html>"#;
let language = detect_language(html_content).unwrap();
println!("Detected language: {:?}", language);
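For readers curious what reading the lang attribute involves, the following is a naive standalone sketch. The function `extract_html_lang` is hypothetical, it is not a full HTML parser, and it is not how auto_encoder implements `detect_language`; it also returns the value lowercased and would match attributes that merely end in `lang=`.

```rust
/// Hypothetical helper: return the `lang` value from the opening `<html>` tag.
/// Naive scan for illustration only; the result is ASCII-lowercased.
fn extract_html_lang(html: &[u8]) -> Option<String> {
    // Lossy conversion keeps the scan going even if the input is not valid UTF-8.
    let text = String::from_utf8_lossy(html);
    let lower = text.to_ascii_lowercase();

    // Locate the opening <html ...> tag and the '>' that ends it.
    let tag_start = lower.find("<html")?;
    let tag_end = tag_start + lower[tag_start..].find('>')?;
    let tag = &lower[tag_start..tag_end];

    // Find `lang=` and read the quoted value that follows it.
    let attr_pos = tag.find("lang=")?;
    let rest = &tag[attr_pos + "lang=".len()..];
    let quote = rest.chars().next().filter(|c| *c == '"' || *c == '\'')?;
    let value = &rest[1..];
    let end = value.find(quote)?;
    Some(value[..end].trim().to_string())
}
```

Returning `Option<String>` lets callers handle documents without a lang attribute instead of panicking.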
API Documentation
Functions
encoding_for_locale
Get the encoding for a given locale if found.
pub fn encoding_for_locale(locale: &str) -> Option<&'static encoding_rs::Encoding>;
is_binary_file
Check if the file is a known binary format using its initial bytes.
pub fn is_binary_file(content: &[u8]) -> bool;
detect_language
Detect the language of an HTML resource based on its content.
pub fn detect_language(html_content: &[u8]) -> Option<String>;
encode_bytes
Get the content with proper encoding. Pass in a proper encoding label such as SHIFT_JIS.
pub fn encode_bytes(html: &[u8], label: &str) -> String;
encode_bytes_from_language
Get the content with proper encoding based on a language code (e.g., ja for Japanese).
pub fn encode_bytes_from_language(html: &[u8], language: &str) -> String;
Supported Locales and Encodings
The library supports a wide range of locales and their corresponding encodings, such as WINDOWS_1252 for Western European languages, SHIFT_JIS for Japanese, and GB18030 for Simplified Chinese.
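As a rough illustration of a locale-to-encoding lookup, the sketch below maps a few locale strings to the labels named above. The function `label_for_locale` is hypothetical, and the choice of `en`, `fr`, `de`, and `es` as Western European examples is an assumption; the table inside auto_encoder is larger and may differ.

```rust
/// Hypothetical lookup from a locale or language code to an encoding label.
/// Covers only the pairs named in the text; not auto_encoder's real table.
fn label_for_locale(locale: &str) -> Option<&'static str> {
    // Normalize "ja_JP" or "JA-jp" to "ja-jp" before matching.
    let normalized = locale.trim().to_ascii_lowercase().replace('_', "-");
    // Primary language subtag, e.g. "ja" from "ja-jp".
    let language = normalized.split('-').next().unwrap_or("");

    match (normalized.as_str(), language) {
        ("zh-cn", _) => Some("GB18030"),
        (_, "ja") => Some("SHIFT_JIS"),
        // Assumed sample of Western European languages.
        (_, "en" | "fr" | "de" | "es") => Some("WINDOWS_1252"),
        _ => None,
    }
}
```

Matching on the full locale first and the language subtag second lets region-specific entries such as `zh-cn` take priority over language-wide defaults.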
Contributing
Contributions are welcome! Please feel free to open an issue or submit a pull request on GitHub.
License
This project is licensed under the MIT License. See the LICENSE file for details.