# Unity Asset Parser
A Rust implementation of Unity asset parsing, inspired by and learning from [UnityPy](https://github.com/K0lb3/UnityPy). This project focuses on parsing Unity YAML and binary formats with Rust's memory safety and performance characteristics.
## Project Status
**Early Development**: This is a learning project and reference implementation. It is **not production-ready** and has significant limitations compared to mature tools like UnityPy.
### What This Project Is
- **Learning Exercise**: Understanding Unity's file formats through Rust implementation
- **Parser Focus**: Emphasis on parsing and data extraction rather than manipulation
- **Rust Exploration**: Exploring how Rust's type system can help with binary parsing
- **Reference Implementation**: Code that others can learn from and build upon
### What This Project Is NOT
- **UnityPy Replacement**: UnityPy remains the most mature Python solution
- **Asset Editor**: This is a read-only parser, not an asset creation/editing tool
## Architecture
The project uses a workspace structure to organize different parsing capabilities:
```text
unity-asset/
├── unity-asset-core/ # Core data structures and traits
├── unity-asset-yaml/ # YAML file parsing (stable)
├── unity-asset-binary/ # Binary asset parsing (advanced, WIP)
├── unity-asset-decode/ # Optional decode/export helpers (Texture/Audio/Sprite/Mesh)
├── unity-asset-lib/ # Main library crate (published as `unity-asset`)
├── unity-asset-cli/ # CLI tools
│   ├── main.rs         # Synchronous CLI tool
│   └── main_async.rs   # Asynchronous CLI tool (--features async)
└── tests/ # Integration tests and sample files
```
### Crates
- `unity-asset` (library): main user-facing API. Start here for `Environment` (YAML + binary) and `YamlDocument`.
- `unity-asset-binary` (parser): low-level binary parsers (AssetBundle/SerializedFile/WebFile) plus fast object helpers (`ObjectHandle`).
- `unity-asset-decode` (decode/export): optional decode/export helpers behind feature flags (Texture/Audio/Sprite/Mesh).
- `unity-asset-yaml` (YAML): YAML-specific parsing/serialization; also re-exported via `unity-asset`.
- `unity-asset-core` (core): shared data structures, errors, and dynamic `UnityValue` types.
- `unity-asset-cli` (CLI): command-line tools (not required for library integration).
Examples are maintained per-crate and are built in CI. For instance:
`cargo run -p unity-asset --example env_load_and_list -- tests/samples`
`cargo run -p unity-asset-binary --example sniff_kind -- tests/samples/char_118_yuki.ab`
See [docs/EXAMPLES.md](docs/EXAMPLES.md) for a curated list of runnable recipes.
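As a rough illustration of what "sniffing" a file kind can involve, the sketch below classifies a buffer by its leading bytes. It is not the code behind the `sniff_kind` example: `FileKind` and `sniff_kind_sketch` are hypothetical names, and the only facts assumed are that UnityFS bundles begin with a null-terminated `UnityFS` signature, that legacy bundles use `UnityWeb`/`UnityRaw`, and that Unity text assets begin with a `%YAML` directive.

```rust
/// Hypothetical classification of a Unity-related file by its leading bytes.
#[derive(Debug, PartialEq, Eq)]
enum FileKind {
    /// UnityFS AssetBundle (null-terminated "UnityFS" signature).
    AssetBundle,
    /// Legacy bundle signatures ("UnityWeb" / "UnityRaw").
    LegacyBundle,
    /// Unity text asset starting with a "%YAML" directive.
    Yaml,
    /// Anything else, e.g. a SerializedFile, which has no magic string.
    Unknown,
}

/// Inspect only the start of the buffer; no parsing or decompression happens here.
fn sniff_kind_sketch(data: &[u8]) -> FileKind {
    if data.starts_with(b"UnityFS\0") {
        FileKind::AssetBundle
    } else if data.starts_with(b"UnityWeb\0") || data.starts_with(b"UnityRaw\0") {
        FileKind::LegacyBundle
    } else if data.starts_with(b"%YAML") {
        FileKind::Yaml
    } else {
        FileKind::Unknown
    }
}
```

A real detector would likely go further for the `Unknown` case, for example by validating a SerializedFile header, which this sketch does not attempt.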
### Current Capabilities
#### YAML Processing (Complete)
- Unity YAML format parsing for common file types (.asset, .prefab, .unity)
- Multi-document parsing support
- Reference resolution and cross-document linking
- Filtering and querying capabilities
- Serialization back to YAML format
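To make the multi-document and reference-resolution points more concrete: each document in a Unity YAML file starts with a header such as `--- !u!1 &100000`, where the number after `!u!` is the class ID and the number after `&` is the anchor that `{fileID: ...}` references point to. The sketch below parses those headers and builds an anchor index. It is a simplified illustration, not the crate's parser; `DocHeader`, `parse_doc_header`, and `index_anchors` are hypothetical names.

```rust
use std::collections::HashMap;

/// Parsed form of a Unity YAML document header such as `--- !u!1 &100000`.
#[derive(Debug, PartialEq, Eq)]
struct DocHeader {
    class_id: u32,
    file_id: i64,
    /// True when the header carries the trailing `stripped` marker.
    stripped: bool,
}

/// Parse one header line; returns `None` for any line that is not a header.
fn parse_doc_header(line: &str) -> Option<DocHeader> {
    let rest = line.trim_end().strip_prefix("--- !u!")?;
    let mut parts = rest.split_whitespace();
    let class_id = parts.next()?.parse::<u32>().ok()?;
    let file_id = parts.next()?.strip_prefix('&')?.parse::<i64>().ok()?;
    let stripped = match parts.next() {
        None => false,
        Some("stripped") => true,
        Some(_) => return None,
    };
    Some(DocHeader { class_id, file_id, stripped })
}

/// Map each anchor (file_id) to its class ID so references can be looked up.
fn index_anchors(text: &str) -> HashMap<i64, u32> {
    text.lines()
        .filter_map(parse_doc_header)
        .map(|h| (h.file_id, h.class_id))
        .collect()
}
```

With such an index, a reference like `m_GameObject: {fileID: 10}` can be checked against the documents in the same file before any cross-file lookup is attempted.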
#### Binary Asset Processing (Advanced, WIP)
- AssetBundle structure parsing (UnityFS format)
- SerializedFile parsing with full object extraction
- TypeTree structure parsing and dynamic object reading
- Compression support (LZ4, LZMA, Brotli)
- Metadata extraction and analysis (experimental; includes dependency graph, best-effort hierarchy/component mapping, and external reference resolution via `externals`)
- Performance monitoring and basic statistics
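For the compression bullet above, the following sketch shows how a UnityFS flags field is commonly interpreted by community tools: the low six bits select the compression method. The names `Compression` and `compression_from_flags` are hypothetical and are not this crate's API. Brotli does not appear in this scheme; it is assumed here to apply to other containers such as WebFile.

```rust
/// Compression method encoded in the low bits of a UnityFS flags field.
#[derive(Debug, PartialEq, Eq)]
enum Compression {
    Uncompressed,
    Lzma,
    Lz4,
    Lz4Hc,
    /// Any value this sketch does not recognise.
    Other(u32),
}

/// The low six bits of the flags hold the compression type.
const COMPRESSION_MASK: u32 = 0x3F;

/// Mask off the unrelated flag bits and map the remainder to a method.
fn compression_from_flags(flags: u32) -> Compression {
    match flags & COMPRESSION_MASK {
        0 => Compression::Uncompressed,
        1 => Compression::Lzma,
        2 => Compression::Lz4,
        3 => Compression::Lz4Hc,
        other => Compression::Other(other),
    }
}
```

Returning `Other` instead of failing keeps the decision about unsupported methods with the caller, which fits a best-effort parser.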
#### Object Processing (Partial)
- **AudioClip**: Full format support (Vorbis, MP3, WAV, AAC) via `unity-asset-decode` (Symphonia-based decoder)
- **Texture2D**: Complete parsing + best-effort decoding + PNG export via `unity-asset-decode`
- **Sprite**: Full metadata extraction + atlas support + image cutting via `unity-asset-decode`
- **Mesh**: Structure parsing + vertex data extraction + basic export via `unity-asset-decode`
- **GameObject/Transform**: Basic TypeTree-based hierarchy & component mapping (best-effort; still WIP)
#### CLI Tools (Usable, WIP)
- Synchronous CLI for file inspection and batch processing
- Asynchronous CLI with concurrent processing and progress tracking
- Export capabilities (PNG, OGG, WAV, basic mesh formats)
- Comprehensive metadata analysis and reporting
- Basic progress reporting
- `list-objects`: list objects from SerializedFiles/bundles (path_id/class_id/type/name) using TypeTree fast paths
- `export-serialized`: export objects from `.asset/.assets` by scanning objects directly (raw `.bin`, optional best-effort decode)
#### Known Limitations
- Some advanced Unity asset types not yet implemented (MonoBehaviour scripts, complex shaders)
- Object manipulation is read-only (no writing back to Unity formats)
- LZMA decompression can fail on some edge cases, for example corrupted data
- Advanced texture formats (DXT, ETC, ASTC) require the `texture-advanced` feature of `unity-asset-decode`
- Audio decoding requires the `audio` feature of `unity-asset-decode` (Symphonia integration)
- Large file performance could be optimized further
- Error messages could be more user-friendly
- Some metadata/dependency/hierarchy analyses are currently simplified placeholders
- Object data is lazily accessed by default; use `SerializedFileParser::from_bytes_with_options(data, true)` if you explicitly need per-object preloaded buffers
- For large objects without TypeTree, raw bytes are not expanded into `_raw_data` for performance; use `UnityObject::raw_data()`
- TypeTree byte-like fields are represented as `UnityValue::Bytes` (not `Array<Integer>`) for performance: `TypelessData` and `vector<UInt8/SInt8/char>`. Use `UnityValue::as_bytes()` to access them.
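The last point is easier to see with a small model. The sketch below uses a hypothetical `Value` enum as a stand-in for a dynamic value type (it is not the crate's `UnityValue`) to contrast a compact byte buffer with an array of per-element integer values.

```rust
/// Hypothetical stand-in for a dynamic value type; not the crate's `UnityValue`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Array(Vec<Value>),
    Bytes(Vec<u8>),
}

impl Value {
    /// Borrow the payload without copying when the value is byte-like.
    fn as_bytes(&self) -> Option<&[u8]> {
        match self {
            Value::Bytes(b) => Some(b.as_slice()),
            _ => None,
        }
    }
}

/// Compact representation: one contiguous buffer, one byte per element.
fn read_byte_field(raw: &[u8]) -> Value {
    Value::Bytes(raw.to_vec())
}

/// Slower alternative: every byte becomes a full enum value.
fn read_as_integer_array(raw: &[u8]) -> Value {
    Value::Array(raw.iter().map(|&b| Value::Integer(i64::from(b))).collect())
}
```

In this model each `Value::Integer` occupies a whole enum slot instead of a single byte, which is the overhead the `Bytes` representation avoids for texture or audio payloads.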
## Quick Start
### Requirements
- Rust stable, 1.85 or newer (the workspace uses `edition = "2024"`)
- `cargo-nextest` (recommended for running tests)
### Installation
**Note**: crates.io publishing is planned, but not guaranteed yet. For now, to try it out:
```bash
# Clone and build from source
git clone https://github.com/Latias94/unity-asset.git
cd unity-asset
# Build the library
cargo build --workspace   # builds every crate in the workspace, including the CLI
# Run tests (recommended)
cargo nextest run --workspace
# Try the CLI tools
cargo run --bin unity-asset -- --help
cargo run --features async --bin unity-asset-async -- --help
```
For maintainers: the tag-driven publish/release process is documented in `docs/RELEASING.md`.
Once `v0.2.0` is published to crates.io, you'll be able to install it with:
```toml
# Add to your Cargo.toml
[dependencies]
unity-asset = "0.2.0"
```
```bash
# Install CLI tools
cargo install unity-asset-cli
```
By default, `unity-asset-cli` is built without decode/export helpers to keep builds lighter. To enable `--decode` support and best-effort exporters:
```bash
cargo install unity-asset-cli --features decode
```
### Testing Status
We have basic tests for core functionality, but this is not a comprehensive test suite. Some tests pass, others reveal limitations in our implementation.
### Comparison with UnityPy
[UnityPy](https://github.com/K0lb3/UnityPy) is a mature, feature-complete Python library for Unity asset manipulation. This Rust project is:
- **Much less mature**: UnityPy has years of development and community contributions
- **More limited**: We focus on read-only parsing and best-effort export, not asset manipulation
- **Learning-oriented**: This project helps understand Unity formats through Rust
- **Experimental**: Many features are incomplete or missing
If you need a production tool for Unity asset processing, **use UnityPy instead**.
## Basic Usage Examples
### YAML File Parsing
```rust
use unity_asset::{YamlDocument, UnityDocument};
// Load a Unity YAML file
let (doc, warnings) = YamlDocument::load_yaml_with_warnings("ProjectSettings.asset", false)?;
for w in warnings {
    eprintln!("warning: {}", w);
}
// Get basic information
println!("Found {} entries", doc.entries().len());
// Try to find specific objects (may not work for all files)
if doc.get(Some("PlayerSettings"), None).is_ok() {
    println!("Found PlayerSettings");
}
```
### UnityPy-like Environment (YAML + Binary)
```rust
use unity_asset::environment::{Environment, EnvironmentObjectRef};
use unity_asset_binary::typetree::JsonTypeTreeRegistry;
use std::sync::Arc;
let mut env = Environment::new();
// Optional: provide an external TypeTree registry for stripped assets (best-effort).
// This can improve coverage when `enableTypeTree = false` in serialized files.
// let registry = JsonTypeTreeRegistry::from_path("typetree_registry.json")?;
// env.set_type_tree_registry(Some(Arc::new(registry)));
env.load("tests/samples")?;
// `path_id` is only unique within a single SerializedFile.
// Use `BinaryObjectKey` when you need a globally-unique handle you can round-trip later.
let sources = env.binary_sources();
if let Some((_kind, source_path)) = sources.first() {
    let keys = env.find_binary_object_keys_in_source(source_path, 1);
    if let Some(key) = keys.first() {
        let _parsed = env.read_binary_object_key(key)?;
    }
}
// Unity `PPtr` resolution needs a context object because `fileID` indexes into the context file's externals.
// For the common case `fileID=0`, it points to the same SerializedFile as the context.
if let Some(obj_ref) = env.find_binary_object(1) {
    let _pptr_obj = env.read_binary_pptr(&obj_ref, 0, 1)?;
}
// AssetBundles often expose a container mapping from asset paths to objects.
// This is the primary discovery mechanism in UnityPy.
// Note: This is best-effort; when TypeTree is stripped, we fall back to a raw binary parser for `m_Container`.
let container = env.find_binary_object_keys_in_bundle_container("Assets/");
for (asset_path, key) in container.into_iter().take(10) {
    let _obj = env.read_binary_object_key(&key)?;
    println!("{} -> path_id={}", asset_path, key.path_id);
}
for obj in env.objects() {
    match obj {
        EnvironmentObjectRef::Yaml(class) => {
            let _ = &class.class_name;
        }
        EnvironmentObjectRef::Binary(obj_ref) => {
            // Parse on-demand (best-effort)
            let _parsed = obj_ref.read()?;
            let _key = obj_ref.key();
        }
    }
}
```
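The comments above note that `path_id` is only unique within one SerializedFile and that `fileID` indexes into the context file's externals. The sketch below models that rule in isolation. `ObjectKey`, `FileContext`, and `resolve_pptr` are hypothetical names and not the crate's `BinaryObjectKey` API. It assumes the usual Unity convention that `fileID = 0` means the context file, `fileID = n > 0` means `externals[n - 1]`, and `path_id = 0` is a null pointer.

```rust
/// Hypothetical globally-unique object key: source file plus local path_id.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ObjectKey {
    source: String,
    path_id: i64,
}

/// Minimal view of a serialized file: its own name and ordered external references.
struct FileContext {
    name: String,
    externals: Vec<String>,
}

/// Resolve a (file_id, path_id) pair relative to the file that contains the pointer.
fn resolve_pptr(ctx: &FileContext, file_id: i32, path_id: i64) -> Option<ObjectKey> {
    // A zero path_id is treated as a null pointer.
    if path_id == 0 {
        return None;
    }
    let source = match file_id {
        // fileID 0 refers to the same file as the context object.
        0 => ctx.name.clone(),
        // Positive fileIDs are 1-based indices into the externals list.
        n if n > 0 => ctx.externals.get((n - 1) as usize)?.clone(),
        // Negative values are not valid in this sketch.
        _ => return None,
    };
    Some(ObjectKey { source, path_id })
}
```

Because the resulting key carries the source name, two objects with the same `path_id` in different files remain distinguishable, for example when used as `HashMap` keys.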
### CLI Usage
```bash
# Parse a single YAML file
cargo run --bin unity-asset -- parse-yaml -i ProjectSettings.asset
# List bundle nodes (files) for debugging/inspection
cargo run --bin unity-asset -- list-bundle -i tests/samples/char_118_yuki.ab --filter "CAB-" --verbose
# Find objects via AssetBundle `m_Container` (discovery)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --pattern "Assets/" --limit 20 --verbose
# Filter by type (useful for scripts/batch workflows)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --class-name "Texture2D" --limit 20
# Filter by object name (best-effort; requires TypeTree and a name field)
cargo run --bin unity-asset -- find-object -i tests/samples/char_118_yuki.ab --name "yuki" --limit 20 --verbose
# Dump an external TypeTree registry (best-effort fallback for stripped assets)
cargo run --bin unity-asset -- dump-typetree-registry -i tests/samples -o typetree_registry.json --version-prefix
# Use an external TypeTree registry during discovery/inspection (best-effort)
cargo run --bin unity-asset -- --typetree-registry typetree_registry.json find-object -i tests/samples --pattern "Assets/" --limit 20 --verbose
# `--typetree-registry` also accepts a UnityPy-compatible `.tpk` file and can be repeated:
cargo run --bin unity-asset -- \
  --typetree-registry typetree_registry.json \
  --typetree-registry unitypy.tpk \
  find-object -i tests/samples --pattern "Assets/" --limit 20 --verbose
# Inspect a single object (TypeTree / Null-field debugging)
# - Easiest: copy/paste the `key=bok2|...` line from `find-object --verbose` and use `--key`.
# - Or pass the location fields explicitly (use `--kind serialized` for standalone `.assets` files).
# If you suspect TypeTree mismatches, enable fail-fast parsing and print warnings:
# `--strict` (fail-fast) and `--show-warnings` (print TypeTree warnings)
# Scan PPtr references without fully parsing objects (fast dependency/graph workflows)
cargo run --bin unity-asset -- scan-pptr -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --limit 5
cargo run --bin unity-asset -- scan-pptr -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --class-id 114 --json
# Build a best-effort dependency graph via TypeTree PPtr scanning
cargo run --bin unity-asset -- deps -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --format summary
cargo run --bin unity-asset -- deps -i tests/samples/char_118_yuki.ab --kind bundle --asset-index 0 --format dot --max-edges 2000 > graph.dot
# Export objects from AssetBundles via `m_Container` (UnityPy-like workflow)
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --limit 50
# Decode known types (best-effort):
# - `AudioClip`: export embedded/streamed audio bytes (e.g. `.ogg`) or decode to `.wav` fallback
# - `Texture2D`: decode and export as `.png`
# - `Sprite`: resolve referenced `Texture2D`, crop sprite rect, and export as `.sprite.png`
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode --limit 50
#
# Parallelize export/decode work:
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode --jobs 0
#
# Filter by type (reduces work on large bundles):
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --class-name "Texture2D" --decode --jobs 0
#
# Re-runs: keep output deterministic and resumable via a manifest
# - `--continue-on-error` records failures as `status=failed` with an error string
# - `--resume` skips already-exported entries (when the previous output path still exists)
# - `--retry-failed-from` re-exports only entries that failed previously
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
  --manifest out/manifest.json --continue-on-error --jobs 0
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
  --retry-failed-from out/manifest.json --manifest out/manifest.retry.json --continue-on-error --jobs 0
cargo run --bin unity-asset -- export-bundle -i tests/samples -o out/ --pattern "Assets/" --decode \
  --resume out/manifest.retry.json --manifest out/manifest.resume.json --jobs 0
#
# Note: outputs are raw SerializedFile object bytes (`.bin`), not necessarily the original file format.
#
# Note (streamed resources): some `AudioClip`/`Texture2D` objects reference external `.resS`/`.resource`
# files that are not embedded inside the bundle. `export-bundle --decode` will try:
# 1) resource nodes inside the same bundle (UnityFS)
# 2) sibling resource files on disk (same directory / `StreamingAssets/`)
# If the resource file is missing, it falls back to exporting raw `.bin`.
# Try async processing (experimental)
cargo run --features async --bin unity-asset-async -- \
parse-yaml -i Assets/ --recursive --progress
```
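The `--resume` and `--retry-failed-from` semantics described in the comments above can be summarised as a small decision function. This is only a model of the documented behaviour, not the CLI's implementation: `Status`, `Entry`, `Mode`, and `should_export` are hypothetical names, and the real manifest schema may differ.

```rust
/// Outcome recorded for an entry in a previous run (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Exported,
    Failed,
}

/// What the previous manifest says about one asset path.
struct Entry {
    status: Status,
    /// Whether the previously written output file is still on disk.
    output_exists: bool,
}

/// Which re-run behaviour was requested.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Resume,
    RetryFailed,
}

/// Decide whether an asset path should be exported in this run.
fn should_export(previous: Option<&Entry>, mode: Mode) -> bool {
    match (mode, previous) {
        // Resume: skip only entries that succeeded and whose output still exists.
        (Mode::Resume, Some(e)) => !(e.status == Status::Exported && e.output_exists),
        (Mode::Resume, None) => true,
        // Retry: only entries recorded as failed are exported again.
        (Mode::RetryFailed, Some(e)) => e.status == Status::Failed,
        (Mode::RetryFailed, None) => false,
    }
}
```

Under this model, an entry that was exported but whose output file has since been deleted is exported again on resume, matching the "when the previous output path still exists" condition.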
## Architecture Details
This project is organized as a Rust workspace with separate crates for different concerns:
- **`unity-asset-core`**: Core data structures and traits
- **`unity-asset-yaml`**: YAML format parsing
- **`unity-asset-binary`**: Binary format parsing (AssetBundle, SerializedFile)
- **`unity-asset-decode`**: Optional decode/export helpers (Texture/Audio/Sprite/Mesh)
- **`unity-asset-lib`**: Main library crate (published as `unity-asset`)
- **`unity-asset-cli`**: Command-line tools (published as `unity-asset-cli`)
## Acknowledgments
This project is a learning exercise that draws on several excellent projects:
### **[UnityPy](https://github.com/K0lb3/UnityPy)** by [@K0lb3](https://github.com/K0lb3)
- The gold standard for Unity asset manipulation
- Our primary reference for understanding Unity formats
- Test cases and expected behavior patterns
### **[unity-rs](https://github.com/yuanyan3060/unity-rs)** by [@yuanyan3060](https://github.com/yuanyan3060)
- Pioneering Rust implementation of Unity asset parsing
- Architecture and parsing technique inspiration
- Binary format handling examples
### **[unity-yaml-parser](https://github.com/socialpoint-labs/unity-yaml-parser)** by [@socialpoint-labs](https://github.com/socialpoint-labs)
- Original inspiration for this project
- YAML format expertise and reference resolution patterns
- Clean API design principles
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.