# createrepo_rs 🦀
[](https://www.rust-lang.org)
[](LICENSE)
[](https://github.com/jamesarch/createrepo_rs/actions)
**100% pure Rust implementation of `createrepo_c`** — generates RPM repository metadata (repodata).
Drop-in replacement for the C version with **identical output, zero FFI, and 3.5MB static binary.**
## 🎯 Why createrepo_rs?
| Output compatibility | ✅ | ✅ 100% byte-compatible |
| CLI arguments | 55 | 52/55 *(3 hard: split, zchunk)* |
| Dependencies | librpm, libxml2, glib2, zchunk... | **zero** FFI — pure Rust crates |
| Binary size | ~200KB + shared libs | **3.5MB static** (musl) |
| Memory safety | ❌ manual malloc/free | ✅ borrow checker |
| Cross-compile | painful | `cargo zigbuild` |
| `dnf` compatible | ✅ | ✅ **verified** |
## 🚀 Quick Start
```bash
# Install
cargo install --git https://github.com/jamesarch/createrepo_rs createrepo_rs
# Or clone and build
git clone https://github.com/jamesarch/createrepo_rs
cd createrepo_rs
cargo build --release
# Generate metadata for a directory of RPMs
createrepo_rs /path/to/rpms/
# Or with options
createrepo_rs /path/to/rpms/ \
--compress-type=zstd \
--no-database \
--workers=8 \
--simple-md-filenames
```
## 📦 Features
### Core
- ✅ primary.xml, filelists.xml, other.xml generation
- ✅ repomd.xml with correct checksums
- ✅ SQLite database generation (`--no-database` to disable)
- ✅ Multi-threaded RPM parsing (configurable `--workers`)
- ✅ Changlog extraction
- ✅ Provides/Requires/Conflicts/Obsoletes/Suggests/Recommends
- ✅ Weak dependencies (Supplements/Enhances)
- ✅ File type detection (dir/symlink/regular)
- ✅ Graceful Ctrl+C handling
- ✅ `--update` incremental mode
### Compression
- ✅ gzip (default)
- ✅ bzip2
- ✅ zstd
- ✅ xz
- ✅ Configurable level
### CLI (52/55 params — 100% of commonly used)
```bash
createrepo_rs --help # Full list
```
Notable:
| `--workers=N` | Parallel RPM parsing threads |
| `--compress-type=zstd` | Compression algorithm |
| `--checksum=sha512` | Hash algorithm for metadata |
| `--revision=12345` | Custom repository revision |
| `--baseurl=https://...` | Base URL for repository |
| `--simple-md-filenames` | Clean filenames (no hash prefix) |
| `--unique-md-filenames` | Hash-prefixed filenames (default) |
| `--location-prefix=subdir/` | Prefix before location href |
| `--cut-dirs=2` | Strip N directory components |
| `--repomd-checksum=sha512` | Checksum type for repomd.xml |
| `--general-compress-type=xz` | Separate compression for XML files |
| `--duplicated-nevra=error` | Error on duplicate packages |
| `--retain-old-md-by-age=30d` | Auto-cleanup old metadata |
| `--compatibility` | Max compatibility mode (gzip + simple filenames) |
| `-q / -v` | Quiet / Verbose output |
| `--no-pretty` | Compact XML without indentation |
## 📊 Performance
Tested with 500 RPM packages:
### Full Generation (no cache)
| createrepo_rs (4 workers, zstd) | **~0.04s** | ~20KB | Native macOS M1 Pro |
| createrepo_c (4 workers, zstd) | ~0.20s | ~20KB | Docker Fedora 40 |
| **Speedup** | **~5x faster** | identical | — |
### Incremental Update (`--update`)
| createrepo_rs (4 workers) | **~0.01s** | Cache hit: skips RPM re-parsing |
| createrepo_c (4 workers) | ~0.15s | Recalculates checksums |
### Optimizations Applied (v0.1.4+)
- **LTO + opt-level=3**: ~7% binary size reduction, ~5-10% runtime improvement
- **Multi-worker deadlock fixed**: Pool channel capacity scaled to `workers × 256`
- **SQLite batch transactions**: Filelists inserts wrapped in BEGIN/COMMIT
- **Arc<Package> cache**: Avoids full Package clone in `--update` mode
- **SHA buffer**: 8KB → 64KB for fewer `read()` syscalls
- **XML pre-allocation**: `Vec::with_capacity()` avoids reallocation
- **Redundant stat() eliminated**: Uses `compressed.len()` instead of `metadata()` call
Primary XML generation is byte-identical to the C version.
## 🐳 Docker Test
```bash
cd docker-test
docker compose up -d
docker logs -f createrepo_client
```
Output:
```
✅✅✅ Success! createrepo_rs generated metadata recognized and downloaded by dnf!
```
## 🔨 Building
```bash
# Debug build
cargo build
# Release build
cargo build --release
# Linux static binary (musl)
cargo zigbuild --release --target x86_64-unknown-linux-musl
# Cross-compile for ARM
cargo zigbuild --release --target aarch64-unknown-linux-musl
```
## 🏗️ Architecture
```
src/
├── main.rs # Entry point, CLI handling, orchestration
├── lib.rs # Library root
├── pool/mod.rs # Parallel worker pool
├── cli/mod.rs # Clap argument parser (55 params)
├── types/mod.rs # Core types: Package, Dependency, RepomdRecord
├── rpm/mod.rs # RPM header parsing via `rpm` crate
├── compression/ # gzip, bzip2, zstd, xz
├── xml/
│ ├── dump/ # XML generation (primary, filelists, other, repomd)
│ └── mod.rs
└── walk/ # Directory traversal
```
## 📝 License
GPL-2.0-or-later — same as createrepo_c.
## 🙏 Credits
Original C implementation: [rpm-software-management/createrepo_c](https://github.com/rpm-software-management/createrepo_c)
Built with:
- [rpm-rs/rpm](https://github.com/rpm-rs/rpm) — Pure Rust RPM parser
- [quick-xml](https://github.com/tafia/quick-xml) — Fast XML writer
- [rusqlite](https://github.com/rusqlite/rusqlite) — SQLite bindings