casq
A production-ready content-addressed file store CLI with compression and chunking (v0.4.0).
Overview
casq is a command-line tool for managing content-addressed storage. It stores files and directories by their cryptographic hash, providing automatic deduplication, transparent compression, content-defined chunking, garbage collection, and named references.
This is the CLI binary that uses the casq_core library.
Installation
# Build from source
# The binary will be at target/release/casq
# Install it by
# Optionally, copy to your PATH
Quick Start
# Initialize a new store
# Add files or directories
# Add content from stdin (pipe data directly)
# Add with a named reference
# Discover what content you have
# List tree contents
# Output blob content
# Show object metadata
# Materialize (restore) to filesystem
# Garbage collect unreferenced objects
Commands
casq init
Initialize a new content-addressed store.
Creates the store directory structure at the configured root (default: ./casq-store).
casq add <PATH>... or casq add -
Add files, directories, or stdin content to the store.
<PATH>...
Examples:
# Add a single file
# Add multiple files
# Add with a reference
# Add from stdin
The command outputs the hash of each added object. Directories are added recursively and stored as tree objects. Stdin content is stored as a blob.
Important notes:
- When using stdin (-), you cannot mix it with filesystem paths
- Stdin can only be specified once per invocation
- Output format for stdin: <hash> (stdin)
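The stdin rules above can be sketched as a small argument check. This is an illustrative shell function only: the name check_add_args and its error messages are hypothetical and are not taken from casq itself.

```shell
# Hypothetical helper: mirrors the documented argument rules for `casq add`.
# Returns 0 if the argument list is acceptable, 1 otherwise.
check_add_args() {
    stdin_count=0
    path_count=0
    for arg in "$@"; do
        if [ "$arg" = "-" ]; then
            stdin_count=$((stdin_count + 1))
        else
            path_count=$((path_count + 1))
        fi
    done
    # Rule: stdin can only be specified once per invocation
    if [ "$stdin_count" -gt 1 ]; then
        echo "error: stdin (-) given more than once" >&2
        return 1
    fi
    # Rule: stdin cannot be mixed with filesystem paths
    if [ "$stdin_count" -eq 1 ] && [ "$path_count" -gt 0 ]; then
        echo "error: cannot mix stdin (-) with paths" >&2
        return 1
    fi
    return 0
}

check_add_args file1.txt dir/ && echo "paths only: accepted"
check_add_args - && echo "stdin only: accepted"
check_add_args - file1.txt 2>/dev/null || echo "mixed: rejected"
```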
casq materialize <HASH> <DEST>
Materialize (restore) an object from the store to the filesystem.
<HASH> Hash
<DEST> Destination
Examples:
# Restore a directory
# Restore a file
casq cat <HASH>
Output blob content to stdout.
<HASH> Hash
Examples:
# View a text file
# Pipe to another command
# Save to a file
casq ls [HASH]
List references (if no hash), tree contents, or blob info.
Examples:
# List all references (discover what content you have)
# List all references with type info
# List directory contents
# Show detailed listing with modes and hashes
# Output format for refs:
# my-backup -> abc123...
# Output format for trees (--long):
# b 100644 <hash> filename.txt
# t 040755 <hash> subdir
Type codes: b = blob (file), t = tree (directory)
Tip: Use casq ls to discover content, then casq ls <hash> to explore it.
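The --long tree listing format shown above is line-oriented, so it can be filtered with standard tools. The following is a minimal sketch, assuming the four fields (type, mode, hash, name) are separated by whitespace and that names contain no spaces. The helper name blob_names and the sample hashes are placeholders.

```shell
# Hypothetical helper: print the names of blob entries from a --long tree
# listing read on stdin (format: "<type> <mode> <hash> <name>").
blob_names() {
    awk '$1 == "b" { print $4 }'
}

# Sample input in the documented format (hashes are placeholders).
printf '%s\n' \
    'b 100644 aaaa filename.txt' \
    't 040755 bbbb subdir' | blob_names
```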
casq stat <HASH>
Show detailed metadata about an object.
<HASH> Hash
Example output:
Hash: abc123...
Type: tree
Entries: 5
Size: 320 bytes (on disk)
Path: ./casq-store/objects/blake3-256/ab/c123...
casq gc
Garbage collect unreferenced objects.
Examples:
# Preview what would be deleted
# Actually delete unreferenced objects
Walks from all named references and deletes objects that are no longer reachable.
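The reachability walk can be illustrated with a toy model. This is a sketch of a generic mark-and-sweep pass, not casq's actual implementation: the input format (root, edge, and obj lines) and the function name unreachable are invented for the example.

```shell
# Toy mark-and-sweep: reads lines of the form
#   root <id>            an object pointed to by a named reference
#   edge <parent> <child> a tree entry pointing at another object
#   obj  <id>            an object present in the store
# and prints the objects that no reference can reach.
unreachable() {
    awk '
        BEGIN { n = 0; m = 0 }
        $1 == "root" { reach[$2] = 1 }
        $1 == "edge" { parent[n] = $2; child[n] = $3; n++ }
        $1 == "obj"  { objs[m] = $2; m++ }
        END {
            # Mark: repeat until no new object becomes reachable
            do {
                changed = 0
                for (i = 0; i < n; i++) {
                    if ((parent[i] in reach) && !(child[i] in reach)) {
                        reach[child[i]] = 1
                        changed = 1
                    }
                }
            } while (changed)
            # Sweep: report everything that was never marked
            for (j = 0; j < m; j++) {
                if (!(objs[j] in reach)) print objs[j]
            }
        }
    '
}

printf '%s\n' \
    'root tree1' \
    'edge tree1 blobA' 'edge tree1 blobB' 'edge tree2 blobC' \
    'obj tree1' 'obj blobA' 'obj blobB' 'obj tree2' 'obj blobC' | unreachable
```

In this sample, tree2 and blobC are printed because no reference leads to them.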
casq refs add <NAME> <HASH>
Add a named reference to an object.
<NAME> Reference
<HASH> Hash
Examples:
References act as GC roots - objects reachable from references won't be deleted by gc.
casq refs list
List all references.
Example output:
backup-2024 -> abc123...
important -> def456...
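Because each line has the form name -> hash, a script can look up the hash for a given reference. The sketch below assumes exactly that three-field layout. The helper name ref_hash is hypothetical and the hashes are the shortened placeholders from the example output.

```shell
# Hypothetical helper: given reference listing lines on stdin
# ("<name> -> <hash>"), print the hash for the reference named in $1.
ref_hash() {
    awk -v name="$1" '$1 == name && $2 == "->" { print $3 }'
}

printf '%s\n' \
    'backup-2024 -> abc123' \
    'important -> def456' | ref_hash important
```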
casq refs rm <NAME>
Remove a reference.
<NAME> Reference
Example:
Global Options
All commands support these global options:
Store Root Priority
The store root is determined in this order:
1. --root CLI argument
2. CASTOR_ROOT environment variable
3. ./casq-store (default)
Examples:
# Use explicit root
# Use environment variable
# Use default (./casq-store)
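The priority order can be expressed as a short fallback chain. This is a sketch of the documented order only. The function name resolve_root is hypothetical, and treating an empty CASTOR_ROOT the same as an unset one is an assumption.

```shell
# Hypothetical helper: resolve the store root in the documented order.
# $1 is the value given via --root (empty string if the flag was not used).
resolve_root() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"                # 1. --root CLI argument
    elif [ -n "${CASTOR_ROOT:-}" ]; then
        printf '%s\n' "$CASTOR_ROOT"      # 2. CASTOR_ROOT environment variable
    else
        printf '%s\n' "./casq-store"      # 3. default
    fi
}

resolve_root "/tmp/explicit-store"
(CASTOR_ROOT=/tmp/env-store; resolve_root "")
(unset CASTOR_ROOT; resolve_root "")
```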
Typical Workflows
Backup Workflow
# Initialize store
# Create initial backup
# Add more data later
# List all backups
# Restore a backup
# Clean up old backups
Deduplication Example
# Add the same file twice
# Output: abc123...
# Output: abc123... (same hash - deduplicated!)
# Only one copy stored internally
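The reason both adds return the same hash is that the identifier is derived from the content alone. The sketch below demonstrates the idea with sha256sum as a stand-in, because a BLAKE3 tool is not part of standard coreutils. casq itself uses BLAKE3, so real casq hashes will differ from these values.

```shell
# Stand-in demonstration: identical content yields an identical digest.
# sha256sum replaces BLAKE3 here purely for availability.
content_id() {
    sha256sum "$1" | cut -d ' ' -f 1
}

tmp=$(mktemp -d)
printf 'same content\n' > "$tmp/a.txt"
cp "$tmp/a.txt" "$tmp/b.txt"

id_a=$(content_id "$tmp/a.txt")
id_b=$(content_id "$tmp/b.txt")

if [ "$id_a" = "$id_b" ]; then
    echo "identical digest: a content-addressed store needs only one copy"
fi
rm -rf "$tmp"
```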
Exploring Content
# Add a directory with a reference
# Discover what's in your store
# Output: current-work -> abc123...
# Explore the tree
HASH=
# Look at a specific file
FILE_HASH=
Store Structure
casq-store/
├── config # Store configuration
├── objects/
│ └── blake3-256/ # Algorithm-specific directory
│ ├── ab/ # Shard directory (first 2 hex chars)
│ │ └── cd...ef # Object file (remaining 62 hex chars)
│ └── ...
└── refs/ # Named references
├── backup-2024
└── important
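The sharded layout above maps a 64-character hex hash to a path by splitting off the first two characters. A minimal sketch of that mapping follows. The function name object_path is hypothetical and the example hash is a dummy value, not a real object.

```shell
# Hypothetical helper: compute the object path for a hex hash under the
# layout shown above (2-char shard directory + remaining 62 chars).
object_path() {
    root=$1
    hash=$2
    shard=$(printf '%s' "$hash" | cut -c1-2)
    rest=$(printf '%s' "$hash" | cut -c3-)
    printf '%s/objects/blake3-256/%s/%s\n' "$root" "$shard" "$rest"
}

# Dummy 64-character hash: "ab" followed by 62 zeros.
example_hash=$(printf 'ab%062d' 0)
object_path ./casq-store "$example_hash"
```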
Object Types
- Blob - Raw file content (automatically compressed if ≥ 4KB)
- Tree - Directory listing (sorted entries)
- ChunkList - Large file split into chunks (files ≥ 1MB, enables incremental backups)
Trees reference other blobs and trees, forming a hierarchical structure similar to git. Large files are split into chunks for efficient incremental backups and cross-file deduplication.
Exit Codes
- 0 - Success
- 1 - Error (with descriptive message to stderr)
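Since failure is signalled by a non-zero exit status, scripts can branch on it directly. The wrapper below is a generic sketch. The name run_step is hypothetical, and true and false stand in for real commands so the example runs without a store.

```shell
# Hypothetical wrapper: run a command and report based on its exit status.
run_step() {
    if "$@"; then
        echo "ok: $1"
    else
        status=$?
        echo "failed (exit $status): $1" >&2
        return "$status"
    fi
}

run_step true
run_step false 2>/dev/null || echo "stopping after failure"
```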
Environment Variables
- CASTOR_ROOT - Default store root directory
Error Handling
All commands provide clear error messages:
Performance Tips
- Large files - Content is streamed, not buffered in memory
- Many small files - Use directories to group them
- Deduplication - Identical content is stored only once (including chunk-level deduplication)
- Compression - Files ≥ 4KB automatically compressed with zstd (3-5x typical reduction)
- Chunking - Files ≥ 1MB split into chunks for incremental backups (change 1 byte → store ~512KB)
- GC frequency - Run gc periodically to reclaim space from unreferenced objects
Storage Efficiency (v0.4.0+)
- Compression: 3-5x reduction for text files, 2-3x for mixed data
- Chunking: Change 1 byte in 1GB file → store only ~512KB (changed chunk)
- Cross-file deduplication: Shared content across files stored only once
- Example: 10 files with identical 5MB section = 5MB stored (not 50MB)
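The figures above can be checked with simple arithmetic. This sketch assumes binary units and treats ~512KB as a fixed chunk size for the calculation. With content-defined chunking, actual chunk sizes vary, so the chunk count is only indicative.

```shell
# Cross-file deduplication: 10 files sharing an identical 5MB section.
files=10
shared_mb=5
naive_mb=$((files * shared_mb))    # stored if every copy were kept
dedup_mb=$shared_mb                # stored when the shared section is kept once
echo "without dedup: ${naive_mb}MB, with dedup: ${dedup_mb}MB"

# Chunking: a 1GB file split into chunks of roughly 512KB (assumed average).
file_kb=$((1024 * 1024))
chunk_kb=512
chunk_count=$((file_kb / chunk_kb))
echo "approximate chunks in a 1GB file: ${chunk_count}"
echo "a 1-byte change rewrites about 1 of ${chunk_count} chunks"
```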
Limitations
- No encryption - Content is stored as plaintext (encryption is planned for a future release)
- No network - Local-only storage
- No parallel operations - Single-threaded (may be added in future)
- POSIX only - Full permission preservation only on Unix-like systems
Comparison to Git
| Feature | casq | Git |
|---|---|---|
| Content addressing | ✓ | ✓ |
| Deduplication | ✓ | ✓ |
| Trees/Blobs | ✓ | ✓ |
| Hash algorithm | BLAKE3 | SHA-1/SHA-256 |
| Commits | ✗ | ✓ |
| Branches | ✗ | ✓ |
| Diffs | ✗ | ✓ |
| Network | ✗ | ✓ |
| Use case | File storage | Version control |
casq is simpler than git - it's just content-addressed storage without the version control features.
Troubleshooting
Store not found
# Solution: Initialize the store first
Object not found
# Solution: Verify the hash is correct
Path already exists
# Solution: Remove the destination first or use a different path
Development
# Run from source
# Build optimized binary
# Run tests
# Format code
# Lint
License
Apache-2.0
See Also
- casq_core - The library powering this CLI
- NOTES.md - Design and specification
- CLAUDE.md - Development guidelines