cacache is a Rust library for managing local key and content address caches. It’s really fast, really good at concurrency, and it will never give you corrupted data, even if cache files get corrupted or manipulated.
§API Layout
The cacache API is organized roughly like std::fs; most of the top-level functionality is available as free functions directly in the cacache module, with some additional functionality available through returned objects, as well as WriteOpts, which is analogous to std::fs::OpenOptions but is only able to write.
One major difference is that the default APIs are all async functions, as opposed to std::fs, where they’re all synchronous. Synchronous APIs in cacache are accessible through the _sync suffix.
§Suffixes
You may notice various suffixes associated with otherwise familiar functions:
- _sync: Most cacache APIs are asynchronous by default. Anything using the _sync suffix behaves just like its unsuffixed counterpart, except the operation is synchronous.
- _hash: Since cacache is a content-addressable cache, the _hash suffix means you’re interacting directly with content data, skipping the index and its metadata. These functions use an Integrity to look up data, instead of a string key.
§Examples
Un-suffixed APIs are all async, using async-std. They let you put data in and get it back out – asynchronously!
use async_attributes;
#[async_attributes::main]
async fn main() -> cacache::Result<()> {
    // Data goes in...
    cacache::write("./my-cache", "key", b"hello").await?;
    // ...data comes out!
    let data = cacache::read("./my-cache", "key").await?;
    assert_eq!(data, b"hello");
    Ok(())
}
§Lookup by hash
What makes cacache content addressable, though, is its ability to fetch data by its “content address”, which in our case is a “subresource integrity” hash. cacache::write conveniently returns this hash for us. Fetching data by hash is significantly faster than doing key lookups:
use async_attributes;
#[async_attributes::main]
async fn main() -> cacache::Result<()> {
    // Data goes in...
    let sri = cacache::write("./my-cache", "key", b"hello").await?;
    // ...data gets looked up by `sri` ("Subresource Integrity").
    let data = cacache::read_hash("./my-cache", &sri).await?;
    assert_eq!(data, b"hello");
    Ok(())
}
§Large file support
cacache supports streaming large files in and out of the cache, in both async and sync mode, through Writer and Reader handles reminiscent of std::fs::File:
use async_attributes;
use async_std::prelude::*;
#[async_attributes::main]
async fn main() -> cacache::Result<()> {
    let mut fd = cacache::Writer::create("./my-cache", "key").await?;
    for _ in 0..10 {
        fd.write_all(b"very large data").await.expect("Failed to write to cache");
    }
    // Data is only committed to the cache after you do `fd.commit()`!
    let sri = fd.commit().await?;
    println!("integrity: {}", &sri);
    let mut fd = cacache::Reader::open("./my-cache", "key").await?;
    let mut buf = String::new();
    fd.read_to_string(&mut buf).await.expect("Failed to read to string");
    // Make sure to call `.check()` when you're done! It makes sure that what
    // you just read is actually valid. `cacache` always verifies the data
    // you get out is what it's supposed to be. The check is very cheap!
    fd.check()?;
    Ok(())
}
§Sync API
There are also sync APIs available if you don’t want to use async/await. The synchronous APIs are generally faster for linear operations – that is, doing one thing after another, as opposed to doing many things at once. If you’re only reading and writing one thing at a time across your application, you probably want to use these instead.
If you wish to only use sync APIs and not pull in an async runtime, you can disable default features:
# Cargo.toml
[dependencies]
cacache = { version = "X.Y.Z", default-features = false, features = ["mmap"] }

fn main() -> cacache::Result<()> {
    // Propagate errors with `?`, since `main` already returns `cacache::Result`.
    cacache::write_sync("./my-cache", "key", b"my-data")?;
    let data = cacache::read_sync("./my-cache", "key")?;
    assert_eq!(data, b"my-data");
    Ok(())
}
§Linking to existing files
The link_to feature enables an additional set of APIs for adding existing files into the cache via symlinks, without having to duplicate their data. Once the cache links to them, these files can be accessed by key just like other cached data, with the same integrity checking.
The link_to methods are available in both async and sync variants, using the same suffixes as the other APIs.
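#[async_attributes::main]
async fn main() -> cacache::Result<()> {
    // Only compiled when the `link_to` feature is enabled.
    #[cfg(feature = "link_to")]
    cacache::link_to("./my-cache", "key", "/path/to/my-other-file.txt").await?;
    // The linked file is now readable by key like any other entry.
    let data = cacache::read("./my-cache", "key").await?;
    // Holds if the linked file contains exactly `my-data`.
    assert_eq!(data, b"my-data");
    Ok(())
}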
Re-exports§
pub use index::Metadata;
pub use index::RemoveOpts;
Modules§
- index: Raw access to the cache index. Use with caution!
Structs§
- Integrity: Representation of a full Subresource Integrity string.
- Reader: File handle for reading data asynchronously.
- SyncReader: File handle for reading data synchronously.
- SyncWriter: A reference to an open file writing to the cache.
- WriteOpts: Builder for options and flags for opening a new cache file to write data into.
- Writer: A reference to an open file writing to the cache.
Enums§
- Algorithm: Valid algorithms for integrity strings.
- Error: Error type returned by all API calls.
- Value: Represents any valid JSON value.
Functions§
- clear: Removes entire contents of the cache, including temporary files, the entry index, and all content data.
- clear_sync: Removes entire contents of the cache synchronously, including temporary files, the entry index, and all content data.
- copy: Copies cache data to a specified location. Returns the number of bytes copied.
- copy_hash: Copies cache data by hash to a specified location. Returns the number of bytes copied.
- copy_hash_sync: Copies a cache entry by integrity address to a specified location. Returns the number of bytes copied.
- copy_hash_unchecked: Copies cache data by hash to a specified location. Copied data will not be checked against the given hash.
- copy_hash_unchecked_sync: Copies a cache entry by integrity address to a specified location. Does not verify cache contents while copying.
- copy_sync: Copies a cache entry by key to a specified location. Returns the number of bytes copied.
- copy_unchecked: Copies cache data to a specified location. Cache data will not be checked during copy.
- copy_unchecked_sync: Copies a cache entry by key to a specified location. Does not verify cache contents while copying.
- exists: Returns true if the given hash exists in the cache.
- exists_sync: Returns true if the given hash exists in the cache.
- hard_link: Hard links a cache entry by key to a specified location.
- hard_link_hash: Hard links a cache entry by hash to a specified location.
- hard_link_hash_sync: Hard links a cache entry by integrity address to a specified location, verifying contents as hard links are created.
- hard_link_hash_unchecked_sync: Hard links a cache entry by integrity address to a specified location. The cache entry contents will not be checked, and all the usual caveats of hard links apply: the potentially-shared cache might be corrupted if the hard link is modified.
- hard_link_sync: Hard links a cache entry by key to a specified location.
- hard_link_unchecked_sync: Hard links a cache entry by key to a specified location. The cache entry contents will not be checked, and all the usual caveats of hard links apply: the potentially-shared cache might be corrupted if the hard link is modified.
- list_sync: Returns a synchronous iterator that lists all cache index entries.
- metadata: Gets the metadata entry for a certain key.
- metadata_sync: Gets metadata for a certain key.
- read: Reads the entire contents of a cache file into a bytes vector, looking the data up by key.
- read_hash: Reads the entire contents of a cache file into a bytes vector, looking the data up by its content address.
- read_hash_sync: Reads the entire contents of a cache file synchronously into a bytes vector, looking the data up by its content address.
- read_sync: Reads the entire contents of a cache file synchronously into a bytes vector, looking the data up by key.
- reflink: Creates a reflink/clonefile from a cache entry to a destination path.
- reflink_hash: Reflinks/clonefiles cache data by hash to a specified location.
- reflink_hash_sync: Reflinks/clonefiles cache data by hash to a specified location.
- reflink_hash_unchecked_sync: Reflinks/clonefiles cache data by hash to a specified location. Cache data will not be checked during linking.
- reflink_sync: Creates a reflink/clonefile from a cache entry to a destination path.
- reflink_unchecked: Reflinks/clonefiles cache data to a specified location. Cache data will not be checked during linking.
- reflink_unchecked_sync: Reflinks/clonefiles cache data to a specified location. Cache data will not be checked during linking.
- remove: Removes an individual index metadata entry. The associated content will be left in the cache.
- remove_hash: Removes an individual content entry. Any index entries pointing to this content will become invalidated.
- remove_hash_sync: Removes an individual content entry synchronously. Any index entries pointing to this content will become invalidated.
- remove_sync: Removes an individual index entry synchronously. The associated content will be left in the cache.
- write: Writes data to the cache, indexing it under key.
- write_hash: Writes data to the cache, skipping associating an index key with it.
- write_hash_sync: Writes data to the cache synchronously, skipping associating a key with it.
- write_hash_sync_with_algo: Writes data to the cache synchronously, skipping associating a key with it.
- write_hash_with_algo: Writes data to the cache, skipping associating an index key with it. Use this to customize the hashing algorithm.
- write_sync: Writes data to the cache synchronously, indexing it under key.
- write_sync_with_algo: Writes data to the cache synchronously, indexing it under key. Use this to customize the hashing algorithm.
- write_with_algo: Writes data to the cache, indexing it under key. Use this function to customize the hashing algorithm.
Type Aliases§
- Result: The result type returned by calls to this library.