Crate cacache

cacache is a Rust library for managing local key and content address caches. It’s really fast, really good at concurrency, and it will never give you corrupted data, even if cache files get corrupted or manipulated.

API Layout

The cacache API is organized roughly like std::fs: most of the top-level functionality is available as free functions directly in the cacache module, with some additional functionality available through returned objects, as well as through WriteOpts, which is analogous to std::fs::OpenOptions but can only write.

One major difference is that the default APIs are all async functions, as opposed to std::fs, where they’re all synchronous. Synchronous APIs in cacache are accessible through the _sync suffix.

Suffixes

You may notice various suffixes associated with otherwise familiar functions:

_sync - Most cacache APIs are asynchronous by default. Anything using the _sync suffix behaves just like its unsuffixed counterpart, except the operation is synchronous.
_hash - Since cacache is a content-addressable cache, the _hash suffix means you’re interacting directly with content data, skipping the index and its metadata. These functions use an Integrity to look up data, instead of a string key.
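The difference between key lookups and _hash lookups can be sketched with a toy, in-memory model that uses only the standard library. This is an illustration under stated assumptions, not cacache's implementation: ToyCache, toy_address and Address are hypothetical names, and a 64-bit std hash stands in for the Integrity value.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a content address. cacache uses an `Integrity` value;
/// a 64-bit std hash is used here only to keep the sketch dependency-free.
type Address = u64;

/// Hypothetical helper: derives the toy address from the content itself.
fn toy_address(data: &[u8]) -> Address {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical two-level cache: an index (key -> address) on top of a
/// content store (address -> bytes).
#[derive(Default)]
struct ToyCache {
    index: HashMap<String, Address>,
    content: HashMap<Address, Vec<u8>>,
}

impl ToyCache {
    /// Stores the content, then records an index entry pointing at it.
    fn write(&mut self, key: &str, data: &[u8]) -> Address {
        let addr = self.write_hash(data);
        self.index.insert(key.to_string(), addr);
        addr
    }

    /// Stores the content only; no key is associated with it.
    fn write_hash(&mut self, data: &[u8]) -> Address {
        let addr = toy_address(data);
        self.content.insert(addr, data.to_vec());
        addr
    }

    /// Key lookup: two steps, index first, then content.
    fn read(&self, key: &str) -> Option<&[u8]> {
        let addr = self.index.get(key)?;
        self.read_hash(*addr)
    }

    /// Hash lookup: one step, straight to the content.
    fn read_hash(&self, addr: Address) -> Option<&[u8]> {
        self.content.get(&addr).map(|bytes| bytes.as_slice())
    }
}
```

In this model, read("key") and read_hash(addr) return the same bytes, but only the first one touches the index.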

Examples

Un-suffixed APIs are all async, using async-std. They let you put data in and get it back out – asynchronously!

use async_attributes;

#[async_attributes::main]
async fn main() -> cacache::Result<()> {
  // Data goes in...
  cacache::write("./my-cache", "key", b"hello").await?;

  // ...data comes out!
  let data = cacache::read("./my-cache", "key").await?;
  assert_eq!(data, b"hello");

  Ok(())
}

Lookup by hash

What makes cacache content-addressable, though, is its ability to fetch data by its “content address”, which in our case is a “subresource integrity” (SRI) hash. cacache::write conveniently returns this hash for us. Fetching data by hash skips the index, so it is significantly faster than a key lookup:

use async_attributes;

#[async_attributes::main]
async fn main() -> cacache::Result<()> {
  // Data goes in...
  let sri = cacache::write("./my-cache", "key", b"hello").await?;

  // ...data gets looked up by `sri` ("Subresource Integrity").
  let data = cacache::read_hash("./my-cache", &sri).await?;
  assert_eq!(data, b"hello");

  Ok(())
}
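For readers unfamiliar with subresource integrity strings, the following std-only sketch shows their general shape: one or more whitespace-separated entries of the form algorithm-base64digest. It assumes the printed form of the returned hash follows that W3C layout; split_sri and SriHash are hypothetical names for illustration, not part of cacache.

```rust
/// Hypothetical view of one entry in an SRI string.
#[derive(Debug, PartialEq)]
struct SriHash<'a> {
    algorithm: &'a str,
    digest_b64: &'a str,
}

/// Splits an SRI string into its `algorithm-digest` entries.
/// Entries without a `-` separator are skipped rather than reported.
fn split_sri(sri: &str) -> Vec<SriHash<'_>> {
    sri.split_whitespace()
        .filter_map(|entry| {
            // Standard base64 never contains `-`, so the first `-`
            // is the boundary between algorithm and digest.
            let (algorithm, digest_b64) = entry.split_once('-')?;
            Some(SriHash { algorithm, digest_b64 })
        })
        .collect()
}
```

For example, split_sri("sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=") yields a single entry whose algorithm is sha256; the digest shown is the SHA-256 of empty input.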

Large file support

cacache supports large file reads and writes, in both async and sync mode, through an API reminiscent of std::fs::OpenOptions:

use async_attributes;
use async_std::prelude::*;

#[async_attributes::main]
async fn main() -> cacache::Result<()> {
  let mut fd = cacache::Writer::create("./my-cache", "key").await?;
  for _ in 0..10 {
    fd.write_all(b"very large data").await.expect("Failed to write to cache");
  }
  // Data is only committed to the cache after you do `fd.commit()`!
  let sri = fd.commit().await?;
  println!("integrity: {}", &sri);

  let mut fd = cacache::Reader::open("./my-cache", "key").await?;
  let mut buf = String::new();
  fd.read_to_string(&mut buf).await.expect("Failed to read to string");

  // Make sure to call `.check()` when you're done! It makes sure that what
  // you just read is actually valid. `cacache` always verifies the data
  // you get out is what it's supposed to be. The check is very cheap!
  fd.check()?;

  Ok(())
}
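One way to see why a final check can be cheap is to hash the bytes while they stream through the reader, so that the check itself is a single comparison. The sketch below illustrates that idea with the standard library only. CheckedReader, fnv1a_update and read_verified are hypothetical names, and FNV-1a stands in for a real cryptographic digest; this is not a description of cacache's internals.

```rust
use std::io::{self, Read};

// FNV-1a 64-bit parameters. FNV is NOT cryptographically secure; it is
// used here only because it is tiny and easy to update incrementally.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds `bytes` into a running FNV-1a state.
fn fnv1a_update(mut state: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        state ^= u64::from(b);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Hypothetical reader that hashes everything passing through it.
struct CheckedReader<R> {
    inner: R,
    state: u64,
    expected: u64,
}

impl<R: Read> CheckedReader<R> {
    fn new(inner: R, expected: u64) -> Self {
        CheckedReader { inner, state: FNV_OFFSET, expected }
    }

    /// Compares the digest of the bytes read so far with the expected one.
    /// No data is re-read, so the cost is a single comparison.
    fn check(&self) -> io::Result<()> {
        if self.state == self.expected {
            Ok(())
        } else {
            Err(io::Error::new(io::ErrorKind::InvalidData, "integrity check failed"))
        }
    }
}

impl<R: Read> Read for CheckedReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        // Hash only the bytes that were actually produced by this call.
        self.state = fnv1a_update(self.state, &buf[..n]);
        Ok(n)
    }
}

/// Reads `stored` to the end and fails if it does not match `expected`.
fn read_verified(stored: &[u8], expected: u64) -> io::Result<Vec<u8>> {
    let mut reader = CheckedReader::new(stored, expected);
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    reader.check()?;
    Ok(buf)
}
```

With this shape, a caller that forgets check() still gets the bytes but loses the guarantee, which is why the example above calls it explicitly once reading is done.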

Sync API

There are also sync APIs available if you don’t want to use async/await. The synchronous APIs are generally faster for linear operations – that is, doing one thing after another, as opposed to doing many things at once. If you’re only reading and writing one thing at a time across your application, you probably want to use these instead.

fn main() -> cacache::Result<()> {
  // Propagate errors with `?` instead of panicking.
  cacache::write_sync("./my-cache", "key", b"my-data")?;
  let data = cacache::read_sync("./my-cache", "key")?;
  assert_eq!(data, b"my-data");
  Ok(())
}

Structs

Metadata

Represents a cache index entry, which points to content.

Reader

File handle for reading data asynchronously.

SyncReader

File handle for reading data synchronously.

SyncWriter

A reference to an open file writing to the cache synchronously.

WriteOpts

Builder for options and flags for opening a new cache file to write data into.

Writer

A reference to an open file writing to the cache.

Enums

Algorithm

Valid algorithms for integrity strings.

Error

Error type returned by all API calls.

Value

Represents any valid JSON value.

Functions

clear

Removes entire contents of the cache, including temporary files, the entry index, and all content data.

clear_sync

Removes entire contents of the cache synchronously, including temporary files, the entry index, and all content data.

copy

Copies cache data to a specified location. Returns the number of bytes copied.

copy_hash

Copies cache data by hash to a specified location. Returns the number of bytes copied.

copy_hash_sync

Copies a cache entry by integrity address to a specified location. Returns the number of bytes copied.

copy_sync

Copies a cache entry by key to a specified location. Returns the number of bytes copied.

exists

Returns true if the given hash exists in the cache.

exists_sync

Returns true if the given hash exists in the cache. Synchronous counterpart of exists.

list_sync

Returns a synchronous iterator that lists all cache index entries.

metadata

Gets the metadata entry for a certain key.

metadata_sync

Gets metadata for a certain key.

read

Reads the entire contents of a cache file into a bytes vector, looking the data up by key.

read_hash

Reads the entire contents of a cache file into a bytes vector, looking the data up by its content address.

read_hash_sync

Reads the entire contents of a cache file synchronously into a bytes vector, looking the data up by its content address.

read_sync

Reads the entire contents of a cache file synchronously into a bytes vector, looking the data up by key.

remove

Removes an individual index metadata entry. The associated content will be left in the cache.

remove_hash

Removes an individual content entry. Any index entries pointing to this content will become invalidated.

remove_hash_sync

Removes an individual content entry synchronously. Any index entries pointing to this content will become invalidated.

remove_sync

Removes an individual index entry synchronously. The associated content will be left in the cache.

write

Writes data to the cache, indexing it under key.

write_hash

Writes data to the cache, skipping associating an index key with it.

write_hash_sync

Writes data to the cache synchronously, skipping associating a key with it.

write_sync

Writes data to the cache synchronously, indexing it under key.

Type Definitions

Result

The result type returned by calls to this library.