Struct DB 

pub struct DB {
    options: DbOptions,
    db_lock: Option<FileLock>,
    memtable_ptr: ArcSwap<Box<dyn MemTable>>,
    wal: AtomicPtr<UnsafeCell<LogWriter>>,
    table_cache: Arc<TableCache>,
    guarded_fields: Arc<Mutex<GuardedDbFields>>,
    file_name_handler: Arc<FileNameHandler>,
    is_shutting_down: Arc<AtomicBool>,
    has_immutable_memtable: Arc<AtomicBool>,
    compaction_worker: Arc<CompactionWorker>,
    background_work_finished_signal: Arc<Condvar>,
}

The primary database object that exposes the public API.

Fields§

§options: DbOptions

Options for configuring the operation of the database.

§db_lock: Option<FileLock>

A lock over the persistent (i.e. on disk) state of the database.

§memtable_ptr: ArcSwap<Box<dyn MemTable>>

An in-memory table of key-value pairs to support quick access to recently changed values.

All operations (reads and writes) go through this in-memory representation first.

§Concurrency

We use ArcSwap because we need a combination of an AtomicPtr and an Arc. Putting an Arc into an AtomicPtr doesn’t work because storing/loading through the AtomicPtr does not change the Arc’s reference counts.
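The reference-count point can be checked with the standard library alone. The sketch below is illustrative and not RainDB code; the function name is made up. It publishes an Arc's raw pointer through an AtomicPtr and reads the strong count before and after.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::Arc;

/// Returns the Arc strong count before and after publishing the Arc's raw
/// pointer through an AtomicPtr (hypothetical helper, for illustration only).
fn counts_around_atomic_store() -> (usize, usize) {
    let table = Arc::new(vec![1u8, 2, 3]);
    let before = Arc::strong_count(&table);

    // Store and load only move a raw address around. The pointer is never
    // dereferenced here, so no unsafe code is needed.
    let slot: AtomicPtr<Vec<u8>> = AtomicPtr::new(std::ptr::null_mut());
    slot.store(Arc::as_ptr(&table) as *mut Vec<u8>, Ordering::SeqCst);
    let _loaded = slot.load(Ordering::SeqCst);

    let after = Arc::strong_count(&table);
    (before, after)
}
```

Both counts are 1, so a reader that loaded the pointer would hold nothing that keeps the allocation alive once the last real Arc is dropped. That is the gap ArcSwap is described as closing.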

§wal: AtomicPtr<UnsafeCell<LogWriter>>

The writer for the current write-ahead log file.

§table_cache: Arc<TableCache>

A cache of table files.

§guarded_fields: Arc<Mutex<GuardedDbFields>>

Database fields that require a lock for accesses (reads and writes).

§file_name_handler: Arc<FileNameHandler>

Handler for file names used by the database.

§is_shutting_down: Arc<AtomicBool>

Field indicating if the database is shutting down.

§has_immutable_memtable: Arc<AtomicBool>

Field indicating if there is an immutable memtable.

A memtable is made immutable when it is undergoing the compaction process.

§compaction_worker: Arc<CompactionWorker>

The worker managing the compaction thread.

This is used to schedule compaction related tasks on a background thread.

§background_work_finished_signal: Arc<Condvar>

A condition variable used to notify parked threads that background work (e.g. compaction) has finished.
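A minimal sketch of the park-and-notify pattern using only std::sync follows. The Guarded struct and its flag are hypothetical stand-ins for the real guarded fields, and the spawned thread only pretends to do background work.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical stand-in for state protected by the database mutex.
struct Guarded {
    background_work_running: bool,
}

/// Spawns a pretend background worker, parks the caller until the worker
/// signals completion, and returns the final value of the flag.
fn wait_for_background_work() -> bool {
    let guarded = Arc::new(Mutex::new(Guarded { background_work_running: true }));
    let finished_signal = Arc::new(Condvar::new());

    let worker = {
        let guarded = Arc::clone(&guarded);
        let finished_signal = Arc::clone(&finished_signal);
        thread::spawn(move || {
            // Publish completion under the lock, then wake parked threads.
            guarded.lock().unwrap().background_work_running = false;
            finished_signal.notify_all();
        })
    };

    let mut guard = guarded.lock().unwrap();
    // Re-check the flag in a loop: wait() releases the lock while parked and
    // may wake spuriously.
    while guard.background_work_running {
        guard = finished_signal.wait(guard).unwrap();
    }
    let still_running = guard.background_work_running;
    drop(guard);
    worker.join().unwrap();
    still_running
}
```

The wait call releases the mutex while the thread is parked and reacquires it before returning.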

Implementations§

impl DB

Public methods

pub fn open(options: DbOptions) -> RainDBResult<DB>

Open a database with the specified options.

pub fn get_snapshot(&self) -> Snapshot

Get a handle to the current state of the database.

Get requests and iterators created with this snapshot will have a stable view of the database state. Callers must call DB::release_snapshot when the snapshot is no longer needed.
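One way to picture a "stable view" is a toy multi-version map in which a snapshot is just a sequence number. This assumes a LevelDB-style sequence-number scheme, which the description above does not confirm for RainDB. ToyStore and ToySnapshot are invented for this sketch.

```rust
use std::collections::BTreeMap;

/// Toy multi-version store: every write is tagged with a sequence number.
struct ToyStore {
    next_sequence: u64,
    /// (user key, sequence number) -> value
    entries: BTreeMap<(Vec<u8>, u64), Vec<u8>>,
}

/// A toy snapshot is the sequence number that was current when it was taken.
#[derive(Clone, Copy)]
struct ToySnapshot(u64);

impl ToyStore {
    fn new() -> Self {
        ToyStore { next_sequence: 1, entries: BTreeMap::new() }
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert((key.to_vec(), self.next_sequence), value.to_vec());
        self.next_sequence += 1;
    }

    fn snapshot(&self) -> ToySnapshot {
        ToySnapshot(self.next_sequence - 1)
    }

    /// Returns the newest value for `key` written at or before the snapshot.
    fn get_at(&self, snapshot: ToySnapshot, key: &[u8]) -> Option<&Vec<u8>> {
        self.entries
            .range((key.to_vec(), 0)..=(key.to_vec(), snapshot.0))
            .next_back()
            .map(|(_, value)| value)
    }
}
```

In this model, a read through an older snapshot ignores any entry with a larger sequence number, so later writes to the same key do not change what the snapshot sees.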

pub fn release_snapshot(&self, snapshot: Snapshot)

Release a previously acquired snapshot.

pub fn get(&self, read_options: ReadOptions, key: &[u8]) -> RainDBResult<Vec<u8>>

Return the value stored at the specified key if it exists. Otherwise returns RainDBError::KeyNotFound.

pub fn put(&self, write_options: WriteOptions, key: Vec<u8>, value: Vec<u8>) -> RainDBResult<()>

Set the provided key to the specified value.

pub fn delete(&self, write_options: WriteOptions, key: Vec<u8>) -> RainDBResult<()>

Delete the specified key from the database.

The operation is considered successful even if the key does not exist in the database.

pub fn apply(&self, write_options: WriteOptions, write_batch: Batch) -> RainDBResult<()>

Atomically apply a batch of changes to the database. The requesting thread is queued if there are multiple write requests.

This is the public API to the underlying DB::apply_changes method.

pub fn new_iterator(&self, read_options: ReadOptions) -> RainDBResult<DatabaseIterator>

Returns an iterator over the contents of the database.

pub fn destroy_database(options: DbOptions) -> RainDBResult<()>

Destroy the contents of the database. Be very careful using this method.

pub fn compact_range(&self, key_range: Range<Option<&[u8]>>)

Compact the underlying storage for the key range specified.

This operation will remove deleted or overwritten versions for a key and will rearrange how data is stored in order to reduce the cost of operations for accessing the data.

None on either end of the key range will signify an open end to the range e.g. None at the start of the range will signify intent to compact all keys from the start of the database’s key range.
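The open-ended convention can be sketched with a small membership check over Range<Option<&[u8]>>. The helper name is hypothetical, and treating both ends as inclusive is an assumption; the text above does not say whether the end key is inclusive.

```rust
use std::ops::Range;

/// Returns true if `key` falls inside `range`, treating `None` as an open end.
/// Assumption: both ends are inclusive.
fn key_in_range(key: &[u8], range: &Range<Option<&[u8]>>) -> bool {
    let at_or_after_start = range.start.map_or(true, |start| key >= start);
    let at_or_before_end = range.end.map_or(true, |end| key <= end);
    at_or_after_start && at_or_before_end
}
```

With this helper, `None..None` covers every key, and `None..Some(&b"k"[..])` covers everything from the start of the key space up to `k`.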

pub fn get_descriptor(&self, descriptor: DatabaseDescriptor) -> RainDBResult<String>

Get a string describing the requested descriptor.

§Legacy

This is synonymous with LevelDB’s DB::GetProperty.

impl DB

Private methods

fn wal(&self) -> &UnsafeCell<LogWriter>

Get a mutable reference to the write-ahead log.

§Safety

RainDB guarantees that there is only one thread that accesses the WAL log writer so giving out a mutable reference is fine.

fn set_wal(&self, db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>, new_wal_number: u64, wal_writer: LogWriter)

Set WAL state fields to provided values.

fn memtable(&self) -> Arc<Box<dyn MemTable>>

Get a shared reference to the memtable.

fn generate_portable_state(&self) -> PortableDatabaseState

Generate portable database state.

fn recover(&self, db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>) -> RainDBResult<(VersionChangeManifest, bool)>

Recover database state from persistent storage. This method should only be called on database initialization.

This may do a significant amount of work to recover recently logged updates (e.g. in the WAL or a manifest file).

This method returns a tuple of a VersionChangeManifest with changes recovered from disk and a boolean set to true if recovery operations have changes to be saved.

§Panics

This method panics if the caller does not have a lock on the database.

fn initialize_as_new_db(&self) -> RainDBResult<()>

Initialize fields and database structures for a new database.

§Legacy

This is synonymous with LevelDB’s DBImpl::NewDB.

fn recover_unrecorded_logs(&self, db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>) -> RainDBResult<(VersionChangeManifest, bool)>

Recover state from any WAL files that were not recorded to the manifest yet.

Newer WAL files may have been added in previous runs of the database without having been registered in the manifest file yet.

This method returns a tuple of a VersionChangeManifest with changes recovered from disk and a boolean set to true if recovery operations have changes to be saved.

fn recover_wal_records(&self, db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>, wal_number: u64, is_last_wal: bool, change_manifest: &mut VersionChangeManifest) -> RainDBResult<(bool, u64)>

Read and apply transactions recorded in the WAL to the database.

If is_last_wal is true, the method will attempt to reuse the WAL file.

This method returns a tuple of a boolean set to true if recovery operations have changes to be saved and the maximum sequence number seen during log recovery.

§Legacy

This is synonymous with LevelDB’s DBImpl::RecoverLogFile.

fn apply_changes(&self, write_options: WriteOptions, maybe_batch: Option<Batch>) -> RainDBResult<()>

Apply changes contained in the write batch. The requesting thread is queued if there are multiple write requests.

§Concurrency

All write activity should be coordinated through this method. Existing thread workers (e.g. CompactionWorker) and any future thread worker types should not apply writes to the WAL or to the memtable. We should try to lock this down somehow, but this is a design choice inherited from LevelDB.

§Group commits

Like LevelDB, RainDB may perform an extra level of batching on top of the batch already specified. If there are multiple threads making write requests, RainDB will queue the threads so that write operations are performed serially. In order to reduce request latency, RainDB will group batch requests on the queue up to a certain size limit and perform the requested writes together as if they were in the same Batch. We call this extra level of batching a group commit per the commit that added it in LevelDB.
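The grouping step can be sketched over a plain queue of batches. ToyBatch, batch_size, and build_group are hypothetical names, the byte accounting is simplified, and the limit is passed in by the caller because the actual limit is not stated here. The real method works on queued writers rather than bare batches.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a write batch: a list of (key, value) pairs.
type ToyBatch = Vec<(Vec<u8>, Vec<u8>)>;

/// Approximate payload size of a batch in bytes.
fn batch_size(batch: &ToyBatch) -> usize {
    batch.iter().map(|(key, value)| key.len() + value.len()).sum()
}

/// Drains batches from the front of the queue into one combined batch and
/// stops before the combined size would exceed `max_bytes`. The first batch
/// is always taken so the front writer makes progress.
fn build_group(queue: &mut VecDeque<ToyBatch>, max_bytes: usize) -> ToyBatch {
    let mut group = ToyBatch::new();
    let mut group_bytes = 0usize;
    let mut taken = 0usize;
    while let Some(next) = queue.front() {
        let next_bytes = batch_size(next);
        if taken > 0 && group_bytes + next_bytes > max_bytes {
            break;
        }
        group_bytes += next_bytes;
        taken += 1;
        // Queue order is preserved, so writes are still applied serially.
        group.extend(queue.pop_front().unwrap());
    }
    group
}
```

Batches left on the queue would be picked up by the next group, so no write is reordered ahead of an earlier one.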

§Legacy

This method is synonymous with DBImpl::Write in LevelDB.

fn is_first_writer(&self, mutex_guard: &mut MutexGuard<'_, GuardedDbFields>, writer: &Arc<Writer>) -> bool

Check if the provided writer is the first writer in the writer queue.

fn make_room_for_write(&self, mutex_guard: &mut MutexGuard<'_, GuardedDbFields>, force_compaction: bool) -> RainDBResult<()>

Ensures that there is room in the memtable for more writes and triggers a compaction if necessary.

  • force_compaction - This should usually be false. When true, this will force a compaction check of the memtable.

§Concurrency

The calling thread must be holding a lock to the guarded fields and the calling thread must be at the front of the writer queue. During the course of this method the lock may be released and reacquired.

fn build_group_commit_batch(&self, mutex_guard: &mut MutexGuard<'_, GuardedDbFields>) -> RainDBResult<(Batch, Arc<Writer>)>

Build a Batch to execute as part of a group commit.

This method will return an error if the writer queue is empty or if the first writer does not have a batch. The first writer must have a batch because this method is for performing actual writes and we do not want to force a compaction and impact the latency of other writers in the batch.

fn apply_batch_to_memtable(memtable: &dyn MemTable, batch: &Batch)

Apply the changes in the provided batch to the memtable.

fn build_table_from_iterator<'m>(options: &DbOptions, metadata: &mut FileMetadata, iterator: Box<dyn RainDbIterator<Key = InternalKey, Error = RainDBError> + 'm>, table_cache: &Arc<TableCache>) -> RainDBResult<()>

Build a table file from the contents of a RainDbIterator.

The generated table file will be named after the provided table number. Upon successful table file generation, relevant fields of the passed-in FileMetadata will be filled in with metadata from the generated file.

If the passed in iterator is empty, a table file will not be generated and the file size field of the metadata struct will be set to zero.

fn create_database_directories(fs: &Arc<dyn FileSystem>, file_name_handler: &FileNameHandler, db_path: &str) -> RainDBResult<()>

Create the directory structure that the database depends on.

fn max_next_level_overlapping_bytes(&self) -> u64

Return the maximum number of bytes that any single file at level 1 or higher in the current version overlaps with in the next level.
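A rough sketch of this computation over simplified file metadata is given below. ToyFile and the free function are hypothetical; the real method reads the current version's files. Key ranges are assumed to be inclusive on both ends.

```rust
/// Hypothetical, simplified file metadata: an inclusive key range and a size.
struct ToyFile {
    smallest: Vec<u8>,
    largest: Vec<u8>,
    size_bytes: u64,
}

/// Two inclusive key ranges overlap if neither ends before the other starts.
fn overlaps(a: &ToyFile, b: &ToyFile) -> bool {
    a.smallest <= b.largest && b.smallest <= a.largest
}

/// `levels[i]` holds the files at level `i`. For every file at level >= 1,
/// sum the sizes of the files it overlaps in the next level, then return the
/// largest such sum.
fn max_next_level_overlapping_bytes(levels: &[Vec<ToyFile>]) -> u64 {
    let mut max_bytes: u64 = 0;
    // The last level has no next level, so stop one short of it.
    for level in 1..levels.len().saturating_sub(1) {
        for file in &levels[level] {
            let overlapping: u64 = levels[level + 1]
                .iter()
                .filter(|candidate| overlaps(file, candidate))
                .map(|candidate| candidate.size_bytes)
                .sum();
            max_bytes = max_bytes.max(overlapping);
        }
    }
    max_bytes
}
```

For a level-1 file spanning `a` to `m` and level-2 files `a..c` (5 bytes), `k..p` (7 bytes), and `x..z` (100 bytes), the result is 12, because only the first two level-2 files overlap the level-1 file.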

impl DB

Crate-only methods

pub(crate) fn set_bad_database_state(db_state: &PortableDatabaseState, mutex_guard: &mut MutexGuard<'_, GuardedDbFields>, catastrophic_error: RainDBError)

Set the field indicating that the database is in a bad state and should not be written to.

§Legacy

This is synonymous with DBImpl::RecordBackgroundError in LevelDB.

pub(crate) fn should_schedule_compaction(db_state: &PortableDatabaseState, mutex_guard: &mut MutexGuard<'_, GuardedDbFields>) -> bool

Return true if a compaction should be scheduled.

Various conditions are checked to decide whether a compaction should be scheduled. For example, if the database is shutting down, a compaction will not be scheduled.
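The sketch below shows what such a check might look like, modelled on the conditions in LevelDB's DBImpl::MaybeScheduleCompaction. Only the shutdown condition is confirmed by the text above; the other fields of the hypothetical ToyState struct are assumptions carried over from LevelDB.

```rust
/// Hypothetical inputs to the scheduling decision. Only `is_shutting_down`
/// is confirmed by the documentation; the rest mirror LevelDB's checks.
struct ToyState {
    already_scheduled: bool,
    is_shutting_down: bool,
    has_background_error: bool,
    has_immutable_memtable: bool,
    has_manual_compaction: bool,
    version_needs_compaction: bool,
}

/// Returns true if a compaction should be scheduled. It only makes the
/// decision; scheduling is left to the caller.
fn should_schedule(state: &ToyState) -> bool {
    // Reasons never to schedule, regardless of pending work.
    if state.already_scheduled || state.is_shutting_down || state.has_background_error {
        return false;
    }
    // Otherwise schedule only if there is something to compact.
    state.has_immutable_memtable
        || state.has_manual_compaction
        || state.version_needs_compaction
}
```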

§Legacy

This is synonymous with LevelDB’s DBImpl::MaybeScheduleCompaction except that it only checks whether a compaction should be scheduled. The caller is responsible for scheduling the compaction.

pub(crate) fn convert_memtable_to_file(db_state: &PortableDatabaseState, db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>, memtable: Arc<Box<dyn MemTable>>, maybe_base_version: Option<&Arc<RwLock<Node<Version>>>>, change_manifest: &mut VersionChangeManifest) -> RainDBResult<()>

Convert the memtable to a table file.

§Legacy

This method is synonymous with LevelDB’s DBImpl::WriteLevel0Table. It was renamed to describe its actual function more precisely: converting memtables to table files. It does not always place the generated file at level 0.

pub(crate) fn set_current_file(filesystem_provider: Arc<dyn FileSystem>, file_name_handler: &FileNameHandler, manifest_file_number: u64) -> Result<()>

Set a new CURRENT file.

pub(crate) fn remove_obsolete_files(db_fields_guard: &mut MutexGuard<'_, GuardedDbFields>, filesystem_provider: Arc<dyn FileSystem>, file_name_handler: &FileNameHandler, table_cache: &TableCache)

Remove files that are no longer in use.

fn get_all_db_files(&self) -> RainDBResult<Vec<PathBuf>>

Return a flattened list of paths to all files under the database root.

fn force_memtable_compaction(&self) -> RainDBResult<()>

Force the compaction of the current memtable.

§Legacy

This method is synonymous with LevelDB’s DBImpl::TEST_CompactMemTable method.

fn force_level_compaction(&self, level: usize, key_range: &Range<Option<&[u8]>>)

Force the compaction of the specified level for the specified user key range.

§Panics

This method will panic if it is given an invalid level. The level provided cannot be the last level because there is no next level to compact to.

§Legacy

This method is synonymous with LevelDB’s DBImpl::TEST_CompactRange method.

fn summarize_compaction_stats(&self) -> String

Return a string summarizing the compaction statistics for each level.

Trait Implementations§

impl Drop for DB

fn drop(&mut self)

Executes the destructor for this type.

Auto Trait Implementations§

impl !Freeze for DB

impl !RefUnwindSafe for DB

impl Send for DB

impl Sync for DB

impl Unpin for DB

impl !UnwindSafe for DB

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V