Crate bstree_file_readonly

Source
Expand description

Library implementing a read-only binary search tree stored in a file.

Entries are tuples made of one identifier plus one value. Searches are performed on the values: they typically are values stored in a table column while identifiers are rows identifiers (e.g. simple row indices).

Node size should be such that it fits into L1 data cache. Typical values (for cache core):

  • L1 cache: 32 kB
  • L2 cache: 256 kB
  • L3 cache: 1 MB (6 MB shared between 4 cores)
  • HDD cache: 8 MB I.e: L2 = 8 * L1; L3 = 4 * L2 = 32 * L1 I.e: HDD = 256 * L1

In DBMS, the page size is <= L1 cache size

If designed for HDD, we want to avoid random access (5ms seek time):

  • we thus prefer to load large parts of the file in RAM
    • we favor a single root node (kept in cache), and an array of leaf nodes
  • each node stored on the disk must be divided into sub-nodes no larger than the L1 cache capacity

If you are looking for a index that can be modified (add and remove key/value pair), you should have a look at sled and its git repo. They cite this paper I should definitely have a look at it!!

Modules§

bstree
See the tree terminology here: https://en.wikipedia.org/wiki/Tree_(data_structure)
cliargs
Command Line Interface arguments
float
Implementation of the Ord trait on finite float.
mk
This module contains the main code able to build and store in a file a bs-tree.
rw
visitors
Visitors are the structures allowing ot perform queries on the binary-search tree.

Structs§

Entry
EntryOpt
EntryRaw
IdVal
RawEntries

Enums§

IdInMemType
IdType
ValInMemType
ValType

Traits§

FromU64
Id
Trait defining the minimum requirements to be an identifier
Process
Defines an action which has to read and/or write given identifier and value types. It is made to be used with the IdVal type. The reason behind is that IdVal will contains the giant match for all possible (IdType, ValType) tuples.And we want to write this match only once!
Val
Trait defining the minimum requirements to be a value