Provides an in-memory database which can be used for ultra fast lookups on static master data.
Many business applications need large amounts of master data to operate correctly. Most of this data is rather static: too large to be “just loaded into the app itself”, yet small enough to remain in memory on a dedicated server.
Examples are code lists and mappings along with translations, such as a list of all packaging units with mappings for different standards or file formats, or a list of customs declaration codes.
Basically, these datasets are lists of documents as known from MongoDB or other “NoSQL” databases, but IDB provides some distinct features for lookups, reverse lookups, searching and multi-language handling.
Next to these datasets / code lists (which are internally referred to as “tables”), InfoGraphDB also supports “sets” of strings. These sets can be used to efficiently represent lists like “which codes are enabled for standard X” or “which table entries can be used in scenario Y”.
Managing data
The data stored in IDB is considered “static”. Therefore it is loaded once from a source and cannot be modified afterwards. A common source for data is the Repository, which can use its Loaders to transform an input file into one or more tables in IDB. Of course, if the underlying file changes, the file is re-read and the table is replaced. IDB just doesn’t provide any direct way of manipulating the data.
The main idea is to use a common data source like a git repository or a bucket in an object store which contains the master data as YAML, JSON, XML or other formats. Using the Repository commands, this data is then loaded into the database. This has the great benefit that all systems (development, staging, production, customer instances …) are automatically updated once a file is changed.
By default, sets are loaded via the idb-yaml-sets loader, which expects a YAML hash which maps one or more keys to lists of set entries like this:
my_set: [ "A", "B", "C" ]
some.other.set: [ "foo", "bar" ]
Of course, there is also a programmatic API to create or drop tables and sets from other sources.
Commands
- IDB.LOOKUP:
IDB.LOOKUP table search_path filter_value path1 path2 path3
Performs a lookup for the given filter value in the given search path (inner fields separated by “.”) within the given table. If a result is found, the values for path1..pathN are extracted and returned. If no path is given, the number of matches is returned. If multiple documents match, only the first one is returned. Note that if a path matches an inner object (which is especially true for “.”), the result will be wrapped as JSON. Note that IDB.LOOKUP is case-sensitive by default. However, if a fulltext index is placed on the field being queried, a case-insensitive lookup can be performed if the given filter_value is already lowercase. This might be used e.g. for reverse lookups to find a code for a given text in a certain (or any) language.
- IDB.ILOOKUP:
IDB.ILOOKUP table primary_lang fallback_lang search_path filter_value path1
Behaves just like IDB.LOOKUP. However, if one of the given extraction paths points to an inner map, it is expected to be a map of translations: the value for the primary language is tried first and, if none is found, the value for the fallback language. If both languages fail to yield a value, a final fallback is attempted using xx as language code. If all these attempts fail, an empty string is output. It is therefore not possible to use ILOOKUP to return an inner map which is used for anything other than translations. Extracting single values using a proper path still works, however. See IDB.LOOKUP for details when this is case-sensitive and when it isn’t.
- IDB.QUERY:
IDB.QUERY table num_skip max_results search_path filter_value path1
Behaves just like IDB.LOOKUP, but doesn’t just return the first result. Instead, it skips over the first num_skip results and then outputs up to max_results rows. Note that this is limited to at most 1000 rows. See IDB.LOOKUP for details when this is case-sensitive and when it isn’t.
- IDB.IQUERY:
IDB.IQUERY table primary_lang fallback_lang num_skip max_results search_path filter_value path1
Provides essentially the same i18n lookups for IDB.QUERY as IDB.ILOOKUP does for IDB.LOOKUP. See IDB.LOOKUP for details when this is case-sensitive and when it isn’t.
- IDB.SEARCH:
IDB.SEARCH table num_skip max_results search_paths filter_value path1
Performs a search in all fields given as search_paths. This can either be a comma-separated list like “path1,path2,path3” or a “*” to select all fields. Note that for a given search value, this matches case-insensitively and also matches prefixes of a detected word within the document (the selected fields). Everything else behaves just like IDB.QUERY. Also note that a fulltext index has to be present for each field being queried.
- IDB.ISEARCH:
IDB.ISEARCH table primary_lang fallback_lang num_skip max_results search_paths filter_value path1
Adds i18n lookups for the generated results just like IDB.IQUERY or IDB.ILOOKUP.
- IDB.SCAN:
IDB.SCAN table num_skip max_results path1 path2 path3
Outputs all results by skipping over the first num_skip entries in the table and then outputting up to max_results rows.
- IDB.ISCAN:
IDB.ISCAN table primary_lang fallback_lang num_skip max_results path1 path2 path3
Again, behaves just like IDB.SCAN but provides i18n lookup for the given languages.
- IDB.LEN:
IDB.LEN
Reports the size of the given table.
- IDB.SHOW_TABLES:
IDB.SHOW_TABLES
Reports all tables and their usage statistics.
- IDB.SHOW_SETS:
IDB.SHOW_SETS
Reports all sets and their usage statistics.
- IDB.CONTAINS:
IDB.CONTAINS set key1 key2 key3
Reports if the given keys are contained in the given set. For each key a 1 (contained) or a 0 (not contained) will be reported.
- IDB.INDEX_OF:
IDB.INDEX_OF set key1 key2 key3
Reports the insertion index for each of the given keys using one-based indices.
- IDB.CARDINALITY:
IDB.CARDINALITY set
Reports the size of the given set.
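The semantics of the set commands can be modelled in a few lines of plain Rust. This is a minimal sketch and not the crate's implementation: SketchSet and its methods are hypothetical names, duplicates keeping the index of their first occurrence is an assumption, and returning None for a missing key in index_of is an assumption as well, since the description of IDB.INDEX_OF does not state what is reported in that case.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an IDB set (not the crate's type): it remembers
/// the one-based insertion index of every entry.
struct SketchSet {
    entries: HashMap<String, usize>,
}

impl SketchSet {
    /// Builds a set from a list of entries, e.g. one list of the YAML hash.
    fn from_entries(entries: &[&str]) -> Self {
        let mut map = HashMap::new();
        for entry in entries {
            let next_index = map.len() + 1;
            // Assumption: a duplicate keeps the index of its first occurrence.
            map.entry(entry.to_string()).or_insert(next_index);
        }
        SketchSet { entries: map }
    }

    /// Mirrors IDB.CONTAINS for a single key: 1 if contained, 0 otherwise.
    fn contains(&self, key: &str) -> i32 {
        if self.entries.contains_key(key) {
            1
        } else {
            0
        }
    }

    /// Mirrors IDB.INDEX_OF for a single key using one-based indices.
    /// Assumption: a missing key is modelled as None.
    fn index_of(&self, key: &str) -> Option<usize> {
        self.entries.get(key).copied()
    }

    /// Mirrors IDB.CARDINALITY: the number of entries in the set.
    fn cardinality(&self) -> usize {
        self.entries.len()
    }
}
```

Built from the entries "A", "B", "C" of my_set, this sketch reports 1 for contains("B"), Some(3) for index_of("C") and 3 for cardinality().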
Example
Imagine we have the following super simplified dataset representing some countries:
code: "D"
iso:
  two: "de"
  three: "deu"
name:
  de: "Deutschland"
  en: "Germany"
---
code: "A"
iso:
  two: "at"
  three: "aut"
name:
  de: "Österreich"
  en: "Austria"
Executing IDB.LOOKUP countries code D iso.two would yield “de”. We could also use IDB.ILOOKUP countries de en code D iso.two name to retrieve “de”, “Deutschland” or IDB.LOOKUP countries name.de Deutschland code for a reverse lookup yielding “D” again.
Note that even IDB.LOOKUP countries name Deutschland code would work here, as we index all translations for a field.
Note that if both languages given in an IDB.ILOOKUP (or IDB.ISEARCH, IDB.ISCAN) don’t yield a value, we check if a final fallback value for the code xx is present. This might be useful if there is a default value and only one or a few languages differ.
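The resolution order for translations (primary language, then fallback language, then xx, then an empty string) can be sketched as follows. This is an illustration under assumptions, not the crate's code: resolve_translation and the translations helper are hypothetical names, and the translation map is modelled as a plain HashMap.

```rust
use std::collections::HashMap;

/// Resolves a value from a map of translations in the order described for
/// IDB.ILOOKUP: primary language, fallback language, the code "xx" and
/// finally an empty string. (Hypothetical helper, not part of the crate.)
fn resolve_translation(
    translations: &HashMap<String, String>,
    primary_lang: &str,
    fallback_lang: &str,
) -> String {
    [primary_lang, fallback_lang, "xx"]
        .iter()
        .find_map(|lang| translations.get(*lang))
        .cloned()
        .unwrap_or_default()
}

/// Small helper to build a translation map from (language, text) pairs.
fn translations(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(lang, text)| (lang.to_string(), text.to_string()))
        .collect()
}
```

For the name map of the German entry, requesting de with fallback en picks “Deutschland”, requesting fr with fallback en picks “Germany”, and requesting fr with fallback it yields an empty string because no xx entry exists.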
Also note that the whole element being matched can be requested by using “.” as the field path, as in IDB.LOOKUP countries code D . where the trailing “.” is the path. As this matches an inner object, the whole element is returned wrapped as a JSON string.
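To make the path handling concrete, here is a minimal sketch of dotted path extraction in which “.” selects the whole element and inner objects are rendered as JSON. Everything in it is an assumption for illustration: the Value enum, extract, render and the builder helpers are hypothetical and far simpler than the crate's document model, and the field order of the JSON output (alphabetical here) may differ from what IDB produces.

```rust
use std::collections::BTreeMap;

/// Minimal document model for this sketch: strings and nested maps only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Map(BTreeMap<String, Value>),
}

/// Follows a dotted path such as "iso.two"; "." selects the whole element.
fn extract<'a>(doc: &'a Value, path: &str) -> Option<&'a Value> {
    if path == "." {
        return Some(doc);
    }
    let mut current = doc;
    for key in path.split('.') {
        match current {
            Value::Map(fields) => current = fields.get(key)?,
            Value::Text(_) => return None,
        }
    }
    Some(current)
}

/// Renders plain values as they are and inner objects as a JSON string.
fn render(value: &Value) -> String {
    match value {
        Value::Text(text) => text.clone(),
        Value::Map(_) => to_json(value),
    }
}

/// Tiny JSON writer; escaping is limited to backslashes and quotes.
fn to_json(value: &Value) -> String {
    match value {
        Value::Text(text) => {
            format!("\"{}\"", text.replace('\\', "\\\\").replace('"', "\\\""))
        }
        Value::Map(fields) => {
            let members: Vec<String> = fields
                .iter()
                .map(|(key, inner)| format!("\"{}\":{}", key, to_json(inner)))
                .collect();
            format!("{{{}}}", members.join(","))
        }
    }
}

fn text(value: &str) -> Value {
    Value::Text(value.to_string())
}

fn object(fields: Vec<(&str, Value)>) -> Value {
    Value::Map(fields.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// The German entry of the example dataset.
fn germany() -> Value {
    object(vec![
        ("code", text("D")),
        ("iso", object(vec![("two", text("de")), ("three", text("deu"))])),
        ("name", object(vec![("de", text("Deutschland")), ("en", text("Germany"))])),
    ])
}
```

With this model, extracting iso.two from the German entry renders as de, while extracting iso or “.” renders a JSON object.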
We could invoke IDB.ISEARCH countries en de 0 5 * deutsch code name to retrieve “D”, “Germany”, e.g. to provide autocomplete values for a country field.
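The matching rules of a search (case-insensitive, matching prefixes of words, then skipping num_skip hits and returning at most max_results) can be sketched with a linear scan. This is only an illustration under assumptions: IDB uses a fulltext index rather than a scan, Row, matches_prefix, search and countries are hypothetical names, splitting words at non-alphanumeric characters is an assumption, and the cap of 1000 is taken from the description of IDB.QUERY.

```rust
/// Upper bound for max_results, as stated for IDB.QUERY.
const MAX_RESULTS_LIMIT: usize = 1000;

/// Simplified table row: a code plus all texts considered by the search.
struct Row {
    code: &'static str,
    texts: Vec<&'static str>,
}

/// Returns true if any word in `text` starts with `query`, ignoring case.
/// Assumption: words are separated by non-alphanumeric characters.
fn matches_prefix(text: &str, query: &str) -> bool {
    let query = query.to_lowercase();
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .any(|word| !word.is_empty() && word.starts_with(&query))
}

/// Scans all rows, skips the first `num_skip` hits and returns the codes
/// of up to `max_results` (capped at 1000) matching rows.
fn search(rows: &[Row], query: &str, num_skip: usize, max_results: usize) -> Vec<&'static str> {
    rows.iter()
        .filter(|row| row.texts.iter().any(|text| matches_prefix(text, query)))
        .skip(num_skip)
        .take(max_results.min(MAX_RESULTS_LIMIT))
        .map(|row| row.code)
        .collect()
}

/// The two entries of the example dataset with all fields flattened.
fn countries() -> Vec<Row> {
    vec![
        Row { code: "D", texts: vec!["D", "de", "deu", "Deutschland", "Germany"] },
        Row { code: "A", texts: vec!["A", "at", "aut", "Österreich", "Austria"] },
    ]
}
```

Searching the example rows for deutsch with num_skip 0 and max_results 5 returns only the code D, because “Deutschland” starts with the lowercased query, whereas land matches nothing since it is not a prefix of any word.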
Modules
- Imports CSV files into InfoGraphDB.
- Imports JSON files into InfoGraphDB.
- Imports YAML files into InfoGraphDB.
- Imports YAML files as sets into InfoGraphDB.
- Represents a Set as used by InfoGraphDB.
- A table wraps a crate::ig::docs::Doc together with a Trie as index to support InfoGraphDB queries.
- Provides a lookup table for arbitrary data structures using a TRIE.
Structs
- Describes the public API of the database.
Enums
- Describes the administrative commands which can be submitted via Database::perform.
Functions
- Installs an actor which handles the commands as described above.