Expand description
An implementation of the A-Tree: A Dynamic Data Structure for Efficiently Indexing Arbitrary Boolean Expressions paper.
§Examples
Searching for some boolean expressions:
use a_tree::{ATree, AttributeDefinition};
use std::collections::HashMap;
// Create the A-Tree
let mut atree = ATree::new(&[
AttributeDefinition::string_list("deal_ids"),
AttributeDefinition::integer("exchange_id"),
AttributeDefinition::boolean("debug"),
AttributeDefinition::integer_list("segment_ids"),
]).unwrap();
let expression_1 = r#"deal_ids one of ["deal-1", "deal-2"]"#;
let expression_2 = r#"segment_ids one of [1, 2, 3, 4]"#;
// Insert the arbitrary boolean expressions
let expressions_by_ids = vec![
(1u64, expression_1),
(2u64, expression_2)
];
let mut mappings: HashMap<u64, &str> = HashMap::new();
for (id, expression) in &expressions_by_ids {
atree.insert(*id, expression).unwrap();
mappings.insert(*id, expression);
}
// Create an event
let mut builder = atree.make_event();
builder.with_string_list("deal_ids", &["deal-2"]).unwrap();
builder.with_integer_list("segment_ids", &[1, 2]).unwrap();
builder.with_boolean("debug", false).unwrap();
let event = builder.build().unwrap();
// Search for matching boolean expressions
let report = atree.search(event).unwrap();
report.matches().iter().for_each(|id| {
println!(r#"Found ID: {id}, Expression: "{}""#, mappings[id]);
});
§Domain Specific Language (DSL)
The A-Tree crate support a DSL to allow easy creation of arbitrary boolean expressions. The following operators are supported:
- Boolean operators:
and
(&&
),or
(||
),not
(!
) andvariable
wherevariable
is a defined attribute for the A-Tree; - Comparison:
<
,<=
,>
,>=
. They work forinteger
andfloat
; - Equality:
=
and<>
. They work forinteger
,float
andstring
; - Null:
is null
,is not null
(for variables) andis empty
(for lists); - Set:
in
andnot in
. They work for list ofinteger
or for list ofstring
; - List:
one of
,none of
andall of
. They work for list ofinteger
and list ofstring
.
As an example, the following would all be valid arbitrary boolean expressions (ABE):
(exchange_id = 1 and deals one of ["deal-1", "deal-2", "deal-3"]) and (segment_ids one of [1, 2, 3]) and (continent = 'NA' and country in ["US", "CA"])
(log_level = 'debug') and (month in [1, 2, 3] and day in [15, 16]) or (month in [4, 5, 6] and day in [10, 11])
§Optimizations
The A-Tree is a data structure that can efficiently search a large amount of arbitrary boolean expressions for ones that match a specific event. To achieve this, there are a certain amount of things that the A-Tree will do:
- Search for duplicated intermediary boolean expressions nodes (i.e. if there are two
expressions such as
(A ∧ (B ∧ C))
and(D ∨ (B ∧ C))
, the tree will find the common sub-expression(B ∧ C)
and will make both expression refer to the common node). - Convert the strings to IDs to accelerate comparison and search;
- Sort the lists and remove duplicates.
Structs§
- The A-Tree data structure as described by the paper
- The definition of an attribute that is usable by the
crate::atree::ATree
- An event that can be used by the
crate::atree::ATree
structure to match arbitrary boolean expressions - An
Event
builder - Structure that holds the search results from the
ATree::search()
function