// max_tree/lib.rs
#![deny(missing_docs)]

//! # Max Tree
//!
//! A utility maximizer library based on a maximum tree structure.
//!
//! ### Motivation
//!
//! Make it easier to compare and compose algorithms for utility programming.
//!
//! ### Example: Moon Landing
//!
//! This example is used to improve and push the limits of the library.
//! Why? Because it is fun!
//!
//! First make it work, then gradually make the model more realistic over time.
//!
//! - Source: examples/moon.rs
//! - Description: An experiment to use a greedy optimizer to land a spaceship on the Moon from Earth.
//! - Status: Simplified model works (without crashing), missing lots of features.
//! - PRs: Welcome!
//!
//!
//!
//! **If you can make AI land a spaceship on the Moon, then what else is possible?**
//!
//! ### Usage of this library
//!
//! *Notice! This code is for research purposes only and should NEVER be used in system-critical applications!*
//!
//! **Warning! Improper usage of this library can lead to unsafe AI behavior.**
//!
//! All algorithms that support general unrestricted classical utility theory are unsafe AI designs.
//! In particular, for ASI (Artificial Super Intelligence) design,
//! this library is extremely unsafe without safety verification before use.
//! For more information about safe ASI core design, see [asi_core0](https://github.com/advancedresearch/asi_core0).
//!
//! That said, have fun!
//!
//! ### Introduction
//!
//! A maximum tree stores, for every node, the maximum utility of the node itself or any of its children.
//! It is a convenient data structure for comparing or composing different search algorithms.
//!
//! Once a maximum tree is constructed, searching for the best course of action is trivial.
//! Since each node stores maximum utility, it is easy to compare children and decide what to do.
//!
//! - A leaf stores its utility as maximum utility
//! - A node is terminal if it stores higher maximum utility than all of its children
//! - A node's utility is forgotten (overridden) when a child has higher utility
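//!
//! A minimal standalone sketch of these rules (using an inline `MiniNode`
//! type for illustration here, rather than this crate's `Node`, so the
//! snippet is self-contained):
//!
//! ```rust
//! struct MiniNode { max: f64, children: Vec<MiniNode> }
//!
//! impl MiniNode {
//!     // Index of the first child whose maximum utility is at least as
//!     // high as this node's; `None` means the node is terminal.
//!     fn optimal(&self) -> Option<usize> {
//!         self.children.iter().position(|ch| ch.max >= self.max)
//!     }
//! }
//!
//! fn main() {
//!     // A leaf stores its utility as maximum utility.
//!     let leaf = MiniNode { max: 3.0, children: vec![] };
//!     // The parent's own utility (say 1.0) was overridden by the child's (3.0).
//!     let root = MiniNode { max: 3.0, children: vec![leaf] };
//!     assert_eq!(root.optimal(), Some(0)); // not terminal: follow child 0
//!
//!     // A terminal node stores higher maximum utility than its children.
//!     let end = MiniNode { max: 5.0, children: vec![MiniNode { max: 2.0, children: vec![] }] };
//!     assert_eq!(end.optimal(), None);
//! }
//! ```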
//!
//! ### How to use this library
//!
//! This library contains only a minimal set of features,
//! intended to be used as a core for more advanced custom algorithms:
//!
//! - `Ai::full` does a complete search, finding the global maximum
//! - `Ai::greedy` does a local search, finding a local maximum
//! - `Ai::sub_breadth` constructs children for every available action
//!
//! The `full` and `greedy` algorithms assume determinism and perfect information in the context.
//! Basically, this means they should only be used in simulations or controlled environments.
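//!
//! A small end-to-end sketch of `Ai::greedy` (this assumes the crate is
//! built under the name `max_tree`; the number-line context is a
//! hypothetical toy, not taken from the examples):
//!
//! ```rust
//! use max_tree::{Ai, AiAnalysis, AiSettings, Node};
//!
//! // Context: a position on the number line; the goal is to reach 3.
//! // Node data stores the step taken, so `undo` can roll it back.
//! let mut ai: Ai<i32, i32, i32> = Ai {
//!     // Utility peaks (at 0) when the position reaches the goal.
//!     utility: |_data, pos| -((3 - *pos).abs() as f64),
//!     // Two possible actions: step left or right.
//!     actions: |_data, _pos| vec![-1, 1],
//!     // Apply the step and remember it as node data.
//!     execute: |_data, a, pos| { *pos += *a; Ok(*a) },
//!     // Roll the step back.
//!     undo: |data, pos| { *pos -= *data; },
//!     settings: AiSettings::new(5, 0.0),
//!     analysis: AiAnalysis::new(),
//! };
//!
//! let mut pos = 0;
//! let mut root = Node::root(0);
//! ai.greedy(&mut root, 0, &mut pos);
//! assert_eq!(root.optimal_path(), vec![0, 0, 0]); // three +1 steps reach 3
//! ```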
//!
//! `Ai::sub_breadth` is used as a common sub-procedure by several algorithms.
//!
//! For non-determinism, the maximum utility becomes the maximum expected utility.
//! This requires constructing the maximum tree with custom algorithms.
//! For more information, see "Custom algorithms" below.
//!
//! ### Differences from reward accumulation
//!
//! A maximum tree does not accumulate rewards over actions.
//! This means that only the final reward is optimized.
//!
//! However, it is possible to simulate accumulated rewards.
//!
//! To accumulate rewards, one can use the node data to store utility.
//! Just add the reward to the rewards accumulated so far.
//! The accumulated rewards are stored as maximum utility.
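//!
//! A standalone sketch of this accumulation pattern (the types here are
//! illustrative, not part of this crate):
//!
//! ```rust
//! // Node data carries the running total; the utility function simply
//! // reads it back, so maximum utility = maximum accumulated reward.
//! struct StepData { accumulated_reward: f64 }
//!
//! fn step(parent: &StepData, reward: f64) -> StepData {
//!     StepData { accumulated_reward: parent.accumulated_reward + reward }
//! }
//!
//! fn utility(data: &StepData) -> f64 { data.accumulated_reward }
//!
//! fn main() {
//!     let root = StepData { accumulated_reward: 0.0 };
//!     let child = step(&root, 2.5);
//!     let grandchild = step(&child, -0.5);
//!     assert_eq!(utility(&grandchild), 2.0);
//! }
//! ```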
//!
//! Optimization for the final reward has special terminal semantics.
//! For more information, see "Terminal semantics" below.
//!
//! ### Discounting action depth
//!
//! By default, taking more steps to complete the goal is not penalized.
//!
//! Subtracting a tiny amount of utility proportional to depth
//! makes the algorithm prioritize fewer steps to reach the goal.
//!
//! Since this is common behavior, one can activate it by setting
//! `AiSettings::eps_depth` to e.g. `0.0000001`.
//!
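//! Worked numbers for the depth discount (the values here are arbitrary):
//!
//! ```rust
//! fn main() {
//!     let eps_depth = 0.0000001_f64;
//!     let raw_utility = 1.0_f64;
//!     // Same raw utility, reached in 1 step vs. 2 steps:
//!     let at_depth_1 = raw_utility - eps_depth * 1.0;
//!     let at_depth_2 = raw_utility - eps_depth * 2.0;
//!     // The shorter course of action wins the tie.
//!     assert!(at_depth_1 > at_depth_2);
//! }
//! ```
//!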
//! ### Custom algorithms
//!
//! When the algorithms included with this library are too limiting,
//! it is possible to write custom algorithms that construct the maximum tree
//! in other ways or perform different kinds of analysis.
//!
//! The maximum tree is designed to be convenient for composing different search algorithms.
//!
//! One can perform e.g. posterior safety analysis without side effects in the context.
//!
//! It is also possible to restore the state of the context and continue the search from any node,
//! using a different search algorithm than the one used to construct the tree.
//! The final maximum tree can be used with any analysis algorithm.
//!
//! Under non-determinism or hidden states in the context,
//! the semantics of maximum utility changes slightly.
//! This means that one cannot expect soundness when composing algorithms
//! unless the intersecting semantics is sound when exploring a sub-branch.
//! It is not necessary that the intersecting semantics holds in general,
//! but it must hold for the particular usage in the application.
//!
//! The most common use case of this library is contexts with perfect information,
//! where undoing changes restores the environment perfectly.
//! In most applications, this means simulating the entire world where the AI operates.
//!
//! ### Terminal semantics
//!
//! Under verification for safety, evaluation of the terminal semantics must be included.
//! This library is not safe to use when the terminal semantics of a given application
//! has not been verified for safety.
//!
//! When a node is terminal, which is the case for any global maximum
//! that does not have any children with equal maximum utility,
//! one must pay careful attention to the semantics of achieving that goal.
//!
//! Even when a global maximum is reached,
//! it makes no sense to do so if e.g. the world ends and there is nothing left to do.
//!
//! Reaching infinite utility in an infinitesimal amount of time does not
//! correspond to a [Zen Rational](https://github.com/advancedresearch/path_semantics/blob/master/ai-sequences.md#zen-rationality)
//! human intuition about meaningful goals.
//! The reason for this is that when the true goal among many possible goals is uncertain,
//! one risks excluding the true goal with high probability by optimizing for a single goal.
//! Instead, according to higher order utilitarianism, one should optimize for a cluster of goals
//! where each goal is reachable from any other.
//! For more information, see [Groupoid Assumption of Multi-Goal Optimization](https://github.com/advancedresearch/path_semantics/blob/master/papers-wip/groupoid-assumption-of-multi-goal-optimization.pdf).
//!
//! Although classical utility theory can be used to achieve a single goal,
//! this does not guarantee that achieving the goal is meaningful.
//! It is meaningful if and only if the true goal is reachable from the achieved goal.
//! A true goal is defined in [Naive Zen Logic](https://github.com/advancedresearch/path_semantics/blob/master/papers-wip/naive-zen-logic.pdf) as:
//!
//! ```text
//! true_goal(X) = (goal(X) ? .me) ? me
//! ```
//!
//! It means: the true goal is the goal I believe I would have if I (the AI agent) were smarter.
//!
//! If the AI agent achieves the global maximum and then self-improves,
//! then in hindsight it was meaningful to reach the global maximum if and only if the new goal is reachable.
//! Therefore, achieving the global maximum is meaningful if and only if the true goal is reachable
//! from the state of the global maximum.
//!
//! Accumulated rewards can cloud the judgement about terminal semantics.
//! With accumulated rewards, there is nothing that predicts termination
//! when the expected utility from the environment is uncertain.
//! As long as there exists some non-terminating state with positive utility,
//! there exists a course of action that might increase utility.
//! Therefore, most AI agents that optimize accumulated rewards do not need to
//! reason about the terminal semantics in the same way as AI agents that optimize for final rewards.
//! However, under self-improvement, accumulated rewards also require higher order reasoning for safety.
//!
//! Since optimizing for the final reward is a strict superset of
//! optimizing for accumulated rewards, the terminal semantics of the first case
//! is a strict superset of the terminal semantics of the latter.
//!
//! One can include a term in the final reward that estimates future potential.
//! If the term for future potential can be negative, then excluding it will lead to unsafety.
//!
//! ## License
//!
//! Licensed under either of
//!
//! * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
//! * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
//!
//! at your option.
//!
//! ### Contribution
//!
//! Unless you explicitly state otherwise, any contribution intentionally submitted
//! for inclusion in the work by you shall be dual licensed as above, without any
//! additional terms or conditions.

/// Reexports commonly used objects.
pub mod prelude {
    pub use super::{Ai, AiAnalysis, AiSettings, Node};
}

/// Stores an action node (represented as a maximum tree).
///
/// Each node stores the maximum utility of itself or any of its children.
///
/// A terminal node has higher maximum utility than any of its children.
#[derive(Debug)]
pub struct Node<T, A> {
    /// Stores maximum utility of itself or any children.
    pub max: f64,
    /// Stores node data.
    pub data: T,
    /// Stores child nodes.
    ///
    /// Each child has an associated action.
    /// The associated action identifies the node.
    /// This means the action should be unique among the children.
    /// This invariant is enforced by trusted search algorithms.
    /// Use `check_unique_actions` when the input is not trusted.
    pub children: Vec<(A, Node<T, A>)>,
}

impl<T, A> Node<T, A> {
    /// Creates a new root.
    ///
    /// This sets the utility to `NaN` (not a number).
    /// There are no children; they must be added through search.
    pub fn root(data: T) -> Node<T, A> {
        Node {
            max: std::f64::NAN,
            data,
            children: vec![]
        }
    }

    /// Returns `true` if all actions among children are unique.
    ///
    /// This algorithm does not report which actions collide,
    /// since this use case is uncommon (the invariant is enforced by search algorithms).
    pub fn check_unique_actions(&self) -> bool
        where A: Eq + std::hash::Hash
    {
        use std::collections::HashSet;

        // Store references to avoid requiring `A: Clone`.
        let mut hash_set = HashSet::new();
        for &(ref a, _) in &self.children {
            if hash_set.contains(a) {return false}
            hash_set.insert(a);
        }
        true
    }

    /// Returns `true` if the node is terminal.
    ///
    /// A terminal node has no children with equal or greater maximum utility.
    /// This means that, without paying attention to terminal semantics,
    /// an unsafe AI design can lead to a "stranded" situation.
    /// For more information, see the section "Terminal semantics"
    /// in this crate's top level documentation.
    pub fn terminal(&self) -> bool {self.optimal().is_none()}

    /// Returns the optimal course of action, if any.
    ///
    /// Returns `None` if no child has equal or greater maximum utility.
    /// This means the node is terminal.
    pub fn optimal(&self) -> Option<usize> {
        for (i, ch) in self.children.iter().enumerate() {
            if ch.1.max >= self.max {return Some(i)}
        }
        None
    }

    /// Returns the optimal path from the root.
    pub fn optimal_path(&self) -> Vec<usize> {
        let mut node = self;
        let mut res = vec![];
        loop {
            if let Some(i) = node.optimal() {
                node = &node.children[i].1;
                res.push(i);
            } else {
                break;
            }
        }
        res
    }
}

/// AI settings.
pub struct AiSettings {
    /// Maximum depth.
    pub max_depth: usize,
    /// Utility discount per action depth.
    ///
    /// This is usually a small positive number (e.g. `0.000001`).
    pub eps_depth: f64,
    /// Whether to run analysis.
    pub analysis: bool,
    /// Eliminate unexplored actions when using greedy search.
    pub greed_elim: bool,
    /// A limit on estimated memory usage,
    /// causing the search to terminate.
    ///
    /// This limit is only checked occasionally, e.g. after breadth search,
    /// so actual memory usage before termination can exceed the limit.
    pub max_mib: Option<f64>,
}

impl AiSettings {
    /// Creates new settings.
    pub fn new(max_depth: usize, eps_depth: f64) -> AiSettings {
        AiSettings {
            max_depth,
            eps_depth,
            analysis: false,
            greed_elim: true,
            max_mib: None,
        }
    }
}

/// Stores results from analysis.
pub struct AiAnalysis {
    /// Keeps track of the maximum number of nodes.
    pub node_count: usize,
}

impl AiAnalysis {
    /// Creates a new AI analysis.
    pub fn new() -> AiAnalysis {
        AiAnalysis {
            node_count: 0,
        }
    }

    /// Estimates the maximum memory usage of nodes in gibibytes (GiB).
    pub fn gib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1073741824.0
    }

    /// Estimates the maximum memory usage of nodes in mebibytes (MiB).
    pub fn mib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1048576.0
    }

    /// Estimates the maximum memory usage of nodes in kibibytes (KiB).
    pub fn kib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1024.0
    }
}

/// AI setup.
///
/// Provides a common setup for different search algorithms.
/// The search algorithm constructs a maximum tree,
/// which can be used to find the optimal course of action.
///
/// The `T` parameter is the type of node data.
/// It is used to store delta changes for undo operations.
/// It also stores data that tracks the internal state of the AI agent
/// when the AI agent is not interacting with the context (pure search).
/// Node data is stored separately from the context, making it easy
/// to analyse after constructing the maximum tree.
///
/// The `A` parameter is the type of action.
/// It describes the choices the AI agent can make in a specific context.
/// An action modifies the context and must be undone when rolling back changes.
///
/// The `C` parameter is the type of context (environment).
/// This stores the data that is necessary to calculate utility.
/// The context is modified by actions when exploring,
/// but these changes are undone when rolling back changes.
pub struct Ai<T, A, C> {
    /// Calculates utility from data and context.
    pub utility: fn(&T, &C) -> f64,
    /// Returns a list of possible actions.
    pub actions: fn(&T, &C) -> Vec<A>,
    /// Executes an action, returning new node data.
    pub execute: fn(&T, a: &A, &mut C) -> Result<T, ()>,
    /// Undoes changes made to the context.
    ///
    /// The data required to roll back delta changes
    /// must be stored in the node data.
    pub undo: fn(&T, &mut C),
    /// Stores AI settings.
    pub settings: AiSettings,
    /// Stores analysis.
    pub analysis: AiAnalysis,
}

impl<T, A, C> Ai<T, A, C> {
    /// Computes the size of nodes in bytes.
    pub fn node_size(&self) -> usize {
        std::mem::size_of::<Node<T, A>>()
    }

    /// Calculates utility with extra terms computed from settings.
    pub fn utility_with_settings(&self, data: &T, depth: usize, ctx: &C) -> f64 {
        let utility = (self.utility)(data, ctx);
        let discount_step = -self.settings.eps_depth * depth as f64;
        utility + discount_step
    }

    /// Updates the context by executing the next step along the optimal path.
    ///
    /// Returns the index of the chosen child,
    /// or `None` if the node is terminal or the action fails.
    pub fn update(&mut self, node: &Node<T, A>, ctx: &mut C) -> Option<usize> {
        if let Some(i) = node.optimal() {
            if (self.execute)(&node.data, &node.children[i].0, ctx).is_ok() {
                Some(i)
            } else {
                None
            }
        } else {
            None
        }
    }

    /// A sub-procedure constructing a maximum tree of all available actions.
    ///
    /// Used by other search algorithms.
    pub fn sub_breadth(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        root.children.clear();
        let actions = (self.actions)(&root.data, ctx);
        for a in &actions {
            if let Ok(data) = (self.execute)(&root.data, a, ctx) {
                let utility = self.utility_with_settings(&data, depth + 1, ctx);
                if utility > root.max {
                    root.max = utility;
                }

                // Undo changes made to the context to reset state.
                (self.undo)(&data, ctx);

                root.children.push((a.clone(), Node {
                    max: utility,
                    data,
                    children: vec![],
                }));

                if self.settings.analysis {
                    self.analysis.node_count += 1;
                }
            }
        }
    }

    /// Returns `true` when the estimated memory usage is exceeded, `false` otherwise.
    ///
    /// Returns `false` when analysis is deactivated.
    pub fn memory_exceeded(&self) -> bool {
        if self.settings.analysis {
            if let Some(limit) = self.settings.max_mib {
                self.analysis.mib(self.node_size()) >= limit
            } else {false}
        } else {false}
    }

    /// Only picks choices that increase utility.
    ///
    /// Finding the global maximum this way is only guaranteed
    /// when the utility function has no suboptimal local maxima.
    pub fn greedy(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        if root.max.is_nan() {
            root.max = self.utility_with_settings(&root.data, depth, ctx);
        }

        self.sub_breadth(root, depth, ctx);

        if depth >= self.settings.max_depth {return};
        if self.memory_exceeded() {return};

        if let Some(i) = root.optimal() {
            let i = if self.settings.greed_elim {
                if self.settings.analysis {
                    self.analysis.node_count -= root.children.len() - 1;
                }
                root.children.swap(i, 0);
                root.children.truncate(1);
                0
            } else {i};

            let a = &root.children[i].0;
            if let Ok(_) = (self.execute)(&root.data, a, ctx) {
                let ch = &mut root.children[i].1;
                self.greedy(ch, depth + 1, ctx);

                // Undo changes made to the context to reset state.
                (self.undo)(&ch.data, ctx);

                // Update maximum utility since the children changed.
                if ch.max > root.max {
                    root.max = ch.max;
                }
            }
        }
    }

    /// Performs a full construction of the entire maximum tree.
    pub fn full(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        if root.max.is_nan() {
            root.max = self.utility_with_settings(&root.data, depth, ctx);
        }

        self.sub_breadth(root, depth, ctx);

        if depth >= self.settings.max_depth {return};
        if self.memory_exceeded() {return};

        for (ref a, ref mut ch) in &mut root.children {
            if let Ok(_) = (self.execute)(&root.data, a, ctx) {
                self.full(ch, depth + 1, ctx);

                // Undo changes made to the context to reset state.
                (self.undo)(&ch.data, ctx);

                // Update maximum utility since the children changed.
                if ch.max > root.max {
                    root.max = ch.max;
                }
            }
        }
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
}