#![deny(missing_docs)]

//! # Max Tree
//!
//! A utility maximizer library based on a maximum tree structure.
//!
//! ### Motivation
//!
//! Make it easier to compare and compose algorithms for utility programming.
//!
//! ### Example: Moon Landing
//!
//! This example is used to improve and push the limits of the library.
//! Why? Because it is fun!
//!
//! First make it work, then gradually make the model more realistic over time.
//!
//! - Source: examples/moon.rs
//! - Description: An experiment to use a greedy optimizer to land a spaceship on the Moon from Earth.
//! - Status: Simplified model works (without crashing), missing lots of features.
//! - PRs: Welcome!
//!
//! ![The Moon](https://upload.wikimedia.org/wikipedia/commons/thumb/e/e1/FullMoon2010.jpg/800px-FullMoon2010.jpg)
//!
//! **If you can make AI land a spaceship on the Moon, then what else is possible?**
//!
//! ### Usage of this library
//!
//! *Notice! This code is for research purposes only and should NEVER be used in system-critical applications!*
//!
//! **Warning! Improper usage of this library can lead to unsafe AI behavior.**
//!
//! All algorithms that support general unrestricted classical utility theory are unsafe AI designs.
//! In particular, for ASI (Artificial Super Intelligence) design,
//! this library is extremely unsafe without safety verification before use.
//! For more information about safe ASI core design, see [asi_core0](https://github.com/advancedresearch/asi_core0).
//!
//! That said, have fun!
//!
//! ### Introduction
//!
//! A maximum tree stores, for every node, the maximum utility of the node itself or any of its children.
//! It is a convenient data structure for comparing or composing different search algorithms.
//!
//! Once a maximum tree is constructed, searching for the best course of action is trivial.
//! Since each node stores maximum utility, it is easy to compare children and decide what to do.
//!
//! - A leaf stores its utility as the maximum utility
//! - A node is terminal if it stores higher maximum utility than all of its children
//! - A node's utility is forgotten (overridden) when a child has higher utility
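//!
//! A minimal sketch of these rules (assuming this crate is compiled under the name `max_tree`):
//!
//! ```ignore
//! use max_tree::Node;
//!
//! // A hand-built tree; search algorithms normally maintain `max` for you.
//! let mut root: Node<(), &str> = Node::root(());
//! root.children = vec![
//!     ("a", Node { max: 2.0, data: (), children: vec![] }),
//!     ("b", Node { max: 0.5, data: (), children: vec![] }),
//! ];
//! // The root's own utility (say 1.0) is overridden by the best child's 2.0.
//! root.max = 2.0;
//!
//! // Child "a" matches the root's maximum utility, so the root is not
//! // terminal and "a" is the optimal action.
//! assert_eq!(root.optimal(), Some(0));
//! assert!(!root.terminal());
//! ```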
//!
//! ### How to use this library
//!
//! This library contains only a minimal set of features,
//! intended to be used as a core for more advanced custom algorithms:
//!
//! - `Ai::full` does a complete search, finding the global maximum
//! - `Ai::greedy` does a local search, finding a local maximum
//! - `Ai::sub_breadth` constructs children for every available action
//!
//! The `full` and `greedy` algorithms assume determinism and perfect information in the context.
//! Basically, this means they should only be used in simulations or controlled environments.
//!
//! `Ai::sub_breadth` is used as a common sub-procedure for several algorithms.
//!
//! For non-determinism, the maximum utility becomes maximum expected utility.
//! This requires constructing the maximum tree with custom algorithms.
//! For more information, see "Custom algorithms" below.
//!
//! ### Differences from reward accumulation
//!
//! A maximum tree does not accumulate rewards over actions.
//! This means that only the final reward is optimized.
//!
//! However, it is possible to simulate accumulated rewards.
//!
//! To accumulate rewards, store the running total in the node data:
//! add each action's reward to the rewards accumulated so far,
//! and store the accumulated total as the maximum utility.
//!
//! Optimizing for the final reward has special terminal semantics.
//! For more information, see "Terminal semantics" below.
//!
//! ### Discounting action depth
//!
//! By default, taking more steps to complete the goal is not penalized.
//!
//! Subtracting a tiny amount of utility proportional to depth
//! will make the algorithm prioritize fewer steps to reach the goal.
//!
//! Since this is commonly desired behavior, one can activate it by setting
//! `AiSettings::eps_depth` to e.g. `0.0000001`.
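//!
//! For example:
//!
//! ```ignore
//! // Prefer shorter paths by penalizing each step with a tiny utility loss.
//! let settings = AiSettings::new(10, 0.0000001);
//! ```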
//!
//! ### Custom algorithms
//!
//! When the algorithms included with this library are too limiting,
//! it is possible to write custom algorithms that construct the maximum tree
//! in other ways or perform different kinds of analysis.
//!
//! The maximum tree is designed to be convenient for composing different search algorithms.
//!
//! One can perform e.g. posterior safety analysis without side effects in the context.
//!
//! It is also possible to restore state of the context and continue search from any node,
//! using a different search algorithm than the one used to construct the tree.
//! The final maximum tree can be used with any analysis algorithm.
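//!
//! For example, one might build the tree greedily and then refine the chosen
//! branch with a full search (a sketch, reusing the `ai`, `root` and `ctx`
//! values from the example above):
//!
//! ```ignore
//! ai.greedy(&mut root, 0, &mut ctx);
//! if let Some(i) = root.optimal() {
//!     // Re-execute the chosen action so the context matches the child node.
//!     if (ai.execute)(&root.data, &root.children[i].0, &mut ctx).is_ok() {
//!         ai.full(&mut root.children[i].1, 1, &mut ctx);
//!         // Undo changes made to the context to reset state.
//!         (ai.undo)(&root.children[i].1.data, &mut ctx);
//!         // Update maximum utility since the child changed.
//!         if root.children[i].1.max > root.max {
//!             root.max = root.children[i].1.max;
//!         }
//!     }
//! }
//! ```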
//!
//! Under non-determinism or hidden states in the context,
//! the semantics of maximum utility changes slightly.
//! This means that one cannot expect soundness when composing algorithms
//! unless the intersecting semantics is sound when exploring a sub-branch.
//! It is not necessary that the intersecting semantics hold in general,
//! but it must hold for the particular usage in the application.
//!
//! The most common use case of this library is for contexts where
//! there is perfect information and undoing changes restores the environment perfectly.
//! In most applications, this means simulating the entire world where the AI operates.
//!
//! ### Terminal semantics
//!
//! Under verification for safety, evaluation of the terminal semantics must be included.
//! This library is not safe to use when the terminal semantics of a given application
//! has not been verified for safety.
//!
//! When a node is terminal, which is the case for any global maximum
//! that does not have any children with equal maximum utility,
//! one must pay careful attention to the semantics of achieving that goal.
//!
//! Even when a global maximum can be reached,
//! it makes no sense to reach it if e.g. the world ends and there is nothing left to do.
//!
//! Reaching infinite utility in an infinitesimal amount of time does not
//! correspond to a [Zen Rational](https://github.com/advancedresearch/path_semantics/blob/master/ai-sequences.md#zen-rationality)
//! human intuition about meaningful goals.
//! The reason for this is that when the true goal among many possible goals is uncertain,
//! one risks excluding the true goal with high probability by optimizing for a single goal.
//! Instead, according to higher order utilitarianism, one should optimize for a cluster of goals
//! where each goal is reachable from any other.
//! For more information, see [Groupoid Assumption of Multi-Goal Optimization](https://github.com/advancedresearch/path_semantics/blob/master/papers-wip/groupoid-assumption-of-multi-goal-optimization.pdf).
//!
//! Although classical utility theory can be used to achieve a single goal,
//! this does not guarantee that achieving the goal is meaningful.
//! This is the case if and only if the true goal is reachable from the achieved goal.
//! A true goal is defined in [Naive Zen Logic](https://github.com/advancedresearch/path_semantics/blob/master/papers-wip/naive-zen-logic.pdf) as:
//!
//! ```text
//! true_goal(X) = (goal(X) ? .me) ? me
//! ```
//!
//! It means: the true goal is the goal I believe I would have if I (the AI agent) were smarter.
//!
//! If the AI agent achieves a global maximum and then self-improves,
//! in hindsight it was meaningful to reach the global maximum if and only if the new goal is reachable.
//! Therefore, achieving global maximum is meaningful if and only if the true goal is reachable
//! from the state of global maximum.
//!
//! Accumulated rewards can cloud the judgement about terminal semantics.
//! With accumulated rewards, there is nothing that predicts termination
//! when the expected utility from the environment is uncertain.
//! As long as there exists some non-terminating state with positive utility,
//! there exists a course of action that might increase utility.
//! Therefore, most AI agents that optimize accumulated rewards do not need to
//! reason about the terminal semantics in the same way as AI agents that optimize for final rewards.
//! However, under self-improvement, accumulated rewards also require higher order reasoning for safety.
//!
//! Since optimizing for the final reward is a strict superset of
//! optimizing for accumulated rewards, the terminal semantics of the former
//! is a strict superset of the terminal semantics of the latter.
//!
//! One can include a term in the final reward that estimates future potential.
//! If the term for future potential can be negative, then excluding it will lead to unsafety.
//!
//! ## License
//!
//! Licensed under either of
//!  * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
//!  * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
//! at your option.
//!
//! ### Contribution
//!
//! Unless you explicitly state otherwise, any contribution intentionally submitted
//! for inclusion in the work by you shall be dual licensed as above, without any
//! additional terms or conditions.

/// Reexports commonly used objects.
pub mod prelude {
    pub use super::{Ai, AiAnalysis, AiSettings, Node};
}

/// Stores an action node (represented as a maximum tree).
///
/// Each node stores a maximum utility of itself or any children.
///
/// A terminal node has higher utility than any of its children.
#[derive(Debug)]
pub struct Node<T, A> {
    /// Stores maximum utility of itself or any children.
    pub max: f64,
    /// Stores node data.
    pub data: T,
    /// Stores child nodes.
    ///
    /// Each child has an associated action.
    /// The associated action identifies the node.
    /// This means the action should be unique among the children.
    /// This invariant is enforced by trusted search algorithms.
    /// Use `check_unique_actions` when the input is not trusted.
    pub children: Vec<(A, Node<T, A>)>,
}

impl<T, A> Node<T, A> {
    /// Creates a new root.
    ///
    /// This sets the utility to `NaN` (not a number).
    /// There are no children; they must be added through search.
    pub fn root(data: T) -> Node<T, A> {
        Node {
            max: std::f64::NAN,
            data,
            children: vec![]
        }
    }

    /// Returns `true` if all actions among children are unique.
    ///
    /// This algorithm does not report which actions collide,
    /// since this use case is uncommon (the invariant is enforced by trusted search algorithms).
    pub fn check_unique_actions(&self) -> bool
        where A: Eq + std::hash::Hash
    {
        use std::collections::HashSet;

        let mut hash_set = HashSet::new();
        for &(ref a, _) in &self.children {
            // `insert` returns `false` if the action was already present.
            if !hash_set.insert(a) {return false}
        }
        true
    }

    /// Returns `true` if the node is terminal.
    ///
    /// A terminal node has no children with equal or greater utility.
    /// This means that without paying attention to terminal semantics,
    /// an unsafe AI design can lead to a "stranded" situation.
    /// For more information, see the section "Terminal semantics"
    /// at this crate's top level documentation.
    pub fn terminal(&self) -> bool {self.optimal().is_none()}

    /// Returns the optimal course of action, if any.
    ///
    /// Returns `None` if no child has equal or higher utility.
    /// This means the node is terminal.
    pub fn optimal(&self) -> Option<usize> {
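        // The first child whose maximum utility is at least the node's own
        // lies on an optimal path.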
        for (i, ch) in self.children.iter().enumerate() {
            if ch.1.max >= self.max {return Some(i)}
        }
        None
    }

    /// Returns optimal path from root.
    pub fn optimal_path(&self) -> Vec<usize> {
        let mut node = self;
        let mut res = vec![];
        loop {
            if let Some(i) = node.optimal() {
                node = &node.children[i].1;
                res.push(i);
            } else {
                break;
            }
        }
        res
    }
}

/// AI settings.
pub struct AiSettings {
    /// Maximum depth.
    pub max_depth: usize,
    /// Utility discount from action depth.
    ///
    /// This is usually a small positive number (e.g. `0.000001`).
    pub eps_depth: f64,
    /// Whether to run analysis.
    pub analysis: bool,
    /// Whether greedy search eliminates non-optimal children,
    /// keeping only the chosen branch.
    pub greed_elim: bool,
    /// A limit to estimated memory usage,
    /// causing the search to terminate.
    ///
    /// This limit is only checked occasionally, e.g. after breadth search,
    /// so actual memory usage may exceed the limit before termination.
    pub max_mib: Option<f64>,
}

impl AiSettings {
    /// Creates new settings.
    pub fn new(max_depth: usize, eps_depth: f64) -> AiSettings {
        AiSettings {
            max_depth,
            eps_depth,
            analysis: false,
            greed_elim: true,
            max_mib: None,
        }
    }
}

/// Stores results from analysis.
pub struct AiAnalysis {
    /// Keeps track of maximum number of nodes.
    pub node_count: usize,
}

impl AiAnalysis {
    /// Creates new AI analysis.
    pub fn new() -> AiAnalysis {
        AiAnalysis {
            node_count: 0,
        }
    }
    /// Estimates the maximum memory usage of nodes in gibibytes (GiB).
    pub fn gib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1073741824.0
    }

    /// Estimates the maximum memory usage of nodes in mebibytes (MiB).
    pub fn mib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1048576.0
    }

    /// Estimates the maximum memory usage of nodes in kibibytes (KiB).
    pub fn kib(&self, node_size: usize) -> f64 {
        (self.node_count as f64 * node_size as f64) / 1024.0
    }
}

/// AI setup.
///
/// Provides a common setup for different search algorithms.
/// The search algorithm constructs a maximum tree,
/// which can be used to find the optimal course of action.
///
/// The `T` parameter is the type of node data.
/// This is used to store delta changes for undo operations.
/// It also stores data that tracks the internal state of the AI agent
/// when the AI agent is not interacting with the context (pure search).
/// Node data is stored separately from the context, making it easy
/// to analyse after constructing the maximum tree.
///
/// The `A` parameter is the type of action.
/// It describes the choices the AI agent can make in a specific context.
/// An action modifies the context and must be undone when rolling back changes.
///
/// The `C` parameter is the type of context (environment).
/// This stores the data that is necessary to calculate utility.
/// The context is modified by actions when exploring,
/// but these changes are undone when rolling back changes.
pub struct Ai<T, A, C> {
    /// Calculates utility from data and context.
    pub utility: fn(&T, &C) -> f64,
    /// Returns a list of possible actions.
    pub actions: fn(&T, &C) -> Vec<A>,
    /// Executes an action, returning new node data.
    pub execute: fn(&T, &A, &mut C) -> Result<T, ()>,
    /// Undoes change made to context.
    ///
    /// The data required to rollback delta changes
    /// must be stored in node data.
    pub undo: fn(&T, &mut C),
    /// Stores AI settings.
    pub settings: AiSettings,
    /// Stores analysis.
    pub analysis: AiAnalysis,
}

impl<T, A, C> Ai<T, A, C> {
    /// Computes the size of nodes in bytes.
    pub fn node_size(&self) -> usize {
        std::mem::size_of::<Node<T, A>>()
    }

    /// Calculates utility with extra terms computed from settings.
    pub fn utility_with_settings(&self, data: &T, depth: usize, ctx: &C) -> f64 {
        let utility = (self.utility)(data, ctx);
        let discount_step = -self.settings.eps_depth * depth as f64;
        utility + discount_step
    }

    /// Advances the context one step along the optimal path.
    ///
    /// Returns the index of the optimal child if its action executes successfully.
    pub fn update(&mut self, node: &Node<T, A>, ctx: &mut C) -> Option<usize> {
        if let Some(i) = node.optimal() {
            if (self.execute)(&node.data, &node.children[i].0, ctx).is_ok() {
                Some(i)
            } else {
                None
            }
        } else {
            None
        }
    }

    /// A sub-procedure that constructs a child for every available action.
    ///
    /// Used by other search algorithms.
    pub fn sub_breadth(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        root.children.clear();
        let actions = (self.actions)(&root.data, ctx);
        for a in &actions {
            if let Ok(data) = (self.execute)(&root.data, a, ctx) {
                let utility = self.utility_with_settings(&data, depth + 1, ctx);
                if utility > root.max {
                    root.max = utility;
                }

                // Undo changes made to context to reset state.
                (self.undo)(&data, ctx);

                root.children.push((a.clone(), Node {
                    max: utility,
                    data,
                    children: vec![],
                }));

                if self.settings.analysis {
                    self.analysis.node_count += 1;
                }
            }
        }
    }

    /// Returns `true` when estimated memory usage exceeds the limit, `false` otherwise.
    ///
    /// Also returns `false` when analysis is deactivated.
    pub fn memory_exceeded(&self) -> bool {
        if self.settings.analysis {
            if let Some(limit) = self.settings.max_mib {
                self.analysis.mib(self.node_size()) >= limit
            } else {false}
        } else {false}
    }

    /// Only picks choices that increase utility.
    ///
    /// To find the global maximum, the utility landscape must be free of non-global local maxima.
    pub fn greedy(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        if root.max.is_nan() {
            root.max = self.utility_with_settings(&root.data, depth, ctx);
        }

        self.sub_breadth(root, depth, ctx);

        if depth >= self.settings.max_depth {return};
        if self.memory_exceeded() {return};

        if let Some(i) = root.optimal() {
            let i = if self.settings.greed_elim {
                if self.settings.analysis {
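                    // Only the chosen child remains, so discard the rest from the node count.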
                    self.analysis.node_count -= root.children.len() - 1;
                }
                root.children.swap(i, 0);
                root.children.truncate(1);
                0
            } else {i};

            let a = &root.children[i].0;
            if (self.execute)(&root.data, a, ctx).is_ok() {
                let ch = &mut root.children[i].1;
                self.greedy(ch, depth + 1, ctx);

                // Undo changes made to context to reset state.
                (self.undo)(&ch.data, ctx);

                // Update maximum utility since children are changed.
                if ch.max > root.max {
                    root.max = ch.max;
                }
            }
        }
    }

    /// Performs a full construction of the entire maximum tree.
    pub fn full(&mut self, root: &mut Node<T, A>, depth: usize, ctx: &mut C)
        where A: Clone
    {
        if root.max.is_nan() {
            root.max = self.utility_with_settings(&root.data, depth, ctx);
        }

        self.sub_breadth(root, depth, ctx);

        if depth >= self.settings.max_depth {return};
        if self.memory_exceeded() {return};

        for (a, ch) in &mut root.children {
            if (self.execute)(&root.data, a, ctx).is_ok() {
                self.full(ch, depth + 1, ctx);

                // Undo changes made to context to reset state.
                (self.undo)(&ch.data, ctx);

                // Update maximum utility since children are changed.
                if ch.max > root.max {
                    root.max = ch.max;
                }
            }
        }
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
}