pub fn induce<O, T, I, P, D, F, R>(
    dsl: &Language,
    params: &CompressionParams,
    tasks: &[T],
    original_frontiers: Vec<ECFrontier<Expression>>,
    state: I,
    proposer: P,
    proposal_to_dsl: D,
    defragment: F,
    rewrite_frontiers: R
) -> (Language, Vec<ECFrontier<Expression>>)
where
    O: ?Sized,
    T: Task<O, Representation = Language, Expression = Expression>,
    I: Sync,
    P: Fn(&I, &Language, &[(TypeScheme, Vec<(Expression, f64, f64)>)], &CompressionParams, &mut Vec<T::Expression>) + Sync,
    D: Fn(&I, &T::Expression, &mut Language, &[(TypeScheme, Vec<(Expression, f64, f64)>)], &CompressionParams) -> Option<f64> + Sync,
    F: Fn(Expression) -> Expression,
    R: Fn(&I, T::Expression, Expression, &Language, &mut Vec<(TypeScheme, Vec<(Expression, f64, f64)>)>, &CompressionParams),

This function makes it easier to write your own compression scheme. It performs a single compression step, split into sub-steps that are straightforward to implement in isolation.

This is a sophisticated higher-order function — tread carefully.

  • state: I can be mutated by making it Arc<RwLock<_>>, though it will often just be () unless you really need it.

  • a candidate (T::Expression in the signature above) is something which can be used to update a dsl.

  • proposer pushes candidates to the given vector.

  • proposal_to_dsl adds the candidate to the dsl and returns a new joint minimum description length for the dsl. For example, this may be set to:

    |_state, expr, dsl, frontiers, params| {
        if dsl.invent(expr.clone(), 0.).is_ok() {
            Some(dsl.inside_outside(frontiers, params.pseudocounts))
        } else {
            None
        }
    }
    
  • defragment is most often a no-op and can be set to |x| x. It lets you change the invention that proposal_to_dsl added, after scoring has been done. This is useful for fragment grammar compression, because scoring with inventions that have free variables (i.e. non-closed expressions) lets inside-outside capture those uses without having to rewrite the frontiers.

  • rewrite_frontiers finally takes the highest-scoring dsl, which is guaranteed to have a single latest invention (accessible via dsl.invented.last().unwrap()) equal to defragment applied to the invention that proposal_to_dsl added for some generated proposal. The lambda::Expression it is supplied is the non-defragmented proposal. There’s no need to rescore the frontiers; that’s done automatically.
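
The note on state above can be illustrated with a small standalone sketch. It does not use this crate: run_proposer, its workers parameter, and the String candidates are hypothetical stand-ins, chosen only to mirror the `I: Sync` and `Fn(&I, ..) + Sync` bounds in the signature. It shows how a closure that only ever receives `&I` can still update shared state when I is an Arc<RwLock<_>>.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for how a `proposer` might be driven: the closure
/// only receives `&I`, and several threads may call it at the same time.
fn run_proposer<I, P>(state: I, proposer: P, workers: usize) -> Vec<String>
where
    I: Sync,
    P: Fn(&I, &mut Vec<String>) + Sync,
{
    let mut all = Vec::new();
    thread::scope(|scope| {
        let mut handles = Vec::new();
        for _ in 0..workers {
            handles.push(scope.spawn(|| {
                // Each worker fills its own candidate vector.
                let mut candidates = Vec::new();
                proposer(&state, &mut candidates);
                candidates
            }));
        }
        for handle in handles {
            all.extend(handle.join().unwrap());
        }
    });
    all
}

/// Counts proposer calls through an Arc<RwLock<usize>>.
fn example() -> (usize, Vec<String>) {
    let state = Arc::new(RwLock::new(0usize));
    // Keep a second handle, because `state` itself is moved into the call.
    let handle = Arc::clone(&state);
    let mut candidates = run_proposer(
        state,
        |calls: &Arc<RwLock<usize>>, out: &mut Vec<String>| {
            // Taking the write lock gives mutable access behind `&I`.
            let mut calls = calls.write().unwrap();
            *calls += 1;
            out.push(format!("candidate-{}", *calls));
        },
        4,
    );
    candidates.sort();
    let total = *handle.read().unwrap();
    (total, candidates)
}
```

Cloning the Arc before handing the state over is what lets the caller read the final value afterwards. When no sub-step needs shared data, passing () avoids the locking entirely.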

We recommend writing a function that adapts this into a four-argument induce_my_algo function by filling in the higher-order functions. See the source code of this project to find the particular use of this function that gives Language::compress using a fragment-grammar-like compression scheme.
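
The adapter pattern recommended above might look roughly like the following. This is a toy model, not the crate's implementation: ToyDsl, ToyParams, induce_sketch, the string-based expressions, and the "count exact uses" score are all assumptions made for illustration, and the Sync bounds and parallel scoring are left out. It assumes a higher score is better, following the "highest-scoring dsl" wording.

```rust
// Toy stand-ins; none of these are the crate's real types.
type Expr = String;
type Frontier = Vec<Expr>;

#[derive(Clone, Debug, PartialEq)]
struct ToyDsl {
    invented: Vec<Expr>,
}

struct ToyParams {
    /// Hypothetical knob: shortest expression worth proposing.
    min_len: usize,
}

/// Simplified model of one compression step built from four sub-steps.
fn induce_sketch<I, P, D, F, R>(
    dsl: &ToyDsl,
    params: &ToyParams,
    mut frontiers: Vec<Frontier>,
    state: I,
    proposer: P,
    proposal_to_dsl: D,
    defragment: F,
    rewrite_frontiers: R,
) -> (ToyDsl, Vec<Frontier>)
where
    P: Fn(&I, &ToyDsl, &[Frontier], &ToyParams, &mut Vec<Expr>),
    D: Fn(&I, &Expr, &mut ToyDsl, &[Frontier], &ToyParams) -> Option<f64>,
    F: Fn(Expr) -> Expr,
    R: Fn(&I, Expr, Expr, &ToyDsl, &mut Vec<Frontier>, &ToyParams),
{
    // 1. Collect candidates.
    let mut candidates = Vec::new();
    proposer(&state, dsl, &frontiers, params, &mut candidates);

    // 2. Score each candidate against its own copy of the dsl.
    let mut best: Option<(f64, Expr, ToyDsl)> = None;
    for candidate in candidates {
        let mut trial = dsl.clone();
        if let Some(score) = proposal_to_dsl(&state, &candidate, &mut trial, &frontiers, params) {
            if best.as_ref().map_or(true, |(s, _, _)| score > *s) {
                best = Some((score, candidate, trial));
            }
        }
    }

    // 3. Defragment the winning invention, then rewrite the frontiers.
    match best {
        None => (dsl.clone(), frontiers),
        Some((_, candidate, mut best_dsl)) => {
            let raw = best_dsl.invented.pop().expect("proposal_to_dsl adds one invention");
            best_dsl.invented.push(defragment(raw.clone()));
            rewrite_frontiers(&state, candidate, raw, &best_dsl, &mut frontiers, params);
            (best_dsl, frontiers)
        }
    }
}

/// The four-argument adapter: every higher-order argument is filled in here.
fn induce_my_algo(
    dsl: &ToyDsl,
    params: &ToyParams,
    _tasks: &[&str],
    frontiers: Vec<Frontier>,
) -> (ToyDsl, Vec<Frontier>) {
    induce_sketch(
        dsl,
        params,
        frontiers,
        (), // no shared state needed
        // proposer: every sufficiently long solution is a candidate.
        |_state, _dsl, frontiers, params, out| {
            for e in frontiers.iter().flatten() {
                if e.len() >= params.min_len && !out.contains(e) {
                    out.push(e.clone());
                }
            }
        },
        // proposal_to_dsl: add the invention, score by number of exact uses.
        |_state, expr, dsl, frontiers, _params| {
            if dsl.invented.contains(expr) {
                return None;
            }
            dsl.invented.push(expr.clone());
            let uses = frontiers.iter().flatten().filter(|e| *e == expr).count();
            Some(uses as f64)
        },
        // defragment: a no-op.
        |x| x,
        // rewrite_frontiers: replace each use by a reference to the invention.
        |_state, _candidate, invention, dsl, frontiers, _params| {
            let name = format!("#{}", dsl.invented.len() - 1);
            for e in frontiers.iter_mut().flatten() {
                if *e == invention {
                    *e = name.clone();
                }
            }
        },
    )
}

fn example() -> (ToyDsl, Vec<Frontier>) {
    let dsl = ToyDsl { invented: vec![] };
    let params = ToyParams { min_len: 3 };
    let frontiers = vec![
        vec!["(+ 1 1)".to_string(), "x".to_string()],
        vec!["(+ 1 1)".to_string()],
        vec!["(* 2 3)".to_string()],
    ];
    induce_my_algo(&dsl, &params, &["t0", "t1", "t2"], frontiers)
}
```

In this toy run, "(+ 1 1)" appears twice and so outscores "(* 2 3)"; it becomes the single new invention and its uses are rewritten to "#0". Callers of induce_my_algo never see the closures, which is the point of the adapter.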