Parser

Trait Parser 

pub trait Parser: Send + 'static {
    // Provided methods
    fn parse(&mut self) -> bool { ... }
    fn update(&mut self, pa: &mut Pass, buffer: &Handle, on: Vec<Range<Point>>) { ... }
    fn before_get(&mut self) { ... }
    fn before_try_get(&mut self) -> bool { ... }
}

A Buffer parser that can keep up with every Change that takes place

A parser’s purpose is generally to look out for changes to the Buffer’s Bytes, and update some internal state that represents them. Examples of things that should be implemented as Parsers are:

  • A tree-sitter parser, or other syntax tree representations;
  • Regex parsers;
  • Language server protocols;

But Parsers don’t necessarily have to do “big and complex” things like creating a language tree. They can be simpler and, for example, act on each Selection on the screen.

If you want a walkthrough on how to make a Parser, I would recommend reading the book (TODO). The rest of the documentation here is mostly just describing a final implementation, not walking through its creation.

§What a parser does

The gist of it is that a Parser will be called to read the Bytes of the Buffer as well as any Changes that are done to said Buffer. Duat will then call upon the Parser to “act” on a region of the Buffer’s Text, this region being determined by what is shown on screen, in order to help plugin writers minimize the work done.

When creating a Parser, you will also be given a BufferTracker. It will be used to keep track of the Changes, and it is also used by the Parser to tell which Range<usize>s of the Text the Parser cares about. So, for example, if you’re matching non-multiline regex patterns, for every Change, you might want to add the lines of that Change to the BufferTracker, and when Duat decides which ranges need to be updated, it will inform you: “Hey, you asked for this range to be updated, it’s on screen now, so update it.”.
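As a rough, self-contained illustration of that idea, here is a sketch of the bookkeeping involved. This is not Duat’s API: `DirtyLines`, `mark` and `take_visible` are hypothetical names, and plain line numbers stand in for the real ranges. It records changed line ranges and hands back only the parts that overlap what is on screen:

```rust
use std::ops::Range;

/// Hypothetical stand-in for the kind of bookkeeping a tracker does:
/// it remembers which line ranges still need to be updated.
#[derive(Default)]
struct DirtyLines {
    pending: Vec<Range<usize>>,
}

impl DirtyLines {
    /// Record that the lines in `range` were touched by a change.
    fn mark(&mut self, range: Range<usize>) {
        if !range.is_empty() {
            self.pending.push(range);
        }
    }

    /// Return the pending pieces that overlap `visible`, and keep
    /// whatever lies off screen for a later call.
    fn take_visible(&mut self, visible: Range<usize>) -> Vec<Range<usize>> {
        let mut on_screen = Vec::new();
        let mut still_pending = Vec::new();

        for range in std::mem::take(&mut self.pending) {
            let start = range.start.max(visible.start);
            let end = range.end.min(visible.end);
            if start >= end {
                // Nothing of this range is visible yet.
                still_pending.push(range);
                continue;
            }
            on_screen.push(start..end);
            // Keep the parts that fall before and after the screen.
            if range.start < start {
                still_pending.push(range.start..start);
            }
            if end < range.end {
                still_pending.push(end..range.end);
            }
        }

        self.pending = still_pending;
        on_screen
    }
}
```

With this model, marking lines `40..60` and then asking for the visible region `0..50` yields `40..50` now, while `50..60` stays pending until it scrolls into view.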

§The functions from the Parser trait

There are 4 functions that you need to care about, but you may choose to implement only some of them:

§Parser::parse

This function’s purpose is for the Parser to update its internal state after Changes take place. Here’s the general layout for a synchronous version of this function:

use duat::prelude::*;

struct CharCounter {
    count: usize,
    ch: char,
    tracker: BufferTracker,
}

impl Parser for CharCounter {
    fn parse(&mut self) -> bool {
        // Fetches the latest Changes and Bytes of the Buffer
        self.tracker.update();

        // A Moment is a list of Changes
        // For the sake of efficiency, Changes are sent in bulk,
        // rather than individually
        for change in self.tracker.moment().changes() {
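            // Occurrences removed by the Change vs. occurrences it added
            let bef_count = change.taken_str().matches(self.ch).count();
            let aft_count = change.added_str().matches(self.ch).count();
            // Add before subtracting, so the usize can't underflow
            self.count = self.count + aft_count - bef_count;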
        }

        // Return true if you want `Parser::update` to be called.
        // Since this Parser never changes the Buffer, it's fine
        // to always return false.
        false
    }
}

The example above just keeps track of every occurrence of a specific char. Every time the Buffer is updated, the parse function will be called, and you can use BufferTracker::update to be notified of every Change that takes place in the Buffer.
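The counting step can also be looked at in isolation. The sketch below uses a hypothetical helper, `recount`, that is not part of Duat and takes the removed and inserted text as plain strings. It shows the arithmetic that keeps a running count in sync with one change:

```rust
/// Returns the new count of `ch` after a change that removed `taken`
/// and inserted `added`. Hypothetical helper, not part of Duat.
fn recount(count: usize, ch: char, taken: &str, added: &str) -> usize {
    let removed = taken.matches(ch).count();
    let inserted = added.matches(ch).count();
    // Add before subtracting, so the usize arithmetic can't underflow
    // as long as `count` was accurate for the old text.
    count + inserted - removed
}
```

Doing the addition first matters because the counts are unsigned: a change that removes more occurrences than it adds would otherwise underflow on the intermediate subtraction.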

§Parser::update

The purpose of this function is for the Parser to modify the Buffer itself. In the previous function, you may notice that you are not given access to the Buffer directly, nor are you given a Pass in order to access global state. That’s what this function is for.

Below is the rough layout for an implementation of this function. In this case, the function “resets its modifications” every time it is called:

use duat::prelude::*;

struct HighlightMatch {
    tagger: Tagger,
}

impl Parser for HighlightMatch {
    fn update(&mut self, pa: &mut Pass, handle: &Handle, on: Vec<Range<Point>>) {
        // Remove all tags previously added by this Parser.
        handle.text_mut(pa).remove_tags(self.tagger, ..);

        // The suffix of the main cursor's current word.
        let Some(range) = handle.edit_main(pa, |c| c.search_fwd(r"\A\w+").next()) else {
            return;
        };
        // The prefix of said cursor's word.
        let start = handle
            .edit_main(pa, |c| c.search_rev(r"\w*\z").next())
            .map(|range| range.start)
            .unwrap_or(range.start);

        // The TextParts struct lets you read from the Bytes while writing to the Tags.
        let mut parts = handle.text_parts(pa);
        let pat = parts.bytes.strs(start..range.end);
        let form_id = form::id_of!("same_word");

        // Highlight every identical word.
        for range in on {
            for range in parts.bytes.search_fwd(r"\w+", range.clone()).unwrap() {
                if parts.bytes.strs(range.clone()) == pat {
                    parts.tags.insert(self.tagger, range, form_id.to_tag(50));
                }
            }
        }
    }
}

The Parser above reads the word under the main cursor (if there is one) and highlights every occurrence of said word on screen. This function is called if Parser::parse returns true, i.e. when the Parser is “ready” to update the Buffer. The default implementation of Parser::parse is to just return true.

§Note

In the example above, the BufferTracker is acting slightly differently. When setting up this Parser, I called BufferTracker::track_area.

This function makes it so that, instead of tracking changed Range<Point>s, Parser::update will always receive a list of ranges equivalent to the printed region of the Text. This way, I can update only the stuff on screen.

In general, given the Parser::parse and Parser::update functions, you can roughly divide which ones you’ll implement based on the following criteria:

  • If your Parser does not update the Buffer, and just keeps track of Changes, e.g. a word counter, or a filetype checker, etc, then you should only have to implement the Parser::parse function.
  • If your Parser actively updates the Buffer every time it is printed, e.g. the word match finder above, or a current line highlighter, then you should only have to implement the Parser::update function.
  • If, in order to update the Buffer, you need to keep track of some current state, and you may even update the Parser’s state in other threads, like a tree-sitter parser for example, then you should implement both.
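To make the division above a bit more concrete, here is a simplified model of the control flow between the two functions. Everything in it is hypothetical (`MiniParser`, `Counter`, `Highlighter`, `drive`), and the Buffer is reduced to a list of log lines. It only mirrors what was described: update is called when parse returns true, and the default parse returns true.

```rust
/// Simplified, hypothetical model of the trait: no Pass, Handle or
/// ranges, just the control flow between the two methods.
trait MiniParser {
    /// Sync internal state; returning true requests a call to `update`.
    fn parse(&mut self) -> bool {
        true
    }
    /// Act on the buffer (modelled here as a list of log lines).
    fn update(&mut self, _log: &mut Vec<String>) {}
}

/// Only tracks state: overrides `parse` and returns false.
struct Counter {
    seen: usize,
}

impl MiniParser for Counter {
    fn parse(&mut self) -> bool {
        self.seen += 1;
        false
    }
}

/// Only acts on the buffer: keeps the default `parse`.
struct Highlighter;

impl MiniParser for Highlighter {
    fn update(&mut self, log: &mut Vec<String>) {
        log.push("highlighted".to_string());
    }
}

/// What happens on each buffer update, in this model.
fn drive(parser: &mut dyn MiniParser, log: &mut Vec<String>) {
    if parser.parse() {
        parser.update(log);
    }
}
```

A parser that needs both, like the third case in the list, would override parse to sync its state and return true only once it is ready to act.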

§Parser::before_get and Parser::before_try_get

These functions have the same purpose as Parser::parse, but they are called before calls to Buffer::read_parser, Buffer::write_parser, and their try equivalents.

They serve to kind of “prepare” the Parser for functions that access it, much like Parser::parse “prepares” the Parser for a call to Parser::update.

The purpose of these functions is to only ever update the Parser when that is actually necessary. The most notable example of this is the duat-jump-list crate, which defines a Parser that only updates its internal state when it is accessed externally. It is only used to store and retrieve previous versions of the Selections of the Buffer, so it doesn’t need to update itself every time the Buffer changes, only when it is requested.
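The general pattern of “do the work only when someone asks” can be sketched without any of Duat’s types. The code below is not how duat-jump-list is implemented. `LazyHistory` and its methods are hypothetical, and snapshots are plain strings:

```rust
/// Hypothetical parser-like type that records nothing until asked.
struct LazyHistory {
    /// Snapshots that have been synced so far.
    snapshots: Vec<String>,
    /// Latest state that has not been recorded yet, if any.
    unsynced: Option<String>,
}

impl LazyHistory {
    fn new() -> Self {
        Self { snapshots: Vec::new(), unsynced: None }
    }

    /// Called on every change: cheap, only overwrites the pending value.
    fn on_change(&mut self, state: &str) {
        self.unsynced = Some(state.to_string());
    }

    /// Plays the role of `before_get`: do the real work only now.
    fn before_get(&mut self) {
        if let Some(state) = self.unsynced.take() {
            self.snapshots.push(state);
        }
    }

    /// Accessor that syncs first, then exposes the snapshots.
    fn get(&mut self) -> &[String] {
        self.before_get();
        &self.snapshots
    }
}
```

In this model, any number of changes between two accesses costs a single snapshot, since only the latest pending state is recorded when `get` is called.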

§Note

You can keep a Parser private in your plugin in order to prevent the end user from reading or writing to it. You can then create standalone functions or implement traits on the Buffer widget in order to give controlled access to the parser. For an example of this, you can see the duat-jump-list crate, which defines traits for saving and retrieving jumps, but doesn’t grant direct access to the parser.

Provided Methods§


fn parse(&mut self) -> bool

Parses the Bytes of the Buffer

This function is called every time the Buffer is updated, and it’s where you should update the internal state of the Parser to reflect any Changes that took place.


fn update(&mut self, pa: &mut Pass, buffer: &Handle, on: Vec<Range<Point>>)

Updates the Buffer in some given Range<Point>s

As this function is called, the state of the Parser needs to already be synced up with the latest Changes to the Buffer.

The list of Ranges is the collection of Ranges that were requested to be updated and are within the printed region of the Buffer.

Do note that, if more regions become visible on the screen (this could happen if a Conceal tag is placed, for example), this function will be called again, until the whole screen has been parsed by every Parser.

§NOTES

One other thing to note is that the ranges in on are just a suggestion. In most circumstances, it is convenient to go slightly over them. For example, a regex searcher should search only within the ranges provided, but if a match extends slightly beyond a range, it is fine to add Tags there.

Finally, keep in mind that Tags are not allowed to be repeated, and you can use this to your advantage, as in, instead of checking if you need to place a Tag in a certain spot, you can just place it, and Duat will ignore that request if that Tag was already there.
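The “just place it” approach works because insertion is idempotent. As a minimal sketch of that property, here is a hypothetical `TagSet` backed by a `BTreeSet`, with a tag reduced to a position and a form name. It is not Duat’s Tags type:

```rust
use std::collections::BTreeSet;

/// Hypothetical tag store: a tag is just (byte position, form name).
#[derive(Default)]
struct TagSet {
    tags: BTreeSet<(usize, &'static str)>,
}

impl TagSet {
    /// Insert a tag; returns false (and changes nothing) if an
    /// identical tag is already at that position.
    fn insert(&mut self, at: usize, form: &'static str) -> bool {
        self.tags.insert((at, form))
    }

    /// Number of distinct tags currently stored.
    fn len(&self) -> usize {
        self.tags.len()
    }
}
```

Calling `insert` twice with the same position and form leaves a single tag in the set, so the caller never needs a separate “is it already there?” check.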


fn before_get(&mut self)

Prepares this Parser before a call to Buffer::read_parser

The Buffer::read_parser/Buffer::write_parser functions block the current thread until the Parser is available. Therefore, before_get should finish all parsing, so if the parsing is taking place in another thread, you’re gonna want to join said thread and finish it.

If the Parser’s availability doesn’t rely on other threads (which should be the case for almost every single Parser), then this function can just be left empty.
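For the threaded case, the joining step could look roughly like the sketch below. It uses only the standard library, and `BackgroundCounter` and its methods are hypothetical. The real trait method has the same shape (`&mut self`, no return value), but nothing else here comes from Duat:

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical parser that counts words on a background thread.
struct BackgroundCounter {
    /// Result of the last finished job.
    words: usize,
    /// Job that may still be running.
    job: Option<JoinHandle<usize>>,
}

impl BackgroundCounter {
    fn new() -> Self {
        Self { words: 0, job: None }
    }

    /// Start counting the words of `text` on another thread.
    fn start(&mut self, text: String) {
        // Finish any earlier job first, so its result is not lost.
        self.before_get();
        self.job = Some(thread::spawn(move || text.split_whitespace().count()));
    }

    /// Plays the role of `before_get`: block until the job is done.
    fn before_get(&mut self) {
        if let Some(job) = self.job.take() {
            // Assumption: if the job panicked, keep the old value.
            if let Ok(words) = job.join() {
                self.words = words;
            }
        }
    }

    /// Accessor that always sees fully parsed state.
    fn words(&mut self) -> usize {
        self.before_get();
        self.words
    }
}
```

Storing the handle in an `Option` and calling `take` means the join happens at most once per job, and a parser with no job in flight pays nothing.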


fn before_try_get(&mut self) -> bool

Prepares the Parser before a call to Buffer::try_read_parser

The purpose of try_read_parser, unlike read_parser, is to only call the function passed if the Parser is ready to be read. If it relies on a thread finishing its processing, this function should return true only if said thread is ready to be merged.

If the Parser’s availability doesn’t rely on other threads (which should be the case for almost every single Parser), then this function should just return true all the time.
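The non-blocking variant can be sketched with the standard library’s `JoinHandle::is_finished`. As before, `GatedJob` is a hypothetical type, and the channel is only there so the example controls when the background work ends:

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical parser whose job finishes only when told to.
struct GatedJob {
    job: Option<JoinHandle<u32>>,
    result: Option<u32>,
}

impl GatedJob {
    /// Spawn a job that waits for a signal, then produces `value`.
    fn spawn(value: u32) -> (Self, Sender<()>) {
        let (tx, rx) = mpsc::channel();
        let job = thread::spawn(move || {
            // Blocks until the sender signals or is dropped.
            let _ = rx.recv();
            value
        });
        (Self { job: Some(job), result: None }, tx)
    }

    /// Plays the role of `before_try_get`: it does not wait for
    /// running work, and reports true only once the thread has
    /// finished and been joined (or when there is no job at all).
    fn before_try_get(&mut self) -> bool {
        match self.job.take() {
            Some(job) if job.is_finished() => {
                self.result = job.join().ok();
                true
            }
            Some(job) => {
                // Still running: put the handle back and decline.
                self.job = Some(job);
                false
            }
            None => true,
        }
    }
}
```

While the thread is still waiting, `before_try_get` returns false without stalling the caller. Once the signal is sent and the thread ends, a later call joins it and returns true.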

Implementors§