bfom 0.1.22

Brendan's Flavor of Markdown: I'll build my own markdown format, what could go wrong?
# Brendan's Flavor of Markdown

This document should give a good overview of how to use BFoM.  
If you are used to existing markdown flavors then this should be pretty easy to pick up.  
The goal of this flavor is to have greater flexibility (more formatting options) while being more consistent (you can align headers nicely) 
and easy to parse (no indented code blocks or lazy continuation).

## Blocks
Blocks are assessed on a line by line basis based on what the first few characters are.  
Leading spaces are ignored for this purpose.  
Lazy continuation is not permitted.

### Code
Indented code blocks are removed, mostly due to the issues outlined in [background].  
This in turn affects the other blocks, which are no longer bound by the rule limiting indentation to three spaces.


Valid:
````````````````````````````````````````````````````````````````````````
```
fn main () {
  // content
}
```

`````` rust
fn main () {
  // content
}
``````

``````````````````````````````
fn main () {
  // content
}
``````````````````````````````
````````````````````````````````````````````````````````````````````````

Not valid:

````````````````````````````````````````````````````````````````````````
`````````````
fn main () {
  // content
}
``````````````````````````````
````````````````````````````````````````````````````````````````````````


[background]: https://brendan.ie/blog/2

#### Summary
* Leading Whitespace is ignored.
* Indented code blocks are removed.
* Number of backticks must be at least 3.
  * No upper limit.
  * Closer must be matched to the Opener.
* Content is **not** processed for inline markdown.  
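
The fence rules above can be sketched roughly as follows. This is only an illustration, not the bfom implementation: `fence_len` and `is_closer` are hypothetical helpers, and it assumes that "matched" means the closer has exactly the same number of backticks as the opener.

```rust
/// Returns the number of backticks if `line` starts a fence.
/// Assumption: a fence is optional leading whitespace, then a run of at
/// least three backticks, optionally followed by an info string like `rust`.
fn fence_len(line: &str) -> Option<usize> {
    let trimmed = line.trim_start();
    let count = trimmed.chars().take_while(|&c| c == '`').count();
    if count >= 3 {
        Some(count)
    } else {
        None
    }
}

/// Assumption: a closer is a line made only of backticks (plus surrounding
/// whitespace) whose count equals the opener's count exactly.
fn is_closer(line: &str, opener_len: usize) -> bool {
    let trimmed = line.trim();
    trimmed.len() == opener_len && trimmed.chars().all(|c| c == '`')
}
```

Under these assumptions, an opener of 13 backticks is not closed by a line of 30 backticks, which matches the "Not valid" example.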


### Header
Denoted by optional leading whitespace followed by one to six `#`, for h1 to h6.  
Leading spaces are ignored to keep the accommodations that existing implementations have for aligning headers.  
BFoM goes beyond them by allowing an h1 to be aligned with an h6.

Valid:
```
# h1
## h2
### h3
#### h4
##### h5
###### h6

     # h1
    ## h2
   ### h3
  #### h4
 ##### h5
###### h6
```

#### Summary
* Leading Whitespace is ignored.
* Setext headers are not supported.
* More than six initial `#` are allowed, but anything after the first six is treated as text.
* Content is processed for inline markdown.
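
As a rough sketch of the header rules, the hypothetical `parse_header` helper below returns the level and the text of a header line. It assumes that no space is required between the `#` run and the text, which the summary does not state either way.

```rust
/// Splits a header line into its level (1-6) and its text.
/// Returns `None` when the line does not start with `#` after any
/// leading whitespace.
fn parse_header(line: &str) -> Option<(usize, &str)> {
    let trimmed = line.trim_start();
    let hashes = trimmed.chars().take_while(|&c| c == '#').count();
    if hashes == 0 {
        return None;
    }
    // Only the first six `#` set the level; any extra ones stay in the text.
    let level = hashes.min(6);
    Some((level, trimmed[level..].trim()))
}
```

With this sketch, `####### seven` becomes an h6 whose text is `# seven`.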

### Horizontal Rule

Horizontal rules are limited to just `-` instead of also allowing `*` or `_`.  
This is because `*` and `_` are used in spans for emphasis and underlining respectively.  
`-` was also chosen because it looks pretty similar to the hr element rendered in html.

Leading whitespace is ignored. The rule must start with at least four `-` without spaces between them, and nothing else may appear on the line aside from further `-` and whitespace.

Valid:
```
----
          ------------
   ----  --------  --------  --------  ----
````  ````
---------------------------------------------------------------------------------------------
```


#### Summary
* Leading Whitespace is ignored.
* Only `-` is allowed.
* Initial sequence must be at least four `-` like `----` after which: 
  * Any length is valid.
  * Can be combination of `-` and whitespace.
* Content is **not** processed for inline markdown (no content to process).  
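
A minimal sketch of this check, using a hypothetical `is_horizontal_rule` helper (not the bfom API):

```rust
/// True when the line is a horizontal rule: after leading whitespace it
/// starts with `----`, and the whole line is only `-` and whitespace.
fn is_horizontal_rule(line: &str) -> bool {
    let trimmed = line.trim_start();
    trimmed.starts_with("----")
        && trimmed.chars().all(|c| c == '-' || c.is_whitespace())
}
```

Lines such as `---` or `- - - -` fail the check because the initial run of four `-` is missing.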


### Blockquotes

For a blockquote each line must start with `>` with the leading whitespace ignored.  
Lazy continuation is not permitted.

Due to this it can correctly parse the following sequence:
```
>>> foo
> bar
>> baz
```
Into: 
```
<blockquote>
  <blockquote>
    <blockquote>
      <p>foo</p>
    </blockquote>
  </blockquote>
  <p>bar</p>
  <blockquote>
    <p>baz</p>
  </blockquote>
</blockquote>
```

And this: 
```
> foo
 bar
> baz
```
Into:
```
<blockquote>
  <p>foo</p>
</blockquote>
<p>bar</p>
<blockquote>
  <p>baz</p>
</blockquote>
```

#### Summary
* Leading Whitespace is **not** ignored.  
  * This enables having a `>` at the start of a paragraph without it being a quote
* Lazy continuation is not permitted.  
* Content is processed recursively for nested blocks.
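
One way to read the nesting depth of a line is sketched below. `quote_depth` is a hypothetical helper; following the summary, it assumes that leading whitespace is not skipped and that only a run of consecutive `>` at the very start of the line counts.

```rust
/// Counts the `>` characters at the very start of a line and returns the
/// depth together with the remaining text. A depth of 0 means the line is
/// not part of a blockquote.
fn quote_depth(line: &str) -> (usize, &str) {
    let depth = line.chars().take_while(|&c| c == '>').count();
    (depth, line[depth..].trim_start())
}
```

Because every line carries its own depth, a parser built on this never needs lazy continuation: a line with depth 0 simply ends the quote.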

### Lists

As with blockquotes, lazy continuation is not permitted.  
Also, because indented code blocks were removed, indentation is unlimited.

Valid:
```
1. Item 1
99999. Item 99999

    1. Item 1
99999. Item 99999

    1. Item 1 Line 1
       Item 1 Line 2
99999. Item 99999 Line 1
       Item 99999 Line 2
       
      1. Item 1 Line 1
         Item 1 Line 2
  99999. Item 99999 Line 1
         Item 99999 Line 2
         
         
      1) Item 1 Line 1
         Item 1 Line 2
  99999) Item 99999 Line 1
         Item 99999 Line 2
       
       
       
    * Item 1
      * Item 1 Nestled 1
    * Item 2
    
```

Not Valid:
```
      1. Item 1 Line 1
Item 1 Line 2
  99999. Item 99999 Line 1
Item 99999 Line 2
```

In my opinion, lazy continuation makes the text harder to read and messes up the presentation of the markdown file itself.

#### Summary
* Leading Whitespace is ignored.
* Lazy continuation is not permitted.
* Ordered (numbered) list.
  * `.` and `)` permitted directly after the number.
  * A space is required directly after the identifier.
  * 999999999 is the max number permitted.
* Unordered List.
  * `*`, `+` and `-` are permitted.
  * A space is required directly after the identifier.
* Content is processed recursively for nested blocks.
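
The marker rules can be sketched as follows. `ListMarker` and `parse_list_marker` are hypothetical names for illustration, and the sketch assumes that the 999999999 limit amounts to "at most nine digits".

```rust
#[derive(Debug, PartialEq)]
enum ListMarker {
    Ordered(u32),
    Unordered(char),
}

/// Returns the marker and the item text, or `None` if the line does not
/// start a list item.
fn parse_list_marker(line: &str) -> Option<(ListMarker, &str)> {
    let trimmed = line.trim_start();
    let digits = trimmed.chars().take_while(|c| c.is_ascii_digit()).count();
    if (1..=9).contains(&digits) {
        // At most nine digits, so the value never exceeds 999999999.
        let rest = &trimmed[digits..];
        // `.` or `)` must follow the number, then a space.
        let content = rest.strip_prefix(". ").or_else(|| rest.strip_prefix(") "))?;
        let number = trimmed[..digits].parse::<u32>().ok()?;
        return Some((ListMarker::Ordered(number), content));
    }
    let mut chars = trimmed.chars();
    match (chars.next(), chars.next()) {
        // `*`, `+` or `-` followed directly by a space.
        (Some(c @ ('*' | '+' | '-')), Some(' ')) => {
            Some((ListMarker::Unordered(c), &trimmed[2..]))
        }
        _ => None,
    }
}
```

A continuation line such as `Item 1 Line 2` returns `None` here, so it would start a new block instead of lazily joining the list item.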

### Tables
This is one of the extensions that GitLab and others introduced in order to have a native way of representing tables.  
Most of this is adapted from [Github's implementation].

[Github's implementation]: https://github.github.com/gfm/#tables-extension-

#### Summary
* Leading Whitespace is ignored.
* Line must begin with `|`.
* Delimiter row is optional.
  * The delimiter row sets the alignment for subsequent rows.
  * It is possible to have multiple delimiter rows.
  * It is possible to have the delimiter row before the header row.
* The first non-delimiter row is automatically the header row.
* Content is processed for inline markdown.

Valid: 
```
| Header 1 | Header 2 |


| Header 1    | Header 2    |
| Row 1 left  | Row 1 Col 2 |


| Header 1    ||
| Row 1 left  | Row 1 Col 2 |


| Header 1    |             |
| Row 1 left  | Row 1 Col 2 |


|:------------|-------------|
| Header 1    | Header 2    |
| Row 1 left  | Row 1 Col 2 |


| Header 1    | Header 2    |
|:------------|-------------|
| Row 1 left  | Row 1 Col 2 |


| Header 1    | Header 2    |
|:------------|
| Row 1 left  | Row 1 Col 2 |


# Setting alignment for lines
| Header 1      | Header 2          |
|:--------------|-------------------|
| Row 1 left    | Row 1 Col 2       |
|:-------------:|
| Row 2 center  | Row 2 Col 2       |
|--------------:|
| Row 3 right   | Row 3 Col 2       |
|---------------|
| Row 4 default | Row 4 Col 2       |
| Row 5 default | Row 5 Col 2       |
|---------------|------------------:|
| Row 6 default | Row 6 Col 2 Right |

```
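
A rough sketch of how a delimiter row could be recognised and turned into column alignments. `Align` and `parse_delimiter_row` are hypothetical names; the colon placement follows the GFM convention the section says it adapts, and the sketch assumes each cell is a run of `-` with at most one `:` on each side.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Align {
    Default,
    Left,
    Center,
    Right,
}

/// Returns one alignment per cell if `line` is a delimiter row,
/// otherwise `None` (for example for header or content rows).
fn parse_delimiter_row(line: &str) -> Option<Vec<Align>> {
    let trimmed = line.trim();
    // The line must begin with `|`; a trailing `|` is optional here.
    let inner = trimmed.strip_prefix('|')?;
    let inner = inner.strip_suffix('|').unwrap_or(inner);
    inner
        .split('|')
        .map(|cell| {
            let cell = cell.trim();
            let body = cell.strip_prefix(':').unwrap_or(cell);
            let body = body.strip_suffix(':').unwrap_or(body);
            if body.is_empty() || !body.chars().all(|c| c == '-') {
                return None;
            }
            Some(match (cell.starts_with(':'), cell.ends_with(':')) {
                (true, true) => Align::Center,
                (true, false) => Align::Left,
                (false, true) => Align::Right,
                (false, false) => Align::Default,
            })
        })
        .collect()
}
```

Since a delimiter row may appear anywhere in the table, a parser could call this on every row and update the current alignments whenever it returns `Some`.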



### HTML
HTML blocks are passed straight through to the final document.  
No markdown in the block is processed: HTML acts as an override at block level. This is unlike [Commonmark][Commonmark HTML Block], which breaks users' expectations and generates invalid HTML.



[Commonmark HTML Block]: https://spec.commonmark.org/0.30/#example-148 "How to not do a html block"


#### Finding blocks
* Identified by the line starting with `<`.
* The tag is found.
  * Normal tags set the closer to `</{tag}>`.
  * Special tags such as comments or CDATA have their own fixed closers.
  * A counter that tracks the nesting depth is incremented.
* Each line is then searched for the closer.
* If another opener of the same type is found, the counter is incremented.
* If a closer is found, the counter is decremented.
* If the counter equals 0, the remainder of the current line is checked.
  * If there is nothing but whitespace after the closer, the block is closed out.
  * If there is content afterwards, it checks whether the tag can be part of a paragraph.
    * If it can, the block type is changed to Paragraph.
    * If not, it is closed out as an HTML block and the position is marked where analysis of the next block starts.
      * If the next block is not HTML or a paragraph, the content is ignored.
* Content is **not** processed.

The counter handles nested tags of the same type.  
Without it, the example below would close out the block on the first `</div>`, which would create invalid HTML.

```
<div>
  <div>
  </div>
</div>
```
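
The counter can be sketched like this. `find_html_block_end` is a hypothetical helper, not the bfom implementation: it only handles normal tags, matches the opener by the simple prefix `<tag`, and does not look at what follows the closer on the same line.

```rust
/// Returns the index of the line on which the block opened by `tag`
/// closes, or `None` if the nesting never returns to zero.
fn find_html_block_end(lines: &[&str], tag: &str) -> Option<usize> {
    let opener = format!("<{}", tag);
    let closer = format!("</{}>", tag);
    let mut depth = 0usize;
    for (index, line) in lines.iter().enumerate() {
        // Each opener of the same type increments the counter.
        depth += line.matches(opener.as_str()).count();
        // Each closer decrements it again.
        let closers = line.matches(closer.as_str()).count();
        depth = depth.saturating_sub(closers);
        if closers > 0 && depth == 0 {
            return Some(index);
        }
    }
    None
}
```

For the nested `<div>` example, the counter reaches 0 only on the last line (index 3), so the inner `</div>` does not end the block.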

Checking content after the closer is to catch items that ought to be part of paragraphs:

```
<a href="example.com">A link</a> that points to example.com.
```

This also catches html blocks that are strangely spaced:
```
<div>
</div><table>
</table>
```
Anything after the closing HTML tag that is not HTML or a paragraph is ignored:

```
<div>
</div> # Not a header
```


### Paragraph
Paragraphs are the catch-all for anything that does not fit in the above.

#### Summary
* Leading Whitespace is ignored.
  * More specifically, it won't get rendered in the paragraph tag.
* Content is processed for inline markdown.



## Spans

I have made several changes to spans:

* All span identifiers work in pairs.
  * Code now uses two backticks instead of one or two.
  * Autolinks now start with `<<` and end with `>>` as opposed to `<` and `>`.
* No requirement for a non-space character after an opener or before a closer.
* Several identifiers have changed.
  * Emphasis is now using `//` as its opener and closer.
* Several new identifiers: Underline, strikethrough, spoiler. 
* First matched closer closes out the span.  
* All spans have equal priority.  
* Backslash cancels the activation of opener/closer.
* If a span is not closed then the identifiers are marked as text and not identifiers.



Most spans can be nested within each other.  

```
>! spoilered  //emphasis spoilered// !< 

<span class='md-spoiler'> spoilered  <em>emphasis spoilered</em> </span>
```

Closing of existing spans is prioritised over opening new spans.
```
**strong  __not underlined**__

<strong>strong  __not underlined</strong>__
```

To nest spans of the same type inside each other you have to escape the inner identifiers.
```
//Sentence with \/\/nestled emphasis\/\///.

<em>Sentence with <em>nestled emphasis</em></em>.
```

#### Definitions  
Opener: The tag needed to start the span.  
Closer: The tag needed to end the span.  
Spans?: If the span can have further spans inside it.


### Emphasis

Identifiers have changed from ``*`` or ``_`` to ``//``.  

``//`` was chosen because text in em tags tends to slant like that.


| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   //   |   //   |  Yes   |

```
Sentence with //emphasis//.  
Sentence with <em>emphasis</em>. 


Sentence with// emphasis//.  
Sentence with<em> emphasis</em>. 
```

### Strong

Identifiers have changed from ``**`` or ``__`` to ``**``.  

No other change of note.

| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   **   |   **   |  Yes   |

```
Sentence with **strong**.  
Sentence with <strong>strong</strong>. 


Sentence with** strong**.  
Sentence with<strong> strong</strong>. 
```

### Code

Identifiers have changed from \` or \`\` to \`\`.  

Existing standards allow for both single and double backticks, however as noted in [background] this can cause discrepancies, even within the same service.

| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|  \`\`  |  \`\`  |  No    |

```
Sentence with ``code``.  
Sentence with <code>code</code>. 


Sentence with`` code``.  
Sentence with<code> code</code>. 
```

### Autolinks

Opener has changed from ``<`` to ``<<``.  
Closer has changed from ``>`` to ``>>``.  

Aside from the identifier changes it follows CommonMark closely.  
If it detects mailto: or tel:, it sets the link to the appropriate type and strips the mailto:/tel: prefix from the displayed text.

| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   <<   |   >>   |  No    |

```
Sentence with <<example.com>>.  
Sentence with <a target='_blank' rel='noopener noreferrer' href='example.com'>example.com</a>.

Sentence with <</foo>>.  
Sentence with <a target='_blank' rel='noopener noreferrer' href='/foo'>/foo</a>. 

Sentence with <<>>.  
Sentence with <a target='_blank' rel='noopener noreferrer' href=''></a>.

Sentence with << >>.  
Sentence with <a target='_blank' rel='noopener noreferrer' href=' '> </a>.  

Sentence with <<user@example.com>>. 
Sentence with <a target='_blank' rel='noopener noreferrer' href='mailto:user@example.com'>user@example.com</a>. 

Sentence with <<mailto:user@example.com>>. 
Sentence with <a target='_blank' rel='noopener noreferrer' href='mailto:user@example.com'>user@example.com</a>. 

(phone number is the Irish dummy one for the arts, same purpose as example.com)
Sentence with <<tel:+353 20 910 1234>>. 
Sentence with <a target='_blank' rel='noopener noreferrer' href='tel:+353 20 910 1234'>+353 20 910 1234</a>. 
```
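
The examples above could be reproduced with something like the following sketch. `render_autolink` is a hypothetical helper that takes the text between `<<` and `>>`. It assumes that any target containing `@` without a prefix is an email address, and it does no HTML escaping.

```rust
/// Builds the anchor element for the content of an autolink.
fn render_autolink(target: &str) -> String {
    let (href, label) = if let Some(rest) = target.strip_prefix("mailto:") {
        // Keep the prefix in the link, strip it from the displayed text.
        (target.to_string(), rest)
    } else if let Some(rest) = target.strip_prefix("tel:") {
        (target.to_string(), rest)
    } else if target.contains('@') {
        // Assumption: a bare address with `@` is treated as an email.
        (format!("mailto:{}", target), target)
    } else {
        (target.to_string(), target)
    };
    format!(
        "<a target='_blank' rel='noopener noreferrer' href='{}'>{}</a>",
        href, label
    )
}
```

Both `user@example.com` and `mailto:user@example.com` end up with the same `href` and the same displayed text, as in the examples.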

### Links/Images
Links and images have a few minor differences from CommonMark.  
Links and images can also have other identifiers within their text area, if they are escaped as appropriate.

#### Links

##### First they cannot be multiline.

CommonMark:
```
[link](   /uri
"title"  )

<p><a href="/uri" title="title">link</a></p>
```
BFoM would render it as this instead:
```
<p>
    [link](   /uri
    "title"  )
</p>
```

##### Second is that links have the same priority as other inline structures: First come, First served.

CommonMark:
```
[foo``](/bar``)
<p>[foo<code>](/bar</code>)</p>

**[foo**](/bar)
<p>**<a href="/bar">foo**</a></p>
```
BFoM would render them as this instead (target and rel removed for brevity and comparison):
```
<p><a href="/bar``">foo``</a></p>

<p><strong>[foo</strong>](/bar)</p>
```

#### Images
The above also applies to images.

The image must have its own !, not shared with any other identifier:
```
>!![foo](/bar)!<
<span class='md-spoiler'><img src='/bar' alt='foo' /></span>

>![foo](/bar)!<
<span class='md-spoiler'><a target='_blank' rel='noopener noreferrer' href='/bar'>foo</a></span>
```


#### Nested Image in Link

In the below example the closing ``]`` for the image must be escaped due to the First come, First served principle.
```
[![Alt Text\][img\]][link]

[link]: /link_url
[img]: /img_url

<a href='/link_url'><img src='/img_url' alt='Alt Text' /></a>
```

### Underline

Identifiers are ``__`` for opener and closer.  
In existing implementations the underscore ``_`` denotes either Emphasis or Strong.  
It is my opinion that ``__text__`` already looks like underlined text as my mind fills in the line.


| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   __   |   __   |  Yes   |

```
Sentence with __underline__.  
Sentence with <u>underline</u>.  


Sentence with__ underline__.  
Sentence with<u> underline</u>.  
```

### Strikethrough

Identifiers are ``~~`` for opener and closer.  
It is already common outside of CommonMark.


| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   ~~   |   ~~   |  Yes   |

```
Sentence with ~~strikethrough~~.  
Sentence with <s>strikethrough</s>.  

Sentence with~~ strikethrough~~.  
Sentence with<s> strikethrough</s>.  
```


### Spoiler
Spoiler is interesting in that there is no spoiler element in HTML, so it must be managed by scripts or CSS.  
There have also been a few implementations of it:

| Platform        | Opener       | Closer            |
|:----------------|:------------:|:-----------------:|
| Discord         | &vert;&vert; |    &vert;&vert;   |
| Reddit          |      >!      |        !<         |
| StackOverflow   |      >!      | None, Block Level |

Discord's method has conflicts with tables and is the main outlier.  
StackOverflow's version is for spoiler blocks but the opener is the same.

| Opener | Closer | Spans? |
|:------:|:------:|:------:|
|   >!   |   !<   |  Yes   |



```
Sentence with >!spoilered!<.  
Sentence with <span class='md-spoiler'>spoilered</span>. 

Sentence with>! spoilered!<.  
Sentence with<span class='md-spoiler'> spoilered</span>. 
```

### HTML

HTML is passed straight through with no processing of internal content.