= Specification =

The following is a draft specification for the *vimwiki* markup language. It is
provided as a guideline for consistent parsing and rendering of vimwiki
content.

Similar to the approach taken with [[https://www.commonmark.org/|commonmark]],
this document attempts to specify *vimwiki* syntax unambiguously. It contains
examples of the language and describes its specifics in a way that makes it
easier for tooling to define parsers and generators.

== Version ==

Current: *0.1.0*

This specification is versioned in order to provide a stable point of reference
for external tools. [[https://semver.org/|Semantic versioning]] is used to keep
track of the current state of the specification. *MAJOR.MINOR.PATCH* is the
format.

While the specification remains at a zero major version, its contents can
change without any requirement to maintain compatibility.
Once a non-zero major version is released, new vimwiki language elements may
only be added in minor releases and any breaking change must be reflected by a
major release. Tweaks in language that do not add, alter, or remove elements of
vimwiki (such as typos, clarifications, etc.) may be made with patch releases.

The specification will remain in development mode (zero major version) until
the language has become relatively stable.

== Language ==

The following describes the individual elements, hereafter referred to as
components, of the vimwiki language. It covers each component's purpose
and gives a clear description of its syntax.

For details on parsing precedence, see the [[#Specification#Parser Details|companion section]].

=== Primitives ===

In order to define the vimwiki language, we first need to present several
definitions for the primitive building blocks used to build higher-level
components. Relevant definitions are borrowed from
[[https://spec.commonmark.org/0.29/#characters-and-lines|commonmark characters and lines]].

A *character* in vimwiki is a valid Unicode code point, encoded as UTF-8.

A *line* is a sequence of zero or more [[#Specification#Language#Primitives#character|characters]] other than a
newline (`U+000A` aka `\n`) or carriage return (`U+000D` aka `\r`), followed by
a [[#Specification#Language#Primitives#line ending|line ending]] or by the end of a file.

A *line ending* is a newline (`U+000A` aka `\n`), a carriage return (`U+000D`
aka `\r`) not followed by a newline, or a carriage return and a following
newline (`\r\n`).

A line with no characters, or a line containing only spaces (`U+0020`) or tabs
(`U+0009`), is called a *blank line*.

A *whitespace character* is a space (`U+0020`) or tab (`U+0009`).
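
The primitives above can be sketched in a few lines of Python. This is only an illustration, not part of the specification; the names `split_lines` and `is_blank_line` are hypothetical.

```python
import re

# A line ending is \r\n, a lone \r, or \n. \r\n is tried first so that it is
# treated as one line ending rather than two.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text: str) -> list[str]:
    """Split text into lines, dropping the line endings themselves."""
    lines = LINE_ENDING.split(text)
    # A trailing line ending does not start an additional (empty) line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line: str) -> bool:
    """True when the line is empty or holds only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```

Later sketches in this document assume lines have already been split this way, so a line never contains its own line ending.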

=== Block Components ===

The vimwiki language has a variety of syntax that represents components within
a page. In this section, we discuss block-level components, which are
standalone pieces of vimwiki syntax that comprise one or more entire lines
within a file.

==== Blockquote ====

A blockquote has two different forms available: indented text or text prefixed
with a right angle bracket or chevron `>`. Its purpose is to convey an extended
quotation.

*Form 1*:

{{{vimwiki
    This is a blockquote
    that exists on more than one line
}}}

*Form 2*:

{{{vimwiki
> This is a blockquote
> that exists on more than one line
}}}

===== Syntax =====

*Form 1*:

A blockquote is made of *one* or more [[#indented blockquote line|indented blockquote lines]].

An *indented blockquote line* is made of the following:
1. Four or more [[#whitespace character|whitespace characters]]
2. All characters up until a [[#line ending|line ending]]
3. A [[#line ending|line ending]] or end of input

*Form 2*:

A blockquote is made of *one* or more [[#chevron blockquote line|chevron blockquote lines]], which may be
separated by zero or more [[#blank line|blank lines]].

A *chevron blockquote line* is made of the following:
1. Starts at the beginning of a line
2. A prefix right angle bracket or chevron (`U+003E` aka `>`)
3. A [[#whitespace character|whitespace character]] such as ' ' or '\t'
4. All characters up until a [[#line ending|line ending]]
5. A [[#line ending|line ending]] or end of input

*Extra Notes*: Each line of a blockquote, minus the indentation or chevron, is
trimmed to remove all leading and trailing [[#whitespace character|whitespace characters]].
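
As a rough sketch of both forms, the following Python recognizes a single blockquote line and applies the trimming rule. The function name `blockquote_text` is hypothetical, and the line is assumed to arrive without its line ending.

```python
import re

# Form 1: four or more spaces or tabs, then the quoted text.
INDENTED = re.compile(r"^[ \t]{4,}(.*)$")
# Form 2: a chevron at the start of the line, one space or tab, then the text.
CHEVRON = re.compile(r"^>[ \t](.*)$")


def blockquote_text(line: str):
    """Return the trimmed quote text of a blockquote line, or None."""
    match = INDENTED.match(line) or CHEVRON.match(line)
    if match is None:
        return None
    return match.group(1).strip(" \t")
```

This sketch does not decide how a whitespace-only line inside an indented blockquote should be treated; it simply yields an empty string for it.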

==== Definition List ====

A definition list is composed of a series of terms and associated definitions.
It mirrors the functionality available in an [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dl|HTML Description List]].

{{{vimwiki
Term 1:: Some definition
Term 2:: First def
:: Second def with [[link]]
Term3::
:: Some *bold* definition
}}}

===== Syntax =====

A *definition list* is composed of one or more [[#term and definitions|term and definitions]].

A *term and definitions* is composed of one [[#term line|term line]] and
zero or more [[#definition line|definition lines]].

A *term line* is represented by the following:
1. Starts at the beginning of a line
2. One or more [[#inline components|inline components]] before the sequence `::`
3. The sequence `::`
4. Optionally, one or more [[#inline components|inline components]] before the [[#line ending|line ending]],
   which form the first definition
5. A [[#line ending|line ending]] or end of input

A *definition line* is represented by the following:
1. Starts at the beginning of a line
2. The sequence `::`
3. One or more [[#inline components|inline components]] before [[#line ending|line ending]]
4. A [[#line ending|line ending]] or end of input

*Extra Notes*: Each term and definition is trimmed to remove all leading and
trailing [[#whitespace character|whitespace characters]].
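
The term and definition lines can be told apart with a small classifier like the sketch below. The name `parse_definition_list_line` and the tuple shapes it returns are hypothetical. It treats terms and definitions as plain text and splits on the first `::`, so it does not account for inline components that themselves contain `::`.

```python
def parse_definition_list_line(line: str):
    """Classify a line as a term line, a definition line, or neither.

    Returns ("term", term, first_definition_or_None),
    ("definition", text), or None.
    """
    if line.startswith("::"):
        text = line[2:].strip(" \t")
        # A definition line needs at least some content after the `::`.
        return ("definition", text) if text else None
    term, separator, rest = line.partition("::")
    term = term.strip(" \t")
    if not separator or not term:
        return None
    rest = rest.strip(" \t")
    return ("term", term, rest or None)
```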

==== Divider ====

A divider is composed of a sequence of dashes (`U+002D`). It mirrors the
functionality of the [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/hr|HTML Horizontal Rule]].

{{{vimwiki
----
}}}

===== Syntax =====

A *divider* is represented by the following:
1. Starts at the beginning of a line
2. Four or more dashes (`U+002D`)
3. A [[#line ending|line ending]] or end of input
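
Read literally, the three steps above amount to a full-line match on four or more dashes. A minimal Python sketch, with the hypothetical name `is_divider` and a line that has already had its line ending removed:

```python
import re


def is_divider(line: str) -> bool:
    """True when the whole line consists of four or more dashes."""
    return re.fullmatch(r"-{4,}", line) is not None
```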

==== Header ====

A header is composed of some content surrounded by runs of equals signs
(`U+003D` aka `=`) of equal length. It mirrors the functionality of the [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Heading_Elements|HTML Heading]].

{{{vimwiki
= Some Header =
== Some Sub Header ==
= *Bold* Header with [[link]] =
}}}

===== Syntax =====

A *header* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]] (any
   indentation implies that the header is centered)
3. One or more equals sign (`U+003D`) characters
4. One or more [[#inline components|inline components]]
5. The same number of equals sign characters as in step 3
6. A [[#line ending|line ending]] or end of input

*Extra Notes*: Each header's content, minus the equals signs, is trimmed to
remove all leading and trailing [[#whitespace character|whitespace characters]].
For example, `= header =` is equal to `=header=`.
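
One possible reading of the header rules is sketched below. The function name `parse_header` and its `(level, text, centered)` return shape are hypothetical. The sketch takes the full leading and trailing runs of `=` as the delimiters, which is an assumption the rules above do not spell out.

```python
def parse_header(line: str):
    """Return (level, text, centered) for a header line, or None."""
    stripped = line.lstrip(" \t")
    centered = len(stripped) != len(line)  # any indentation means centered
    level = len(stripped) - len(stripped.lstrip("="))
    closing = len(stripped) - len(stripped.rstrip("="))
    # Content between the two runs, trimmed as described in the extra notes.
    text = stripped[level:len(stripped) - closing].strip(" \t")
    if level == 0 or level != closing or not text:
        return None
    return (level, text, centered)
```

With this reading, `= header =` and `=header=` produce the same result, matching the note above.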

==== List ====

A list is composed of a series of list items, each being comprised
of [[#inline components|inline components]] and [[#list|sub lists]]. It mirrors the
functionality available in an [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ul|HTML Unordered List]]
and [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ol|HTML Ordered List]].

{{{vimwiki
- List item 1 has *bold* and [[links]]
- List item 2 has content
    1. Ordered sublist
    2. within list item 2
}}}

===== Syntax =====

A *list* is composed of one or more [[#list item|list items]].

A *list item* is composed of a [[#starting list item line|starting list item line]]
and zero or more [[#companion list item line|companion list item lines]].

A *starting list item line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]] that signify
   the level of indentation to use when determining whether later content is
   still associated with this list item, whether a new list item is the
   beginning of a sublist, whether a new list item is a sibling, or whether a
   new list item is the sibling of a parent
3. One of the following prefixes that determines the type of list item:
   * Hyphen (`U+002D` aka `-`) is for an unordered list
   * Asterisk (`U+002A` aka `*`) is for an unordered list
   * Pound (`U+0023` aka `#`) is for an ordered list
   * One or more digits followed by a period (`U+002E` aka `.`) or
     a right parenthesis (`U+0029` aka `)`) is for an ordered list
   * One or more lowercase alphabetic (`a-z`) followed by a period
     (`U+002E` aka `.`) or a right parenthesis (`U+0029` aka `)`) is for an
     ordered list
   * One or more uppercase alphabetic (`A-Z`) followed by a period
     (`U+002E` aka `.`) or a right parenthesis (`U+0029` aka `)`) is for an
     ordered list
   * One or more lowercase Roman numerals (any of `ivxlcdm`) followed by a
     period (`U+002E` aka `.`) or a right parenthesis (`U+0029` aka `)`) is
     for an ordered list
   * One or more uppercase Roman numerals (any of `IVXLCDM`) followed by a
     period (`U+002E` aka `.`) or a right parenthesis (`U+0029` aka `)`) is
     for an ordered list
4. A [[#whitespace character|whitespace character]]
5. An optional [[#todo attribute|todo attribute]] and additional [[#whitespace character|whitespace character]]
6. Zero or more [[#inline components|inline components]]
7. A [[#line ending|line ending]] or end of input
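
The prefix rules for a starting list item line can be approximated with a single regular expression, as in this sketch. The name `parse_list_item_start` and its return shape are hypothetical. Indentation is counted in characters (a tab counts as one), and alphabetic and Roman prefixes are not distinguished at this stage.

```python
import re

LIST_ITEM = re.compile(
    r"^([ \t]*)"                                   # indentation
    r"(-|\*|#|[0-9]+[.)]|[a-z]+[.)]|[A-Z]+[.)])"   # list item prefix
    r"[ \t](.*)$"                                  # required whitespace, content
)


def parse_list_item_start(line: str):
    """Return (indent, prefix, ordered, content) or None."""
    match = LIST_ITEM.match(line)
    if match is None:
        return None
    indent, prefix, content = match.groups()
    ordered = prefix not in ("-", "*")
    return (len(indent), prefix, ordered, content)
```

Because a whitespace character must follow the prefix, a line such as `*bold* text` is not mistaken for an unordered list item.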

A *companion list item line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]] that total
   at least as many as the [[#starting list item line|starting list item line]] indentation
3. One of the following:
   * The start of a new [[#list|list]] (to be treated as a sublist of the
     current list item)
   * A series of one or more [[#inline components|inline components]]
     (to be added to the content of the current list item) followed by
     a [[#line ending|line ending]] or end of input
   * A [[#blank line|blank line]] if there is guaranteed to still be some
     list item content in later lines

A *todo attribute* is composed of surrounding square brackets in the form
of a left square bracket (`U+005B` aka `[`) and right square bracket
(`U+005D` aka `]`). Between the square brackets is a single character that
denotes the todo status and is one of the following:
* A space (`U+0020` aka ' ') meaning 0% or incomplete
* A period (`U+002E` aka `.`) meaning 1-33% progress
* A lowercase o (`U+006F` aka `o`) meaning 34-66% progress
* An uppercase O (`U+004F` aka `O`) meaning 67-99% progress
* An uppercase X (`U+0058` aka `X`) meaning 100% or completed
* A hyphen (`U+002D` aka `-`) meaning rejected
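
The todo attribute maps naturally onto a lookup table. In the sketch below, the status labels and the name `parse_todo_attribute` are hypothetical; only the six status characters come from the list above.

```python
# Status character -> illustrative label (labels are not part of the spec).
TODO_STATUS = {
    " ": "incomplete",
    ".": "partially complete (1-33%)",
    "o": "partially complete (34-66%)",
    "O": "partially complete (67-99%)",
    "X": "completed",
    "-": "rejected",
}


def parse_todo_attribute(text: str):
    """Split "[X] rest" into (status, rest); status is None when absent."""
    if len(text) >= 3 and text[0] == "[" and text[2] == "]" and text[1] in TODO_STATUS:
        return TODO_STATUS[text[1]], text[3:].lstrip(" \t")
    return None, text
```

A list item whose content begins with a link, such as `[[link]]`, is left untouched because `[` is not a valid status character.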

*Extra Notes*: Because of the ambiguity of alphabetic list items and Roman
numerals, which are composed of specific alphabetic characters in various
arrangements, a list needs to be evaluated across all of its items to determine
if a list item's type is Roman or alphabetic. If all list items begin with
valid Roman numerals, then the types are Roman numerals. If any list item is
not a valid Roman numeral, then all list item types for those prefixes are
considered to be alphabetic.
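
The disambiguation rule can be sketched as a check across every label in the list. The names `is_roman` and `classify_labels` are hypothetical, and the strict pattern used here (well-formed numerals from 1 to 3999) is an assumption about what "valid Roman numeral" means.

```python
import re

# Strict pattern for well-formed Roman numerals, e.g. "iv" but not "iiii".
ROMAN = re.compile(r"^m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})$")


def is_roman(label: str) -> bool:
    """True when the label (without "." or ")") is a valid Roman numeral."""
    lowered = label.lower()
    # The pattern also matches the empty string, so rule that out explicitly.
    return lowered != "" and ROMAN.match(lowered) is not None


def classify_labels(labels: list[str]) -> str:
    """Return "roman" only when every label in the list is a Roman numeral."""
    return "roman" if all(is_roman(label) for label in labels) else "alphabetic"
```

For example, the labels `a`, `b`, `c` are alphabetic even though `c` on its own is a valid Roman numeral.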

==== Math Block ====

A math block is composed of a series of lines representing a mathematical
formula. It is rendered in HTML using the [[https://www.mathjax.org/|MathJax engine]].

{{{vimwiki
{{$%align%
\sum_i a_i^2 &= 1 + 1 \\
&= 2.
}}$
}}}

===== Syntax =====

A *math block* is composed of a [[#beginning math block line|beginning math block line]],
one or more [[#math block line|math block lines]], and an [[#ending math block line|ending math block line]].

A *beginning math block line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. The sequence `{{$`
4. An optional [[#math environment|math environment]]
5. Zero or more [[#whitespace character|whitespace characters]]
6. A [[#line ending|line ending]] or end of input

A *math environment* is represented by the following:
1. The percent sign (`U+0025` aka `%`)
2. One or more characters that are not the percent sign or [[#line ending|line ending]]
3. The percent sign (`U+0025` aka `%`)

A *math block line* is a line found after a [[#beginning math block line|beginning math block line]]
and before an [[#ending math block line|ending math block line]] and is
comprised of zero or more characters followed by a [[#line ending|line ending]].

An *ending math block line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. The sequence `}}$`
4. Zero or more [[#whitespace character|whitespace characters]]
5. A [[#line ending|line ending]] or end of input
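
A minimal sketch of recognizing a complete math block from a list of lines follows. The name `parse_math_block` is hypothetical, and the caller is assumed to have already cut the input at the first ending math block line.

```python
import re

BEGIN_MATH = re.compile(r"^[ \t]*\{\{\$(?:%([^%]+)%)?[ \t]*$")
END_MATH = re.compile(r"^[ \t]*\}\}\$[ \t]*$")


def parse_math_block(lines: list[str]):
    """Return (environment, formula_lines) for a math block, or None."""
    # A beginning line, at least one math block line, and an ending line.
    if len(lines) < 3:
        return None
    begin = BEGIN_MATH.match(lines[0])
    if begin is None or END_MATH.match(lines[-1]) is None:
        return None
    # group(1) is None when no math environment was given.
    return begin.group(1), lines[1:-1]
```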

==== Paragraph ====

A paragraph is composed of a series of lines representing some content. It
mirrors [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p|HTML Paragraph]].

{{{vimwiki
Some paragraph containing
multiple lines including
*bold* and [[links]].
}}}

===== Syntax =====

A *paragraph* is composed of one or more [[#paragraph line|paragraph lines]].

A *paragraph line* is represented by the following:
1. Starts at the beginning of a line
2. Has no [[#whitespace character|whitespace]] indentation
3. Is not any of the following:
   * [[#header|header]]
   * [[#definition list|definition list]]
   * [[#list|list]]
   * [[#table|table]]
   * [[#preformatted text|preformatted text]]
   * [[#math block|math block]]
   * [[#blank line|blank line]]
   * [[#blockquote|blockquote]]
   * [[#divider|divider]]
   * [[#placeholder|placeholder]]
4. One or more [[#inline components|inline components]]
5. A [[#line ending|line ending]] or end of input

TODO ... should we combine [[#non-blank line|non-blank line]] with paragraph
by having a paragraph trim all leading [[#whitespace character|whitespace]]?

==== Placeholder ====

A placeholder is composed of an identifier and some information. Its purpose
is to provide metadata for use in rendering vimwiki to HTML and populating
portions of the HTML template used with a vimwiki page.

{{{vimwiki
%title Some title
%nohtml
%template my_template
%date 2020-12-23
}}}

===== Syntax =====

A *placeholder* is represented by one of the following:
* A [[#title placeholder|title placeholder]]
* A [[#nohtml placeholder|nohtml placeholder]]
* A [[#template placeholder|template placeholder]]
* A [[#date placeholder|date placeholder]]

A *title placeholder* is represented by the following:
1. Starts at the beginning of a line
2. A percent sign (`U+0025` aka `%`)
3. The sequence `title`
4. One or more [[#whitespace character|whitespace characters]]
5. Any sequence of [[#whitespace character|whitespace characters]] and
   non-whitespace characters leading up to a [[#line ending|line ending]]
6. A [[#line ending|line ending]] or end of input

A *nohtml placeholder* is represented by the following:
1. Starts at the beginning of a line
2. A percent sign (`U+0025` aka `%`)
3. The sequence `nohtml`
4. A [[#line ending|line ending]] or end of input

A *template placeholder* is represented by the following:
1. Starts at the beginning of a line
2. A percent sign (`U+0025` aka `%`)
3. The sequence `template`
4. One or more [[#whitespace character|whitespace characters]]
5. Any sequence of [[#whitespace character|whitespace characters]] and
   non-whitespace characters leading up to a [[#line ending|line ending]]
6. A [[#line ending|line ending]] or end of input

A *date placeholder* is represented by the following:
1. Starts at the beginning of a line
2. A percent sign (`U+0025` aka `%`)
3. The sequence `date`
4. One or more [[#whitespace character|whitespace characters]]
5. A date string in the format `YYYY-MM-DD` where `YYYY` symbolizes a
   four-digit year (e.g. `1990`), `MM` symbolizes a two-digit month (e.g.
   `04`), and `DD` symbolizes a two-digit day (e.g. `23`)
6. A [[#line ending|line ending]] or end of input
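
The four placeholder forms can be handled by one small function, sketched below. The name `parse_placeholder` is hypothetical. Rejecting dates that are well-formed but not real calendar dates (such as month 13) is an assumption that goes slightly beyond the format rule above.

```python
import re
from datetime import date

VALUE_PLACEHOLDER = re.compile(r"^%(title|template|date)[ \t]+(.*)$")


def parse_placeholder(line: str):
    """Return (name, value) for a placeholder line, or None."""
    if line == "%nohtml":
        return ("nohtml", None)
    match = VALUE_PLACEHOLDER.match(line)
    if match is None:
        return None
    name, value = match.groups()
    if name == "date":
        # Enforce the YYYY-MM-DD shape before handing it to the date parser.
        if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value) is None:
            return None
        try:
            return (name, date.fromisoformat(value))
        except ValueError:
            return None
    return (name, value)
```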

==== Preformatted Text ====

A preformatted text block is composed of a series of lines representing some
content, usually related to a programming language. It mirrors
[[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre|HTML Preformatted Text]].

{{{vimwiki
{{{rust
fn my_func() -> u32 {
    1 + 2
}
\}}}
}}}

TODO ... wrapping a preformatted text block with another preformatted text
isn't possible right now due to the use of `}}}` matching. We would
need some sort of escape like `\}}}` or `\{{{` that could be used to
avoid matching legit syntax but still provide the literal text
upon rendering.

TODO ... pandoc supports more than one metadata key/value pair via spaces
such as `{{{class="something" style="else"`. It wasn't clear whether vimwiki
itself supports multiple metadata. *vimwiki-server* supports this
through semicolons because it was easier, as in `{{{class="something";style="else"`,
but isn't married to that approach.

===== Syntax =====

A *preformatted text* is composed of a [[#beginning preformatted text line|beginning preformatted text line]],
one or more [[#preformatted text line|preformatted text lines]], and an [[#ending preformatted text line|ending preformatted text line]].

A *beginning preformatted text line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. The sequence `{{{`
4. An optional [[#preformatted language identifier|preformatted language identifier]]
5. An optional [[#preformatted metadata list|preformatted metadata list]]
6. Zero or more [[#whitespace character|whitespace characters]]
7. A [[#line ending|line ending]] or end of input

A *preformatted language identifier* is represented by the following:
1. One or more characters that are not an equals sign (`U+003D` aka `=`)
2. An optional semicolon (`U+003B` aka `;`)

A *preformatted metadata list* is composed of one or more
[[#preformatted metadata list item|preformatted metadata list items]] separated by semicolons (`U+003B` aka `;`).

A *preformatted metadata list item* is represented by the following:
1. One or more characters leading up to an equals sign (`U+003D` aka `=`),
   not including a [[#line ending|line ending]]
2. An equals sign (`U+003D` aka `=`)
3. A quotation mark (`U+0022` aka `"`)
4. One or more characters leading up to a quotation mark (`U+0022` aka `"`)
5. A quotation mark (`U+0022` aka `"`)

A *preformatted text line* is a line found after a [[#beginning preformatted text line|beginning preformatted text line]]
and before an [[#ending preformatted text line|ending preformatted text line]] and is
comprised of zero or more characters followed by a [[#line ending|line ending]].

An *ending preformatted text line* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. The sequence `}}}`
4. Zero or more [[#whitespace character|whitespace characters]]
5. A [[#line ending|line ending]] or end of input
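
The beginning preformatted text line is the only line in the block with internal structure, so a sketch of parsing it may help. The name `parse_preformatted_start` is hypothetical. The sketch follows the semicolon-separated reading described above and splits on every semicolon, so it assumes metadata values do not themselves contain one.

```python
import re

BEGIN_PRE = re.compile(r"^[ \t]*\{\{\{(.*?)[ \t]*$")
METADATA_ITEM = re.compile(r'^([^=]+)="([^"]+)"$')


def parse_preformatted_start(line: str):
    """Return (language, metadata) for a beginning line, or None."""
    match = BEGIN_PRE.match(line)
    if match is None:
        return None
    language, metadata = None, {}
    for part in filter(None, match.group(1).split(";")):
        item = METADATA_ITEM.match(part)
        if item:
            metadata[item.group(1)] = item.group(2)
        elif language is None and not metadata and "=" not in part:
            # The language identifier may only appear first.
            language = part
        else:
            return None
    return language, metadata
```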

==== Table ====

A table is composed of a series of rows containing various other components. It
mirrors [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/table|HTML Table]].

{{{vimwiki
| Year | Temperature (low) | Temperature (high)       | Temperature (avg) |
|------|-------------------|--------------------------|-------------------|
| 1990 | *50* degrees      | 90 according to [[link]] | 72                |
| \/   | 45 degrees        | >                        | 80                |
| \/   | \/                | >                        | 60                |
| 2000 | >                 | >                        | >                 |
}}}

===== Syntax =====

A *table* is composed of one or more [[#row|rows]] with the indentation of
the first row indicating whether the table is centered (is indented) or not.

A *row* is represented by one of the following:
* A [[#divider row|divider row]]
* A [[#content row|content row]]

A *divider row* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. A sequence of pairs comprised of a [[#cell boundary|cell boundary]] and
   one or more hyphens (`U+002D` aka `-`)
4. A final [[#cell boundary|cell boundary]]
5. A [[#line ending|line ending]] or end of input

A *content row* is represented by the following:
1. Starts at the beginning of a line
2. Zero or more [[#whitespace character|whitespace characters]]
3. A sequence of pairs comprised of a [[#cell boundary|cell boundary]] and a [[#cell|cell]]
4. A final [[#cell boundary|cell boundary]]
5. A [[#line ending|line ending]] or end of input

A *cell* is represented by one of the following:
* A [[#span above cell|span above cell]]
* A [[#span left cell|span left cell]]
* A [[#content cell|content cell]]

A *span above cell* is represented by the following:
1. Zero or more [[#whitespace character|whitespace characters]]
2. Sequence `\/`
3. Zero or more [[#whitespace character|whitespace characters]]

A *span left cell* is represented by the following:
1. Zero or more [[#whitespace character|whitespace characters]]
2. Sequence `>`
3. Zero or more [[#whitespace character|whitespace characters]]

A *content cell* is represented by the following:
1. Zero or more [[#whitespace character|whitespace characters]]
2. One or more [[#inline components|inline components]] not comprised of `|`
3. Zero or more [[#whitespace character|whitespace characters]]

A *cell boundary* is represented by the pipe character (`U+007C` aka `|`).
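
A row can be split on cell boundaries and each cell classified, as in this sketch. The name `parse_row` and the tuple labels are hypothetical. The sketch treats cell content as plain text and does not reject empty content cells.

```python
def parse_row(line: str):
    """Return ("divider", []) or ("content", cells), or None if not a row."""
    stripped = line.lstrip(" \t")
    if len(stripped) < 2 or not stripped.startswith("|") or not stripped.endswith("|"):
        return None
    raw_cells = stripped[1:-1].split("|")
    # A divider row holds nothing but hyphens between its cell boundaries.
    if all(cell and set(cell) == {"-"} for cell in raw_cells):
        return ("divider", [])
    cells = []
    for raw in raw_cells:
        text = raw.strip(" \t")
        if text == r"\/":
            cells.append(("span_above", None))
        elif text == ">":
            cells.append(("span_left", None))
        else:
            cells.append(("content", text))
    return ("content", cells)
```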

==== Non-blank Line ====

A non-blank line is a single line that is not a paragraph. It is used to
capture text not represented by any other component.

{{{vimwiki
  Some other text containing *bold* and [[links]].
}}}

===== Syntax =====

A *non-blank line* is represented by the following:
1. Between one and three [[#whitespace character|whitespace characters]]
2. One or more [[#inline components|inline components]]
3. A [[#line ending|line ending]] or end of input

TODO ... should we combine [[#non-blank line|non-blank line]] with paragraph
by having a paragraph trim all leading [[#whitespace character|whitespace]]?

=== Inline Components ===

The vimwiki language also has a variety of syntax that can be used within a
line on a page. These are referred to as *inline components* and can be found
within a variety of [[#block components|block components]] as well as nested
within other inline components.

==== Math Inline ====

A math inline component is composed of a single-line formula.
Like its big brother, the [[#math block|math block]], it is rendered in HTML
using the [[https://www.mathjax.org/|MathJax engine]].

{{{vimwiki
$ \sum_i a_i^2 = 1 $
}}}

TODO ... is there a way to escape the `$` used to mark the beginning and end
of an inline math component? Is `$` even a concern within an
inline formula? If it is, maybe an escape sequence of `\$` would
be applicable.

===== Syntax =====

An *inline math* component is represented by the following:
1. A dollar sign (`U+0024` aka `$`)
2. One or more characters that are not a dollar sign or [[#line ending|line ending]]
3. A dollar sign (`U+0024` aka `$`)

*Extra Notes*: The formula within the inline component is trimmed to remove all
leading and trailing [[#whitespace character|whitespace characters]].
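
The inline math rule translates almost directly into a regular expression. In this sketch the name `find_inline_math` is hypothetical, and no escape sequence for `$` is handled because none is defined above.

```python
import re

# A dollar sign, one or more characters that are neither a dollar sign nor a
# line ending, and a closing dollar sign.
INLINE_MATH = re.compile(r"\$([^$\r\n]+)\$")


def find_inline_math(text: str) -> list[str]:
    """Return the trimmed formulas of all inline math components in text."""
    return [formula.strip(" \t") for formula in INLINE_MATH.findall(text)]
```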

==== Tags ====

A tags component is composed of a series of individual tag elements. It is used
both to mark various places within a page and to act as an [[#anchor|anchor]].

{{{vimwiki
:tag-1:tag-2:
}}}

===== Syntax =====

A *tags* component is represented by the following:
1. A [[#tag separator|tag separator]]
2. One or more [[#tag|tags]], each separated by a [[#tag separator|tag separator]]
3. A [[#tag separator|tag separator]]

A *tag* is represented by one or more characters that are not a colon,
[[#whitespace character|whitespace]], or [[#line ending|line ending]].

A *tag separator* is represented by a colon (`U+003A` aka `:`).
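
A tags component can be located with a short pattern, sketched below under the hypothetical name `find_tags`. The sketch does not inspect the surrounding text, so something like a timestamp (`12:30:45`) would also produce a match; a real parser would likely need extra context rules.

```python
import re

# A leading colon followed by one or more "tag:" groups.
TAGS = re.compile(r":(?:[^:\s]+:)+")


def find_tags(text: str) -> list[list[str]]:
    """Return one list of tag names per tags component found in text."""
    return [match.group(0).strip(":").split(":") for match in TAGS.finditer(text)]
```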

TODO ... should tags with whitespace be allowed?

==== Link ====

A link is a crucial inline component of vimwiki and is able to connect pages
with each other as well as external wikis and resources.

{{{vimwiki
[[other page|link to another page]]
[[wiki1:page|link to page in another wiki]]
[[#some#anchor|link to another location in same page]]
[[diary:2020-12-23|link to diary entry]]
{{https://example.com/img.jpg|Transclusion to pull in image}}
[[https://example.com|{{https://example.com/img.jpg}}]]
}}}

===== Syntax =====

A *link* is represented by one of the following:
* A [[#wiki link|wiki link]]
* An [[#interwiki link|interwiki link]]
* A [[#diary link|diary link]]
* An [[#external file link|external file link]]
* A [[#raw link|raw link]]
* A [[#transclusion link|transclusion link]]

A *wiki link* is represented by the following:
1. A [[#link start seq|link start seq]]
2. At least one of the following (in order):
   1. An optional [[#link path|link path]]
   2. An optional [[#link anchor|link anchor]]
3. An optional [[#link inner separator|link inner separator]] and [[#link description|link description]]
4. A [[#link end seq|link end seq]]
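
A rough sketch of the wiki link form follows, using the hypothetical name `parse_wiki_link`. It relies on the link path, link anchor, and link description rules from this section and makes two simplifying assumptions: a single `]` is not allowed inside the path or anchor, and interwiki or diary prefixes are not recognized, so they end up as part of the path.

```python
import re

WIKI_LINK = re.compile(
    r"^\[\["
    r"([^#|\]]*)"           # optional link path
    r"((?:#[^#|\]]+)*)"     # optional link anchor: "#" + component, repeated
    r"(?:\|(.+?))?"         # optional separator and description
    r"\]\]$"
)


def parse_wiki_link(text: str):
    """Return (path, anchor_components, description) or None."""
    match = WIKI_LINK.match(text)
    if match is None:
        return None
    path, anchor, description = match.groups()
    if not path and not anchor:
        return None  # at least one of path or anchor is required
    components = anchor.split("#")[1:] if anchor else []
    return (path or None, components, description)
```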

An *interwiki link* is represented by one of the following:
* An [[#indexed interwiki link|indexed interwiki link]]
* A [[#named interwiki link|named interwiki link]]

An *indexed interwiki link* is represented by the following:
1. A [[#link start seq|link start seq]]
2. The sequence `wiki`
3. One or more digits (`0-9`) forming an index of `0` or higher
4. A colon (`U+003A` aka `:`)
5. A [[#link path|link path]]
6. An optional [[#link anchor|link anchor]]
7. An optional [[#link inner separator|link inner separator]] and [[#link description|link description]]
8. A [[#link end seq|link end seq]]

A *named interwiki link* is represented by the following:
1. A [[#link start seq|link start seq]]
2. The sequence `wn.`
3. One or more characters that are not a colon (`U+003A` aka `:`) or [[#line ending|line ending]]
4. A colon (`U+003A` aka `:`)
5. A [[#link path|link path]]
6. An optional [[#link anchor|link anchor]]
7. An optional [[#link inner separator|link inner separator]] and [[#link description|link description]]
8. A [[#link end seq|link end seq]]

An *external file link* is represented by the following:
1. A [[#link start seq|link start seq]]
2. A [[#link uri|link uri]] whose scheme is `local` or `file` or has no scheme
   and starts with `//` for an absolute file path
3. A [[#link path|link path]]
4. An optional [[#link inner separator|link inner separator]] and [[#link description|link description]]
5. A [[#link end seq|link end seq]]

A *raw link* is represented by a [[#link uri|link uri]] not found within
another link type.

A *transclusion link* is represented by the following:
1. The sequence `{{`
2. A [[#link uri|link uri]]
3. An optional [[#link inner separator|link inner separator]] and [[#link description|link description]]
4. An optional sequence of [[#link key value pair|link key value pairs]],
   each separated by [[#link inner separator|link inner separator]]
5. The sequence `}}`

A *link key value pair* is represented by the following:
1. One or more characters that are not a pipe symbol (`U+007C`
   aka `|`), equals sign (`U+003D` aka `=`), `}}`, or [[#line ending|line ending]]
2. An equals sign (`U+003D` aka `=`)
3. A quotation mark (`U+0022` aka `"`)
4. One or more characters that are not a pipe symbol (`U+007C`
   aka `|`), quotation mark (`U+0022` aka `"`), `}}`, or [[#line ending|line ending]]
5. A quotation mark (`U+0022` aka `"`)

A *link path* is represented by the following:
1. Does not start with a [[#link anchor prefix|link anchor prefix]]
2. One or more characters that are not a [[#link anchor prefix|link anchor prefix]],
   [[#link inner separator|link inner separator]], [[#link end seq|link end seq]], or [[#line ending|line ending]]

A *link description* is represented by one of the following:
* A [[#link uri|link uri]]
* One or more characters that are not a [[#link end seq|link end seq]] or [[#line ending|line ending]]

A *link anchor* is represented by a series of pairs, each comprised of
a [[#link anchor prefix|link anchor prefix]] and a [[#link anchor component|link anchor component]].

A *link anchor component* is represented by one or more characters that are
not a [[#link anchor prefix|link anchor prefix]], [[#link inner separator|link inner separator]],
[[#link end seq|link end seq]], or [[#line ending|line ending]].

A *link anchor prefix* is represented by a pound symbol (`U+0023` aka `#`).

A *link inner separator* is represented by a pipe symbol (`U+007C` aka `|`).

A *link start seq* is represented by `[[`.

A *link end seq* is represented by `]]`.

A *link uri* is represented by the following:
1. Starts with `www.`, `//`, or a [[#link uri scheme|link uri scheme]]
   a. If starting with `www.`, we add a virtual prefix of `https://` going forward
   b. If starting with `//`, we add a virtual prefix of `file:/` going forward
2. One or more characters that are not [[#whitespace character|whitespace characters]]
   or [[#line ending|line ending]]

A *link uri scheme* is represented by a series of alphanumeric characters
(`a-z`, `A-Z`, `0-9`) as well as plus (`U+002B` aka `+`), period (`U+002E`
aka `.`), and hyphen (`U+002D` aka `-`). The scheme is terminated by a
colon (`U+003A` aka `:`).

*Extra Notes*: Additional validation should be done to ensure that a
[[#link uri|link uri]] properly adheres to [[https://tools.ietf.org/html/rfc3986|RFC3986]].
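
The virtual prefixes for a link uri can be illustrated with a short normalization function. The name `normalize_link_uri` is hypothetical, and the sketch performs only the checks listed above; it does not carry out the RFC 3986 validation mentioned in the extra notes.

```python
import re

# Alphanumerics plus "+", "." and "-", terminated by a colon.
URI_SCHEME = re.compile(r"^[A-Za-z0-9+.\-]+:")


def normalize_link_uri(text: str):
    """Return the uri with any virtual prefix applied, or None."""
    if not text or any(ch in " \t\r\n" for ch in text):
        return None
    if text.startswith("www."):
        return "https://" + text
    if text.startswith("//"):
        return "file:/" + text
    if URI_SCHEME.match(text):
        return text
    return None
```

Note that prefixing `//tmp/file.txt` with `file:/` yields `file:///tmp/file.txt`, the usual form for an absolute file path.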
|
|
|
|
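
The *link uri* rules above can be illustrated with a short Python sketch. It is
not taken from any existing vimwiki tool: the names `SCHEME_RE` and
`normalize_link_uri` are hypothetical, the "one or more characters" rule is
assumed to apply to whatever follows the starting sequence, and no RFC3986
validation is attempted.

{{{python
import re

# Scheme characters as listed above (alphanumerics, plus, period, hyphen),
# terminated by a colon.
SCHEME_RE = re.compile(r"[A-Za-z0-9+.\-]+:")


def normalize_link_uri(raw):
    """Return raw with its virtual prefix applied, or None if it is not a link uri."""
    # Whitespace characters (line endings included) may not appear in a link uri.
    if re.search(r"\s", raw):
        return None
    if raw.startswith("www."):
        prefix_len, virtual = 4, "https://"
    elif raw.startswith("//"):
        prefix_len, virtual = 2, "file:/"
    else:
        match = SCHEME_RE.match(raw)
        if match is None:
            return None
        prefix_len, virtual = match.end(), ""
    # Assumption: at least one character must follow the starting sequence.
    if len(raw) == prefix_len:
        return None
    return virtual + raw
}}}

Under these assumptions, `www.example.com` would be treated as
`https://www.example.com` and `//tmp/notes.txt` as `file:///tmp/notes.txt`,
while input without a recognized start or with embedded whitespace is rejected.
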
==== Decorated Text ====

Decorated text supports a variety of markups across [[#link|links]],
[[#keyword|keywords]], and [[#text|text]]. It mirrors these different HTML elements:
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/strong|<strong>]]
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em|<em>]]
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/s|<s>]]
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/code|<code>]]
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sup|<sup>]]
* [[https://developer.mozilla.org/en-US/docs/Web/HTML/Element/sub|<sub>]]

{{{vimwiki
*bold*
_italic_
*_bold italic_*
_*bold italic*_
~~strikeout~~
`code`
^superscript^
,,subscript,,
}}}

===== Syntax =====

*Decorated text* is represented by one of the following:
* [[#bold text|Bold text]]
* [[#italic text|Italic text]]
* [[#bold italic text|Bold italic text]]
* [[#strikeout text|Strikeout text]]
* [[#code text|Code text]]
* [[#superscript text|Superscript text]]
* [[#subscript text|Subscript text]]

*Bold text* is represented by the following:
1. An asterisk (`U+002A` aka `*`)
2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
   an asterisk (`U+002A` aka `*`) is encountered
3. An asterisk (`U+002A` aka `*`)

*Italic text* is represented by the following:
1. An underscore (`U+005F` aka `_`)
2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
   an underscore (`U+005F` aka `_`) is encountered
3. An underscore (`U+005F` aka `_`)

*Bold Italic text* is represented by either of the following:
* Form 1
    1. Sequence `*_`
    2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
       an `_*` is encountered
    3. Sequence `_*`
* Form 2
    1. Sequence `_*`
    2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
       an `*_` is encountered
    3. Sequence `*_`

*Strikeout text* is represented by the following:
1. Two tildes (`U+007E` aka `~`)
2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
   two tildes (`U+007E` aka `~`) are encountered
3. Two tildes (`U+007E` aka `~`)

*Code text* is represented by the following:
1. A backtick or grave accent (`U+0060`)
2. One or more characters other than a backtick or [[#line ending|line ending]]
3. A backtick or grave accent (`U+0060`)

*Superscript text* is represented by the following:
1. A caret or circumflex accent (`U+005E` aka `^`)
2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
   a caret or circumflex accent (`U+005E` aka `^`) is encountered
3. A caret or circumflex accent (`U+005E` aka `^`)

*Subscript text* is represented by the following:
1. Two commas (`U+002C` aka `,`)
2. One or more [[#link|links]], [[#keyword|keywords]], or [[#text|text]] until
   two commas (`U+002C` aka `,`) are encountered
3. Two commas (`U+002C` aka `,`)

TODO ... A backtick cannot be escaped within code. Is escaping something that
we'd expect to support?

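
As a rough illustration of the delimiter rules above, the following Python
sketch matches a single piece of decorated text at a given position in a line.
The names `DECORATIONS` and `match_decorated` are hypothetical, the input is
assumed to contain no line ending, and the inner content is returned as a raw
string rather than being parsed further into links, keywords, and text.

{{{python
# (kind, opening sequence, closing sequence). Two-character openers are listed
# before the one-character openers they begin with, so `*_` is tried before `*`.
DECORATIONS = [
    ("bold italic", "*_", "_*"),
    ("bold italic", "_*", "*_"),
    ("strikeout", "~~", "~~"),
    ("subscript", ",,", ",,"),
    ("bold", "*", "*"),
    ("italic", "_", "_"),
    ("code", "`", "`"),
    ("superscript", "^", "^"),
]


def match_decorated(line, pos=0):
    """Return (kind, inner, end) for decorated text starting at pos, else None."""
    for kind, opener, closer in DECORATIONS:
        if not line.startswith(opener, pos):
            continue
        start = pos + len(opener)
        end = line.find(closer, start)
        # Require a closing sequence and at least one inner character.
        if end > start:
            return kind, line[start:end], end + len(closer)
    return None
}}}

Trying the longer sequences first is a design choice of this sketch; if a
longer opener has no matching closer, the loop falls through to the shorter
forms, and text with no closing sequence at all is left unmatched.
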
==== Keyword ====

Keywords are specific, case-sensitive words that have an alternative
highlighting within vim, but serve no other special purpose.

===== Syntax =====

A *keyword* is represented as one of the following:
* `DONE`
* `FIXED`
* `FIXME`
* `STARTED`
* `TODO`
* `XXX`

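
A minimal Python sketch of keyword detection follows. The names `KEYWORDS`,
`KEYWORD_RE`, and `find_keywords` are hypothetical. Requiring word boundaries
around a keyword is an assumption of this sketch; the syntax above only lists
the words themselves and states that matching is case-sensitive.

{{{python
import re

KEYWORDS = ("DONE", "FIXED", "FIXME", "STARTED", "TODO", "XXX")

# Case-sensitive match. The \b word boundaries are an assumption of this
# sketch and are not stated by the keyword syntax.
KEYWORD_RE = re.compile(r"\b(?:" + "|".join(KEYWORDS) + r")\b")


def find_keywords(line):
    """Return (offset, keyword) pairs for every keyword found in line."""
    return [(match.start(), match.group()) for match in KEYWORD_RE.finditer(line)]
}}}

With this approach, lowercase `todo` is ignored, as is a keyword embedded in a
longer word such as `XXXX`.
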
==== Text ====

Text is a plain series of characters that have no special stylings applied
directly, but can be included in other [[#inline components|inline components]].

===== Syntax =====

A *text* is represented as one or more characters until any of the following
is encountered:
* [[#inline math|inline math]]
* [[#tags|tags]]
* [[#link|link]]
* [[#decorated text|decorated text]]
* [[#keyword|keyword]]
* [[#line ending|line ending]]

=== Comments ===

Separate from [[#block components|block components]] and [[#inline components|inline components]],
comments are another component available within vimwiki. There are two
classifications:

1. Line comment in the form of `%%CONTENT`
2. Multi-line comment in the form of `%%+CONTENT+%%`

TODO ... if comments are removed from vimwiki before all other syntax is
evaluated, we need to provide an escape mechanism, otherwise the above syntax
within inline code will be removed when rendering to HTML, parsing, etc.

TODO ... while vimwiki, pandoc, and vimwiki-server do not yet offer this, should
we consider an escape sequence to enable leaving a comment within a vimwiki
file as normal text? Something like `\%%` would leave as `%%` and `\%%+` would
leave as `%%+`. Could use the same conceal vim syntax as with bold and other
decorations to hide the preceding backslash.

==== Line Comment Syntax ====

1. A *line comment* is represented by the following:
    1. The sequence `%%`
    2. Any character until [[#line ending|line ending]]

*Extra Notes*: A line comment does not consume a [[#line ending|line ending]],
only the characters leading up to one. If a line comment is at the beginning of
a line, it will leave a blank line in its place.

TODO ... today, the documentation describes a line comment as starting at the
beginning of a line while pandoc and vimwiki-server support a line comment at
any position in a line. What stance do we want to take here? Would we need a
compatibility layer for people who might have leveraged %% within the middle of
a line not expecting it to be a comment? Or should this be one of the advantages
of finally defining a specification in that we can avoid hard backwards
compatibility?

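
The line comment rule can be sketched in Python as a single substitution. The
names `LINE_COMMENT_RE` and `strip_line_comments` are hypothetical. The sketch
assumes that a comment may begin at any position in a line (the open question
raised in the TODO above), that line endings are `\n` or `\r\n`, and that the
input contains no multi-line comments.

{{{python
import re

# `%%` followed by everything up to, but not including, the line ending.
LINE_COMMENT_RE = re.compile(r"%%[^\r\n]*")


def strip_line_comments(text):
    """Remove line comments while leaving every line ending in place."""
    return LINE_COMMENT_RE.sub("", text)
}}}

Because the line ending is never consumed, a line that holds only a comment is
left behind as a blank line, as described in the extra notes above.
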
==== Multi-line Comment Syntax ====

1. A *multi-line comment* is represented by the following:
    1. The sequence `%%+`
    2. Any character until the sequence `+%%`
    3. The sequence `+%%`

*Extra Notes*: A multi-line comment consumes all characters - including
[[#line ending|line ending]] - between the surrounding character sequences. It
can be used to join content in separate lines together. See example below.

{{{vimwiki
first line%%+
+%%second line

would become

first linesecond line
}}}

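
A minimal Python sketch of multi-line comment removal follows. The names
`MULTI_LINE_COMMENT_RE` and `strip_multi_line_comments` are hypothetical. How an
unterminated `%%+` should be handled is not specified above; this sketch simply
leaves such input unchanged.

{{{python
import re

# Non-greedy match from `%%+` to the nearest `+%%`. re.DOTALL lets `.` match
# line endings, so a comment may span several lines.
MULTI_LINE_COMMENT_RE = re.compile(r"%%\+.*?\+%%", re.DOTALL)


def strip_multi_line_comments(text):
    """Remove multi-line comments, including any line endings inside them."""
    return MULTI_LINE_COMMENT_RE.sub("", text)
}}}

Applied to the two-line input from the example above, the sketch yields
`first linesecond line`, since the line ending sits inside the comment.
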
== Parser Details ==

When building a parser for the vimwiki language, certain components may overlap
in the text that they can match. This means that the order in which components
are evaluated can affect how a page is perceived.

Additionally, the inclusion of [[#comments|comments]] further complicates the
process of parsing a file. Comments remove the content they enclose from a file,
and multi-line comments can also remove [[#line ending|line ending]] characters.

To that end, a two-pass parser is required to support properly extracting
comments prior to parsing the full vimwiki syntax:
1. Parse all comments and remove from input
2. Parse a page that is full of [[#block components|block components]]

{{{
Comment =
    | Multi Line Comment
    | Line Comment
Page = (Block Component)+
Block Component =
    | Header
    | Definition List
    | List
    | Table
    | Math Block
    | Blank Line
    | Blockquote
    | Divider
    | Placeholder
    | Paragraph
    | Non-blank Line
Inline Component =
    | Math Inline
    | Tags
    | Link
    | Decorated Text
    | Keyword
    | Text
}}}

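
The two passes can be sketched in Python as follows. This is only an outline
under stated assumptions: the names `COMMENT_RE`, `remove_comments`,
`classify_lines`, and `parse` are hypothetical, line endings are assumed to be
`\n`, and the second pass is a stand-in that only distinguishes blank lines
from non-blank lines instead of implementing every block component.

{{{python
import re

# Pass 1 pattern: the multi-line form is tried before the line form, which
# mirrors the order of the Comment alternatives in the grammar.
COMMENT_RE = re.compile(r"%%\+.*?\+%%|%%[^\n]*", re.DOTALL)


def remove_comments(source):
    """Pass 1: strip every comment from the raw input."""
    return COMMENT_RE.sub("", source)


def classify_lines(source):
    """Pass 2 stand-in: label each line as a blank or non-blank line."""
    return [
        ("Blank Line" if not line.strip() else "Non-blank Line", line)
        for line in source.split("\n")
    ]


def parse(source):
    """Run both passes in order."""
    return classify_lines(remove_comments(source))
}}}

A full parser would replace `classify_lines` with one that tries each block
component in the listed order, which is where the ordering of overlapping
components starts to matter.
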