Unist

Universal Syntax Tree used by @unifiedjs

![unist][logo]

Universal Syntax Tree.


unist is a specification for syntax trees. It has a big [ecosystem of utilities][list-of-utilities] in JavaScript for working with these trees. It’s implemented by several other specifications.

This document may not be released. See [releases][] for released documents. The latest released version is [3.0.0][release].

Intro

This document defines a general-purpose format for syntax trees. Development of unist started in July 2015. This specification is written in a [Web IDL][webidl]-like grammar.

Syntax tree

Syntax trees are representations of source code or even natural language. These trees are abstractions that make it possible to analyze, transform, and generate code.

Syntax trees [come in two flavors][abstract-vs-concrete-trees]:

  • concrete syntax trees: structures that represent every detail (such as white-space in white-space insensitive languages)
  • abstract syntax trees: structures that only represent details relating to the syntactic structure of code (such as ignoring whether a double or single quote was used in languages that support both, such as JavaScript).

This specification can express both abstract and concrete syntax trees.

Where this specification fits

unist is not intended to be self-sufficient. Instead, it is expected that other specifications implement unist and extend it to express language specific nodes. For example, see projects such as [hast][] (for HTML), [nlcst][] (for natural language), [mdast][] (for Markdown), and [xast][] (for XML).

unist relates to [JSON][] in that compliant syntax trees can be expressed completely in JSON. However, unist is not limited to JSON and can be expressed in other data formats, such as [XML][].

unist relates to [JavaScript][] in that it has a rich [ecosystem of utilities][list-of-utilities] for working with compliant syntax trees in JavaScript. The five most used utilities combined are downloaded thirty million times each month. However, unist is not limited to JavaScript and can be used in other programming languages.

unist relates to the [unified][], [remark][], [rehype][], and [retext][] projects in that unist syntax trees are used throughout their ecosystems.

unist relates to the [vfile][] project in that it accepts unist nodes for its message store, and that vfile can be a source [file][term-file] of a syntax tree.

Types

If you are using TypeScript, you can use the unist types by installing them with npm:

npm install @types/unist

Nodes

Syntactic units in unist syntax trees are called nodes, and implement the [Node][dfn-node] interface.

Node

interface Node {
  type: string
  data: Data?
  position: Position?
}

The type field is a non-empty string representing the variant of a node. This field can be used to determine the [type][term-type] a node implements.

The data field represents information from the ecosystem. The value of the data field implements the [Data][dfn-data] interface.

The position field represents the location of a node in a source document. The value of the position field implements the [Position][dfn-position] interface. The position field must not be present if a node is [generated][term-generated].

Specifications implementing unist are encouraged to define more fields. Ecosystems can define fields on [Data][dfn-data].

Any value in unist must be expressible in JSON values: string, number, object, array, true, false, or null. This means that the syntax tree should be able to be converted to and from JSON and produce the same tree. For example, in JavaScript, a tree can be passed through JSON.parse(JSON.stringify(tree)) and result in the same tree.
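
As a rough illustration of the JSON requirement, the sketch below declares local stand-ins for the interfaces and round-trips a node through JSON. `UnistNode` is a local name for the `Node` interface (renamed only to avoid clashing with the DOM `Node` global in TypeScript), the index signature on `Data` is an assumption, and `example` is a hypothetical node type.

```typescript
// Local stand-ins for the interfaces in this section. The index signature on
// `Data` is an assumption made so that arbitrary fields type-check.
interface Point {
  line: number
  column: number
  offset?: number
}

interface Data {
  [key: string]: unknown
}

interface UnistNode {
  type: string
  data?: Data
  position?: {start: Point; end: Point}
}

// `example` is a hypothetical node type, not one defined by unist.
const node: UnistNode = {
  type: 'example',
  data: {note: 'set by a hypothetical plugin'},
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 6, offset: 5}
  }
}

// Round-trip through JSON and compare the serialized forms.
const serialized = JSON.stringify(node)
const roundTripped: UnistNode = JSON.parse(serialized)
const sameTree = JSON.stringify(roundTripped) === serialized

// A function is not a JSON value, so it is dropped and the tree changes.
const lossy = JSON.parse(JSON.stringify({type: 'example', data: {fn: () => 1}}))
const fnSurvived = 'fn' in lossy.data
```

Here `sameTree` is `true` and `fnSurvived` is `false`, which is why values such as functions do not belong in a tree.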

Position

interface Position {
  start: Point
  end: Point
}

Position represents the location of a node in a source [file][term-file].

The start field of Position represents the place of the first character of the parsed source region. The end field of Position represents the place of the first character after the parsed source region, whether it exists or not. The values of the start and end fields implement the [Point][dfn-point] interface.

If the syntactic unit represented by a node is not present in the source [file][term-file] at the time of parsing, the node is said to be [generated][term-generated] and it must not have positional information.

For example, if the following value was represented as unist:

alpha
bravo

…the first word (alpha) would start at line 1, column 1, offset 0, and end at line 1, column 6, offset 5. The line feed would start at line 1, column 6, offset 5, and end at line 2, column 1, offset 6. The last word (bravo) would start at line 2, column 1, offset 6, and end at line 2, column 6, offset 11.
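
The numbers in this example can be reproduced with a small helper. The sketch below is only an illustration: `pointAt` is a hypothetical function, not part of unist or its utilities, and it assumes that a line feed is the only line terminator.

```typescript
interface Point {
  line: number
  column: number
  offset: number
}

// Hypothetical helper: turn a 0-indexed offset into a 1-indexed line and
// column by scanning the source. Assumes `\n` is the only line terminator.
function pointAt(source: string, offset: number): Point {
  let line = 1
  let column = 1

  for (let index = 0; index < offset && index < source.length; index++) {
    if (source.charCodeAt(index) === 10 /* `\n` */) {
      line++
      column = 1
    } else {
      column++
    }
  }

  return {line, column, offset}
}

const source = 'alpha\nbravo'

// Each `end` is the place of the first character after the region.
const alpha = {start: pointAt(source, 0), end: pointAt(source, 5)}
const lineFeed = {start: pointAt(source, 5), end: pointAt(source, 6)}
const bravo = {start: pointAt(source, 6), end: pointAt(source, 11)}
```

Note that the end of `bravo` points at offset 11, one past the last character in the source, which matches "whether it exists or not" above.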

Point

interface Point {
  line: number >= 1
  column: number >= 1
  offset: number >= 0?
}

Point represents one place in a source [file][term-file].

The line field (1-indexed integer) represents a line in a source file. The column field (1-indexed integer) represents a column in a source file. The offset field (0-indexed integer) represents a character in a source file.

The term character means a (UTF-16) code unit, as defined in the [Web IDL][webidl] specification.
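
Because JavaScript strings are also indexed by UTF-16 code unit, offsets line up with ordinary string indexing there. The short sketch below shows the difference between code units and code points; the sample string is made up for illustration.

```typescript
// `\u{1F600}` is a single code point outside the Basic Multilingual Plane,
// so it takes two UTF-16 code units.
const source = 'a\u{1F600}b'

// Counted in code units (what `offset` uses).
const codeUnits = source.length

// Counted in code points (not what `offset` uses).
const codePoints = Array.from(source).length

// Offset of `b`: one unit for `a` plus two for the emoji.
const offsetOfB = source.indexOf('b')
```

Tools written in languages whose strings are not UTF-16 would need to convert their own indices to code units to produce matching offsets.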

Data

interface Data { }

Data represents information associated by the ecosystem with the node.

This space is guaranteed to never be specified by unist or specifications implementing unist.
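
One way an ecosystem might use this space is sketched below. `annotate` and the `wordCount` field are hypothetical names chosen for illustration, the index signature on `Data` is an assumption, and `UnistNode` is a local stand-in for the `Node` interface.

```typescript
interface Data {
  [key: string]: unknown
}

interface UnistNode {
  type: string
  data?: Data
}

// Return a copy of `node` with one extra field on `data`, keeping any
// existing `data` fields and leaving the original node untouched.
function annotate(node: UnistNode, key: string, value: unknown): UnistNode {
  return {...node, data: {...node.data, [key]: value}}
}

// `wordCount` is a hypothetical field that some ecosystem might define.
const original: UnistNode = {type: 'example'}
const annotated = annotate(original, 'wordCount', 2)
```

Keeping such fields under `data` rather than on the node itself avoids collisions with fields that a specification may define later.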

Parent

interface Parent <: Node {
  children: [Node]
}

Nodes containing other nodes (said to be [children][term-child]) extend the abstract interface Parent ([Node][dfn-node]).

The children field is a list representing the children of a node.

Literal

interface Literal <: Node {
  value: any
}

Nodes containing a value extend the abstract interface Literal ([Node][dfn-node]).

The value field can contain any value.
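
The Parent and Literal interfaces can be told apart at runtime by checking for their fields. The following is a minimal sketch with local stand-in types; `isParent` and `isLiteral` are hypothetical helpers, and `paragraph` and `text` are example node types rather than anything unist defines.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

interface Literal extends UnistNode {
  value: unknown
}

// A node is treated as a parent if it has a `children` list.
function isParent(node: UnistNode): node is Parent {
  return Array.isArray((node as Parent).children)
}

// A node is treated as a literal if it has a `value` field.
function isLiteral(node: UnistNode): node is Literal {
  return 'value' in node
}

// Hypothetical node types, for illustration only.
const text: Literal = {type: 'text', value: 'alpha'}
const paragraph: Parent = {type: 'paragraph', children: [text]}
```

Checking for the fields rather than for specific `type` strings keeps such helpers usable across specifications that extend unist.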

Glossary

Tree

A tree is a node and all of its [descendants][term-descendant] (if any).

Child

Node X is a child of node Y, if Y’s children include X.

Parent

Node X is the parent of node Y, if Y is a [child][term-child] of X.

Index

The index of a [child][term-child] is its number of preceding [siblings][term-sibling], or 0 if it has none.

Sibling

Node X is a sibling of node Y, if X and Y have the same [parent][term-parent] (if any).

The previous sibling of a [child][term-child] is its sibling at its [index][term-index] minus 1.

The next sibling of a [child][term-child] is its sibling at its [index][term-index] plus 1.
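
The index and sibling definitions map directly onto list operations on children. A minimal sketch, assuming local stand-in types and hypothetical helper names (`previousSibling`, `nextSibling`):

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

// The sibling at index minus 1, or `undefined` for the first child
// (or for a node that is not a child of `parent`).
function previousSibling(parent: Parent, child: UnistNode): UnistNode | undefined {
  const index = parent.children.indexOf(child)
  return index > 0 ? parent.children[index - 1] : undefined
}

// The sibling at index plus 1, or `undefined` for the last child
// (or for a node that is not a child of `parent`).
function nextSibling(parent: Parent, child: UnistNode): UnistNode | undefined {
  const index = parent.children.indexOf(child)
  return index === -1 ? undefined : parent.children[index + 1]
}

// Hypothetical node types, for illustration only.
const first: UnistNode = {type: 'first'}
const second: UnistNode = {type: 'second'}
const third: UnistNode = {type: 'third'}
const list: Parent = {type: 'list', children: [first, second, third]}

// The index of `second` is its number of preceding siblings.
const indexOfSecond = list.children.indexOf(second)
```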

Root

The root of a node is itself, if it has no [parent][term-parent], or the root of its [parent][term-parent].

The root of a [tree][term-tree] is any node in that [tree][term-tree] without a [parent][term-parent].

Descendant

Node X is a descendant of node Y, if X is a [child][term-child] of Y, or if X is a [child][term-child] of node Z that is a descendant of Y.

An inclusive descendant is a node or one of its descendants.

Ancestor

Node X is an ancestor of node Y, if Y is a [descendant][term-descendant] of X.

An inclusive ancestor is a node or one of its ancestors.
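
The interfaces in this document give a node no reference to its parent, so finding ancestors or a root means recording parents while walking down from the top. The sketch below shows one possible approach; `mapParents`, `ancestorsOf`, and `rootOf` are hypothetical helpers and the node types are made up.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

function isParent(node: UnistNode): node is Parent {
  return Array.isArray((node as Parent).children)
}

// Record the parent of every descendant of `tree`.
function mapParents(
  tree: UnistNode,
  parents: Map<UnistNode, Parent> = new Map()
): Map<UnistNode, Parent> {
  if (isParent(tree)) {
    for (const child of tree.children) {
      parents.set(child, tree)
      mapParents(child, parents)
    }
  }

  return parents
}

// Ancestors of `node`, nearest first.
function ancestorsOf(node: UnistNode, parents: Map<UnistNode, Parent>): Parent[] {
  const result: Parent[] = []
  let current = parents.get(node)

  while (current) {
    result.push(current)
    current = parents.get(current)
  }

  return result
}

// The root of a node is itself if it has no parent, else its farthest ancestor.
function rootOf(node: UnistNode, parents: Map<UnistNode, Parent>): UnistNode {
  const list = ancestorsOf(node, parents)
  return list.length > 0 ? list[list.length - 1] : node
}

const leaf: UnistNode = {type: 'leaf'}
const branch: Parent = {type: 'branch', children: [leaf]}
const tree: Parent = {type: 'root', children: [branch]}
const parents = mapParents(tree)
```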

Head

The head of a node is its first [child][term-child] (if any).

Tail

The tail of a node is its last [child][term-child] (if any).

Leaf

A leaf is a node with no [children][term-child].

Branch

A branch is a node with one or more [children][term-child].
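
Head, tail, leaf, and branch can all be derived from the children list. A minimal sketch with local stand-in types and hypothetical helper names follows; a node with an empty children list counts as a leaf here, in line with the definitions above.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

// Children of any node; nodes without a `children` list have none.
function childrenOf(node: UnistNode): UnistNode[] {
  const children = (node as Parent).children
  return Array.isArray(children) ? children : []
}

// First child, if any.
function headOf(node: UnistNode): UnistNode | undefined {
  return childrenOf(node)[0]
}

// Last child, if any.
function tailOf(node: UnistNode): UnistNode | undefined {
  const children = childrenOf(node)
  return children[children.length - 1]
}

function isLeaf(node: UnistNode): boolean {
  return childrenOf(node).length === 0
}

function isBranch(node: UnistNode): boolean {
  return childrenOf(node).length > 0
}

// Hypothetical node types, for illustration only.
const first: UnistNode = {type: 'first'}
const last: UnistNode = {type: 'last'}
const tree: Parent = {type: 'root', children: [first, last]}
const empty: Parent = {type: 'root', children: []}
```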

Generated

A node is generated if it does not have [positional information][term-positional-info].
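
A check for generated nodes could look like the sketch below. `isGenerated` is a hypothetical helper, and treating `null` the same as a missing position field is an assumption rather than something this document specifies.

```typescript
interface Point {
  line: number
  column: number
  offset?: number
}

interface UnistNode {
  type: string
  position?: {start: Point; end: Point} | null
}

// A node is generated if it has no positional information. Both a missing
// field and `null` are treated as absent here (an assumption).
function isGenerated(node: UnistNode): boolean {
  return node.position == null
}

// Hypothetical nodes: one parsed from a source file, one created by a tool.
const parsed: UnistNode = {
  type: 'example',
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 6, offset: 5}
  }
}
const inserted: UnistNode = {type: 'example'}
```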

Type

The type of a node is the value of its type field.

Positional information

The positional information of a node is the value of its position field.

File

A file is a source document that represents the original file that was parsed to produce the syntax tree. [Positional information][term-positional-info] represents the place of a node in this file. Files are provided by the host environment and not defined by unist.

For example, see projects such as [vfile][].

Preorder

Preorder (NLR) is a [depth-first][traversal-depth] [tree traversal][traversal] that performs the following steps for each node N:

  1. N: visit N itself
  2. L: traverse [head][term-head] (then its next sibling, recursively moving forward until reaching tail)
  3. R: traverse [tail][term-tail]
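
The steps above can be sketched as a short recursive function. This is an illustration with local stand-in types, not the implementation used by any particular utility; the node types are made up.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

function isParent(node: UnistNode): node is Parent {
  return Array.isArray((node as Parent).children)
}

// Preorder (NLR): visit the node itself, then traverse its children from
// head to tail.
function preorder(node: UnistNode, visit: (node: UnistNode) => void): void {
  visit(node)

  if (isParent(node)) {
    for (const child of node.children) {
      preorder(child, visit)
    }
  }
}

// root
// ├── a
// │   ├── b
// │   └── c
// └── d
const a: Parent = {type: 'a', children: [{type: 'b'}, {type: 'c'}]}
const tree: Parent = {type: 'root', children: [a, {type: 'd'}]}

const order: string[] = []
preorder(tree, (node) => {
  order.push(node.type)
})
// `order` is now: root, a, b, c, d
```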

Postorder

Postorder (LRN) is a [depth-first][traversal-depth] [tree traversal][traversal] that performs the following steps for each node N:

  1. L: traverse [head][term-head] (then its next sibling, recursively moving forward until reaching tail)
  2. R: traverse [tail][term-tail]
  3. N: visit N itself
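
As a sketch of these steps, the function below traverses all children before visiting the node itself. The types are local stand-ins and the node types are made up for illustration.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

function isParent(node: UnistNode): node is Parent {
  return Array.isArray((node as Parent).children)
}

// Postorder (LRN): traverse the children from head to tail, then visit the
// node itself.
function postorder(node: UnistNode, visit: (node: UnistNode) => void): void {
  if (isParent(node)) {
    for (const child of node.children) {
      postorder(child, visit)
    }
  }

  visit(node)
}

// root
// ├── a
// │   ├── b
// │   └── c
// └── d
const a: Parent = {type: 'a', children: [{type: 'b'}, {type: 'c'}]}
const tree: Parent = {type: 'root', children: [a, {type: 'd'}]}

const order: string[] = []
postorder(tree, (node) => {
  order.push(node.type)
})
// `order` is now: b, c, a, d, root
```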

Enter

Enter is a step right before other steps performed on a given node N when [traversing][traversal] a tree.

For example, when performing preorder traversal, enter is the first step taken, right before visiting N itself.

Exit

Exit is a step right after other steps performed on a given node N when [traversing][traversal] a tree.
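
Enter and exit can be pictured as two callbacks around the traversal of a node's children. The sketch below assumes a hypothetical `walk` function with local stand-in types; it is not the API of any existing utility.

```typescript
interface UnistNode {
  type: string
}

interface Parent extends UnistNode {
  children: UnistNode[]
}

function isParent(node: UnistNode): node is Parent {
  return Array.isArray((node as Parent).children)
}

// Call `enter` before any other step on a node and `exit` after all of them.
function walk(
  node: UnistNode,
  enter: (node: UnistNode) => void,
  exit: (node: UnistNode) => void
): void {
  enter(node)

  if (isParent(node)) {
    for (const child of node.children) {
      walk(child, enter, exit)
    }
  }

  exit(node)
}

// root
// ├── a
// │   └── b
// └── c
const a: Parent = {type: 'a', children: [{type: 'b'}]}
const tree: Parent = {type: 'root', children: [a, {type: 'c'}]}

const steps: string[] = []
walk(
  tree,
  (node) => {
    steps.push('enter:' + node.type)
  },
  (node) => {
    steps.push('exit:' + node.type)
  }
)
```

Doing the work of a visit in `enter` yields preorder, and doing it in `exit` yields postorder, so a single walker with both hooks covers either traversal.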
