Hast
Hypertext Abstract Syntax Tree format
![hast][github-hast-logo]
Hypertext Abstract Syntax Tree format.
hast is a specification for representing [HTML][whatwg-html] (and embedded [SVG][w3c-svg] or [MathML][w3c-mathml]) as an abstract syntax tree. It implements the [unist][github-unist] spec.
This document may not be released.
See [releases][github-hast-releases] for released documents.
The latest released version is [2.4.0][github-hast-release].
Contents
- Introduction
- Types
- Nodes (abstract)
- Nodes
- Other types
- Glossary
- List of utilities
- Related HTML utilities
- References
- Security
- Related
- Contribute
- Acknowledgments
- License
Introduction
This document defines a format for representing hypertext as an [abstract syntax tree][github-unist-syntax-tree]. Development of hast started in April 2016 for [rehype][github-rehype]. This specification is written in a [Web IDL][whatwg-webidl]-like grammar.
Where this specification fits
hast extends [unist][github-unist], a format for syntax trees, to benefit from its [ecosystem of utilities][github-unist-utilities].
hast relates to [JavaScript][ecma-javascript] in that it has an [ecosystem of utilities][section-utilities] for working with compliant syntax trees in JavaScript. However, hast is not limited to JavaScript and can be used in other programming languages.
hast relates to the [unified][github-unified] and [rehype][github-rehype] projects in that hast syntax trees are used throughout their ecosystems.
Virtual DOM
The main reasons for introducing a new “virtual” DOM are:
- The [DOM][whatwg-dom] is very heavy to implement outside of the browser; a lean, stripped-down virtual DOM can be used everywhere
- Most virtual DOMs do not focus on ease of use in transformations
- Other virtual DOMs cannot represent the syntax of HTML in its entirety (think comments and document types)
- Neither the DOM nor virtual DOMs focus on positional information
Types
If you are using TypeScript, you can use the hast types by installing them with npm:
npm install @types/hast
Nodes (abstract)
Literal
interface Literal <: UnistLiteral {
  value: string
}
Literal ([UnistLiteral][dfn-unist-literal]) represents a node in hast containing a value.
Parent
interface Parent <: UnistParent {
  children: [Comment | Doctype | Element | Text]
}
Parent ([UnistParent][dfn-unist-parent]) represents a node in hast containing other nodes (said to be [children][term-child]).
Its content is limited to only other hast content.
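The content restriction above can be illustrated with a small check. This is a minimal sketch, not part of the specification: `isValidParent` and `allowedChildTypes` are hypothetical names, and the check only looks one level deep.

```javascript
// Node types that the Parent interface above allows as children.
const allowedChildTypes = new Set(['comment', 'doctype', 'element', 'text'])

// Hypothetical helper: true when `node` has a `children` array that holds
// only the child types listed above. It does not recurse into descendants.
function isValidParent(node) {
  return (
    node !== null &&
    typeof node === 'object' &&
    Array.isArray(node.children) &&
    node.children.every(
      (child) =>
        child !== null &&
        typeof child === 'object' &&
        allowedChildTypes.has(child.type)
    )
  )
}

const sampleRoot = {
  type: 'root',
  children: [{type: 'doctype'}, {type: 'comment', value: 'Charlie'}]
}

console.log(isValidParent(sampleRoot)) // true
```

A node without a `children` array, such as a text node, is rejected by this check, as is a parent that holds a node of any other type.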
Nodes
Comment
interface Comment <: Literal {
  type: 'comment'
}
Comment ([Literal][dfn-literal]) represents a [Comment][concept-comment] ([[DOM]][whatwg-dom]).
For example, the following HTML:
<!--Charlie-->
Yields:
{type: 'comment', value: 'Charlie'}
Doctype
interface Doctype <: Node {
  type: 'doctype'
}
Doctype ([Node][dfn-unist-node]) represents a [DocumentType][concept-documenttype] ([[DOM]][whatwg-dom]).
For example, the following HTML:
<!doctype html>
Yields:
{type: 'doctype'}
Element
interface Element <: Parent {
  type: 'element'
  tagName: string
  properties: Properties
  content: Root?
  children: [Comment | Element | Text]
}
Element ([Parent][dfn-parent]) represents an [Element][concept-element] ([[DOM]][whatwg-dom]).
A tagName field must be present.
It represents the element’s [local name][concept-local-name]
([[DOM]][whatwg-dom]).
The properties field represents information associated with the element.
The value of the properties field implements the
[Properties][dfn-properties] interface.
If the tagName field is 'template',
a content field can be present.
The value of the content field implements the [Root][dfn-root] interface.
If the tagName field is 'template',
the element must be a [leaf][term-leaf].
If the tagName field is 'noscript',
its [children][term-child] should be represented as if
[scripting is disabled][concept-scripting] ([[HTML]][whatwg-html]).
For example, the following HTML:
<a href="https://alpha.com" class="bravo" download></a>
Yields:
{
  type: 'element',
  tagName: 'a',
  properties: {
    href: 'https://alpha.com',
    className: ['bravo'],
    download: true
  },
  children: []
}
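The 'template' rules described above can be sketched as follows. This is an illustrative example under stated assumptions: `isValidTemplate` is a hypothetical helper, not an existing utility, and the sample content ('Delta') is made up.

```javascript
// A 'template' element: a leaf (no children) whose optional `content`
// field holds a Root.
const template = {
  type: 'element',
  tagName: 'template',
  properties: {},
  children: [],
  content: {
    type: 'root',
    children: [{type: 'text', value: 'Delta'}]
  }
}

// Hypothetical check: the element is a 'template', it is a leaf, and
// `content` is either absent or a root node.
function isValidTemplate(element) {
  if (element.type !== 'element' || element.tagName !== 'template') {
    return false
  }

  const isLeaf = element.children.length === 0
  const content = element.content
  const contentIsRootOrAbsent =
    content === undefined || content === null || content.type === 'root'

  return isLeaf && contentIsRootOrAbsent
}

console.log(isValidTemplate(template)) // true
```

The nodes inside the template live under `content.children` rather than under the element's own `children`, which is why the element itself stays a leaf.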
Root
interface Root <: Parent {
  type: 'root'
}
Root ([Parent][dfn-parent]) represents a document.
Root can be used as the [root][term-root] of a [tree][term-tree],
or as a value of the content field on a 'template'
[Element][dfn-element],
never as a [child][term-child].
Text
interface Text <: Literal {
  type: 'text'
}
Text ([Literal][dfn-literal]) represents a [Text][concept-text] ([[DOM]][whatwg-dom]).
For example, the following HTML:
<span>Foxtrot</span>
Yields:
{
  type: 'element',
  tagName: 'span',
  properties: {},
  children: [{type: 'text', value: 'Foxtrot'}]
}
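Because text lives in Text nodes rather than on elements, reading the text of an element means walking its children. The following is a minimal sketch; `textContent` is a hypothetical helper written for this example, and skipping comments is a choice made here, not a rule of the specification.

```javascript
// Hypothetical helper: concatenates the values of all Text nodes found in
// `node` and its descendants. Comments and doctypes contribute nothing.
function textContent(node) {
  if (node.type === 'text') {
    return node.value
  }

  if (Array.isArray(node.children)) {
    return node.children.map(textContent).join('')
  }

  return ''
}

const span = {
  type: 'element',
  tagName: 'span',
  properties: {},
  children: [{type: 'text', value: 'Foxtrot'}]
}

console.log(textContent(span)) // Foxtrot
```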
Other types
Properties
interface Properties {}
Properties represents information associated with an element.
Every field must be a [PropertyName][dfn-property-name] and every value a [PropertyValue][dfn-property-value].
PropertyName
typedef string PropertyName
Property names are keys on [Properties][dfn-properties] objects and reflect
HTML,
SVG,
ARIA,
XML,
XMLNS,
or XLink attribute names.
Often,
they have the same value as the corresponding attribute
(for example,
id is a property name reflecting the id attribute name),
but there are some notable differences.
These rules aren’t simple.
Use [hastscript][github-hastscript] (or [property-information][github-property-information] directly) to help.
The following rules are used to transform HTML attribute names to property names. These rules are based on [how ARIA is reflected in the DOM][concept-aria-reflection] ([[ARIA]][w3c-aria]), and differ from how some (older) HTML attributes are reflected in the DOM.
- any name referencing a combination of multiple words (such as “stroke miter limit”) becomes a camelcased property name capitalizing each word boundary; this includes combinations that are sometimes written as several words; for example, stroke-miterlimit becomes strokeMiterLimit, autocorrect becomes autoCorrect, and allowfullscreen becomes allowFullScreen
- any name that can be hyphenated becomes a camelcased property name capitalizing each boundary; for example, “read-only” becomes readOnly
- compound words that are not used with spaces or hyphens are treated as a normal word and the previous rules apply; for example, “placeholder”, “strikethrough”, and “playback” stay the same
- acronyms in names are treated as a normal word and the previous rules apply; for example, itemid becomes itemId and bgcolor becomes bgColor
Exceptions
Some jargon is seen as one word even though it may not be seen as such by
dictionaries.
For example,
nohref becomes noHref,
playsinline becomes playsInline,
and accept-charset becomes acceptCharset.
The HTML attributes class and for respectively become className and
htmlFor in alignment with the DOM.
No other attributes gain different names as properties,
other than a change in casing.
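Since word boundaries cannot be derived from an attribute name alone, a conversion needs a table of known names. The sketch below uses a tiny hand-written table built only from the examples in this section. `knownProperties` and `toPropertyName` are hypothetical names, the table is far from complete, and lowercasing the input assumes an HTML (case-insensitive) attribute name.

```javascript
// Hand-written subset of attribute-to-property mappings, taken from the
// examples in this section. Assumption: a real implementation would use a
// complete data set instead of this table.
const knownProperties = new Map([
  ['stroke-miterlimit', 'strokeMiterLimit'],
  ['autocorrect', 'autoCorrect'],
  ['allowfullscreen', 'allowFullScreen'],
  ['readonly', 'readOnly'],
  ['itemid', 'itemId'],
  ['bgcolor', 'bgColor'],
  ['nohref', 'noHref'],
  ['playsinline', 'playsInline'],
  ['accept-charset', 'acceptCharset'],
  ['class', 'className'],
  ['for', 'htmlFor']
])

// Hypothetical helper: looks the attribute up in the table and falls back
// to the lowercased attribute name when it is not listed.
function toPropertyName(attributeName) {
  const key = attributeName.toLowerCase()
  return knownProperties.has(key) ? knownProperties.get(key) : key
}

console.log(toPropertyName('class')) // className
console.log(toPropertyName('accept-charset')) // acceptCharset
console.log(toPropertyName('placeholder')) // placeholder
```

The fallback only works for names that stay the same, such as id or placeholder; any attribute missing from the table that should change casing would come back unchanged.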
Notes
[property-information][github-property-information] lists all property names.
The property name rules differ from how HTML is reflected in the DOM for the following attributes:
- charoff becomes charOff (not chOff)
- char stays char (does not become ch)
- rel stays rel (does not become relList)
- checked stays checked (does not become defaultChecked)
- muted stays muted (does not become defaultMuted)
- value stays value (does not become defaultValue)
- selected stays selected (does not become defaultSelected)
- allowfullscreen becomes allowFullScreen (not allowFullscreen)
- hreflang becomes hrefLang, not hreflang
- autoplay becomes autoPlay, not autoplay
- autocomplete becomes autoComplete (not autocomplete)
- autofocus becomes autoFocus, not autofocus
- enctype becomes encType, not enctype
- formenctype becomes formEncType (not formEnctype)
- vspace becomes vSpace, not vspace
- hspace becomes hSpace, not hspace
- lowsrc becomes lowSrc, not lowsrc
PropertyValue
typedef any PropertyValue
Property values should reflect the data type determined by their property name.
For example,
the HTML <div hidden></div> has a hidden attribute,
which is reflected as a hidden property name set to the property value true,
and <input minlength="5">,
which has a minlength attribute,
is reflected as a minLength property name set to the property value 5.
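The two examples above can be sketched as a small conversion step. This is an illustration under assumptions: `propertyTypes` is a hypothetical, hand-written type table covering only a few properties, `toPropertyValue` is a hypothetical helper, and keeping unparseable numbers as strings is a choice made for this sketch.

```javascript
// Hypothetical type table for a few property names. Anything not listed
// is treated as a plain string.
const propertyTypes = new Map([
  ['hidden', 'boolean'],
  ['minLength', 'number'],
  ['className', 'spaceSeparated']
])

// Hypothetical helper: turns the string value of an attribute that is
// present in the source into a value of the type its property expects.
function toPropertyValue(propertyName, attributeValue) {
  const type = propertyTypes.get(propertyName)

  if (type === 'boolean') {
    // The presence of a boolean attribute means true.
    return true
  }

  if (type === 'number') {
    const trimmed = attributeValue.trim()
    const number = Number(trimmed)
    // Keep the original string when it does not parse as a number.
    return trimmed === '' || Number.isNaN(number) ? attributeValue : number
  }

  if (type === 'spaceSeparated') {
    return attributeValue.trim().split(/\s+/).filter(Boolean)
  }

  return attributeValue
}

console.log(toPropertyValue('hidden', '')) // true
console.log(toPropertyValue('minLength', '5')) // 5
```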
In [JSON][ietf-json], the value null must be treated as if the property was not included. In [JavaScript][ecma-javascript], both null and undefined must be similarly ignored.
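In JavaScript, the rule for null and undefined can be applied by filtering properties before use. A minimal sketch, assuming a hypothetical helper named `definedProperties`:

```javascript
// Hypothetical helper: returns a copy of `properties` without the entries
// whose value is null or undefined. Other falsy values, such as false, 0,
// or the empty string, are kept.
function definedProperties(properties) {
  return Object.fromEntries(
    Object.entries(properties).filter(
      ([, value]) => value !== null && value !== undefined
    )
  )
}

const filtered = definedProperties({
  id: 'alpha',
  hidden: null,
  title: undefined,
  download: false
})

console.log(filtered) // { id: 'alpha', download: false }
```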
The DOM has strict rules on how it coerces HTML to expected values, w