Htmlvoc

An RDF-based representation of the HTML Living Standard to express HTML-documents in RDF. HTML-documents can thus be represented, queried, generated, validated, analysed, transformed and reused as semantic objects themselves.

Specification 'htmlvoc'

This is the repository for htmlvoc, the semantic HTML-vocabulary. You're welcome to contribute!

Here you can find the official specification: https://floresbakker.github.io/htmlvoc/

Status

Stable, but no release yet. Work is in progress together with the community group Semantic HTML-vocabulary. We aim for a preliminary release at the end of 2024, together with a draft report on the vocabulary.

Background

The mission of the community group Semantic HTML-vocabulary is to establish a draft standard for an RDF-based representation of the HTML Living Standard. HTML-documents can thus be represented, queried, generated, validated, analysed, transformed and reused as semantic objects themselves. In addition, full provenance can be provided for a generated HTML-document, as every atom of the document can be described and semantically enriched, ex ante (RDF) and ex post (RDFa). For instance, the algorithm that calculates a certain budget amount in a governmental HTML-document can be linked to the table cell containing that very value.

HTML-documents have a wide variety of uses, and so does the HTML vocabulary. The HTML vocabulary can be used to generate 100% correct HTML or XHTML and to validate it. It can also be used to model the front end of a website or application, whereas the logic behind the front end can be captured in SHACL Advanced Features, making for a full semantic representation and execution of digital infrastructure, without any vendor lock-in. An HTML-document can be generated in full compliance with laws and regulations, as these norms can be linked and applied while using the HTML vocabulary. With full provenance, an HTML-document can help combat fake news and show in real time how certain sensitive data in the document (privacy, security) was derived.

The community group will produce a 0.1 draft specification, which will be input for a future working group within W3C. The community group can make use of the currently available draft specification as developed by the Dutch Ministry of Finance in a working prototype for the Dutch governmental budget cycle. By starting this community group, the Dutch Ministry wants to contribute to an open-source-based digital infrastructure.

Introduction

Let us go through the HTML vocabulary with an example of an ordinary HTML-document.

Example #1: an ordinary HTML-document with a table

<!DOCTYPE html>
<html>
<head>
    <title>Tutorial Document Example</title>
    <style>
        table {
            width: 70%;
            margin: 0 auto;
            border-collapse: collapse;
        }

        caption {
            text-align: left;
            font-weight: bold;
            padding: 10px;
            background-color: #f2f2f2; /* Light gray */
        }

        th, td {
            padding: 12px;
            text-align: center;
            border: 1px solid #ddd; /* Light gray border */
        }

        th {
            background-color: #4CAF50; /* Green */
            color: white;
        }
    </style>
</head>
<body>
    <table>
        <caption>Example table</caption>
        <thead>
            <tr>
                <th>banana</th>
                <th>orange</th>
                <th>apple</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>1</td>
                <td>2</td>
                <td>3</td>
            </tr>
            <tr>
                <td>a</td>
                <td>b</td>
                <td>c</td>
            </tr>
        </tbody>
    </table>
</body>
</html>

This table is rendered in a browser as follows:

[Figure: An example of an HTML-document]

Expressing the HTML-document in RDF

Now we can represent the very same document in RDF using the HTML-vocabulary. As it is very cumbersome to do so by hand, an HTML2RDF tool is available in this repository that does exactly that for you. For further information on this tool and other neat tools, see further down in this README.

prefix doc:  <http://www.example.org/document/> 
prefix html: <https://www.w3.org/html/model/def/> 
prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

doc:1 a html:Document ; 
 rdf:_1 doc:1.0 ;
 rdf:_2 doc:2.0 . 

doc:1.0 a html:DocumentType ;
 html:documentTypeName "html" ;
 html:fragment "<!DOCTYPE html>"^^rdf:HTML .
 
doc:2.0 a html:Html ; 
 rdf:_1 doc:3.0 ; 
 rdf:_2 doc:4.0 .

doc:3.0 a html:Head ; 
 rdf:_1 doc:4.4 ; 
 rdf:_2 doc:5.4 .
 
doc:4.0 a html:Body ;  
 rdf:_1 doc:32.4 .
 
doc:32.4 a html:Table ; 
 rdf:_1 doc:33.8 ; 
 rdf:_2 doc:34.8 ; 
 rdf:_3 doc:41.8 . 

doc:33.8 a html:Caption ; 
 rdf:_1 doc:33.8.1 .

doc:33.8.1 a html:Text ; 
 html:fragment "Example table" . 

doc:34.8 a html:TableHeader ; 
 rdf:_1 doc:35.12 . 

doc:35.12 a html:Row ; 
 rdf:_1 doc:36.16 ; 
 rdf:_2 doc:37.16 ; 
 rdf:_3 doc:38.16 . 

doc:36.16 a html:HeaderCell ; 
 rdf:_1 doc:36.16.1 . 

doc:36.16.1 a html:Text ; 
 html:fragment "banana" .

doc:37.16 a html:HeaderCell ; 
 rdf:_1 doc:37.16.1 .

doc:37.16.1 a html:Text ; 
 html:fragment "orange" .

doc:38.16 a html:HeaderCell ; 
 rdf:_1 doc:38.16.1 .

doc:38.16.1 a html:Text ; 
 html:fragment "apple" .

doc:4.4 a html:Title ; 
 rdf:_1 doc:4.4.1 . 

doc:4.4.1 a html:Text ; 
 html:fragment "Tutorial Document Example" . 

doc:41.8 a html:TableBody ; 
 rdf:_1 doc:42.12 ; 
 rdf:_2 doc:47.12 . 

doc:42.12 a html:Row ; 
 rdf:_1 doc:43.16 ; 
 rdf:_2 doc:44.16 ; 
 rdf:_3 doc:45.16 .

doc:43.16 a html:DataCell ; 
 rdf:_1 doc:43.16.1 .

doc:43.16.1 a html:Text ;
 html:fragment "1" .

doc:44.16 a html:DataCell ; 
 rdf:_1 doc:44.16.1 . 

doc:44.16.1 a html:Text ; 
 html:fragment "2" . 

doc:45.16 a html:DataCell ; 
 rdf:_1 doc:45.16.1 . 

doc:45.16.1 a html:Text ; 
 html:fragment "3" . 

doc:47.12 a html:Row ; 
 rdf:_1 doc:48.16 ; 
 rdf:_2 doc:49.16 ; 
 rdf:_3 doc:50.16 . 

doc:48.16 a html:DataCell ;
 rdf:_1 doc:48.16.1 .

doc:48.16.1 a html:Text ;
 html:fragment "a" . 

doc:49.16 a html:DataCell ;
 rdf:_1 doc:49.16.1 . 

doc:49.16.1 a html:Text ; 
 html:fragment "b" . 

doc:5.4 a html:StyleSheet ; 
 rdf:_1 doc:5.4.1 . 

doc:5.4.1 a html:Text ; 
 html:fragment """\r table {\r width: 70%;\r margin: 0 auto;\r border-collapse: collapse;\r }\r \r caption {\r text-align: left;\r font-weight: bold;\r padding: 10px;\r background-color: #f2f2f2; /* Light gray */\r }\r \r th, td {\r padding: 12px;\r text-align: center;\r border: 1px solid #ddd; /* Light gray border */\r }\r \r th {\r background-color: #4CAF50; /* Green */\r color: white;\r }\r """ . 

doc:50.16 a html:DataCell ;
 rdf:_1 doc:50.16.1 . 

doc:50.16.1 a html:Text ; 
 html:fragment "c" .
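
The shape of the listing above (one typed node per element, children ordered with rdf:_1, rdf:_2, ..., and text held in html:fragment) can be illustrated with a small script. The following is only a minimal sketch using Python's standard library, not the HTML2RDF tool from this repository. The class names are taken from the listing; the node identifiers (doc:n1, doc:n2, ...) and the names TAG_CLASSES, TripleCollector and html_to_triples are hypothetical, and the sketch skips the doctype, attributes, unknown tags and whitespace-only text.

```python
from html.parser import HTMLParser

# Tag-to-class mapping as it appears in the listing above.
# Tags that are not listed here are skipped in this sketch.
TAG_CLASSES = {
    "html": "html:Html", "head": "html:Head", "title": "html:Title",
    "style": "html:StyleSheet", "body": "html:Body", "table": "html:Table",
    "caption": "html:Caption", "thead": "html:TableHeader",
    "tbody": "html:TableBody", "tr": "html:Row",
    "th": "html:HeaderCell", "td": "html:DataCell",
}


class TripleCollector(HTMLParser):
    """Collects (subject, predicate, object) tuples while parsing HTML.

    Assumes well-formed nesting: every mapped start tag has an end tag.
    """

    def __init__(self):
        super().__init__()
        self.triples = []
        self.child_count = {}   # number of children seen so far, per node
        self.counter = 0        # hypothetical naming scheme: doc:n1, doc:n2, ...
        self.open_nodes = [self._new_node("html:Document")]

    def _new_node(self, rdf_class):
        self.counter += 1
        node = f"doc:n{self.counter}"
        self.child_count[node] = 0
        self.triples.append((node, "a", rdf_class))
        return node

    def _append_child(self, rdf_class):
        # Attach a new node to the innermost open element via rdf:_n.
        parent = self.open_nodes[-1]
        child = self._new_node(rdf_class)
        self.child_count[parent] += 1
        self.triples.append((parent, f"rdf:_{self.child_count[parent]}", child))
        return child

    def handle_starttag(self, tag, attrs):
        if tag in TAG_CLASSES:
            self.open_nodes.append(self._append_child(TAG_CLASSES[tag]))

    def handle_endtag(self, tag):
        if tag in TAG_CLASSES and len(self.open_nodes) > 1:
            self.open_nodes.pop()

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text between tags
            text = self._append_child("html:Text")
            self.triples.append((text, "html:fragment", data))


def html_to_triples(source):
    """Parse an HTML string and return the collected triples."""
    parser = TripleCollector()
    parser.feed(source)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    sample = "<table><tr><th>banana</th><th>orange</th></tr></table>"
    for subject, predicate, obj in html_to_triples(sample):
        print(subject, predicate, obj)
```

The tuples are plain Python values rather than serialized Turtle; turning them into a valid RDF serialization (escaping literals, minting real IRIs) is left out on purpose.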

Note how each element in the HTML-document is identified by a unique identifier, the IRI (Internationalized Resource Identifier). We can now address each element, or a combination of elements, and make statements about it: we can express meaning (RDF, RDFS, OWL and more), impose constraints (SHACL), or query the elements (SPARQL) to learn more about them.
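To make the idea of addressing elements by their IRI concrete, here is a minimal sketch that reads the header cells of the example table in document order. It assumes a subset of the triples from the listing, written as plain Python tuples, with rdf:type standing in for the Turtle keyword a. The helper names children and text_of are hypothetical. In practice a SPARQL query against a triple store would do this job; the script only illustrates how the rdf:_n properties carry the ordering.

```python
# A subset of the triples from the listing, as (subject, predicate, object)
# tuples. They are deliberately stored out of order to show that the
# ordering comes from the rdf:_n properties, not from the storage order.
TRIPLES = [
    ("doc:35.12", "rdf:_3", "doc:38.16"),
    ("doc:35.12", "rdf:type", "html:Row"),
    ("doc:35.12", "rdf:_1", "doc:36.16"),
    ("doc:35.12", "rdf:_2", "doc:37.16"),
    ("doc:36.16", "rdf:type", "html:HeaderCell"),
    ("doc:36.16", "rdf:_1", "doc:36.16.1"),
    ("doc:36.16.1", "html:fragment", "banana"),
    ("doc:37.16", "rdf:type", "html:HeaderCell"),
    ("doc:37.16", "rdf:_1", "doc:37.16.1"),
    ("doc:37.16.1", "html:fragment", "orange"),
    ("doc:38.16", "rdf:type", "html:HeaderCell"),
    ("doc:38.16", "rdf:_1", "doc:38.16.1"),
    ("doc:38.16.1", "html:fragment", "apple"),
]

MEMBER_PREFIX = "rdf:_"


def children(node):
    """Return the children of a node, sorted by their rdf:_n index."""
    members = [
        (int(predicate[len(MEMBER_PREFIX):]), obj)
        for subject, predicate, obj in TRIPLES
        if subject == node and predicate.startswith(MEMBER_PREFIX)
    ]
    return [obj for _, obj in sorted(members)]


def text_of(node):
    """Concatenate all html:fragment values at or below a node, in order."""
    own = "".join(
        obj for subject, predicate, obj in TRIPLES
        if subject == node and predicate == "html:fragment"
    )
    return own + "".join(text_of(child) for child in children(node))


# The header row of the example table is doc:35.12 in the listing.
header_texts = [text_of(cell) for cell in children("doc:35.12")]
print(header_texts)  # ['banana', 'orange', 'apple']
```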

Example #2: an application GUI as HTML-document

<!DOCTYPE html>
<html>
    <head>
        <title>
        </title>
        <style>body {font-family: Arial, Helvetica, sans-serif; background-color: black; } * {box-sizing: border-box;} /* Add padding to containers */.container {padding: 16px; background-color: white;} /* Full-width input fields */ input[type=text], input[type=password] {width: 100%; padding: 15px;  margin: 5px 0 22px 0; display: inline-block; border: none;  background: #f1f1f1;} input[type=text]:focus, input[type=password]:focus {  background-color: #ddd;  outline: none;} /* Overwrite default styles of hr */ hr {   border: 1px solid #f1f1f1;   margin-bottom: 25px;} /* Set a style for the submit button */ .registerbtn {   background-color: #04AA6D;   color: white;   padding: 16px 20px;   margin: 8px 0;   border:  none;  cursor: pointer;   width: 100%;   opacity: 0.9; } .registerbtn:hover {  opacity: 1;} /* Add a blue text color to links */ a {  color: dodgerblue;} /* Set a grey background color and center the text of the "sign in" section */ .signin {  background-color: #f1f1f1;  text-align: center;}
        </style>
        <meta http-equiv="Content-Type" charset="utf-8">
    </head>
    <body>
        <form action="action_page.php">
            <div class="container">
                <h1>Register</h1>
                <p>Please fill in this form to create an account.</p>
                <hr>
                <label for="email">
                    <b>email</b>
                </label>
                <input id="email" name="email" placeholder="Enter Email" required="true" type="text">
                <label for="psw">
                    <b>Password</b>
                </label>
                <input id="psw" name="psw" placeholder="Enter Password" required="true" type="password">
                <label for="psw-repeat">
                    <b>Repeat Password</b>
                </label>
                <input id="psw-repeat" name="psw-repeat" placeholder="Repeat Password" required="true" type="password">
                <hr>
                <p>By creating an account you agree to our
                    <a href="#">Terms & Privacy</a>.
                </p>
                <button class="registerbtn" type="submit">Register</button>
            </div>
            <div class="container signin">
                <p>Already have an account?
                    <a href="#">Sign in</a>.
                </p>
            </div>
        </form>
    </body>
</html>

This GUI with forms is rendered in a browser as follows:

[Figure: An example of an HTML-document with forms]

Again expressing the HTML-docu
