pyVHDLParser

This is a token-stream-based parser for VHDL-2008.

This project requires Python 3.8+.

Introduction

Main Goals

  • Parsing
    • slice an input document into tokens and categorized text blocks
    • preserve case, whitespace and comments
    • recover from parsing errors
    • report errors clearly and raise exceptions
  • Fast Processing
    • multi-pass parsing and analysis
    • delay analysis that the current pass does not need
    • link tokens and blocks for fast-forward scanning
  • Generic VHDL Language Model
    • assemble a document object model (Code-DOM)
    • provide an API for code introspection
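
Two of these goals, preserving the input exactly and linking tokens for fast-forward scanning, can be illustrated with a small sketch. This is not pyVHDLParser's implementation: the `Token` class, `link_tokens`, `walk` and `fast_forward` are hypothetical names, and the regular expression is a crude stand-in for a real VHDL tokenizer (it does not recognize comments as units).

```python
import re
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Token:
    """Hypothetical token: a text slice plus links to its neighbours."""
    value: str
    previous: Optional["Token"] = field(default=None, repr=False)
    next: Optional["Token"] = field(default=None, repr=False)


def link_tokens(text: str) -> Optional[Token]:
    """Split text into words, whitespace runs and single characters, and chain them."""
    head = last = None
    for match in re.finditer(r"\w+|\s+|.", text, re.DOTALL):
        token = Token(match.group(), previous=last)
        if last is None:
            head = token
        else:
            last.next = token
        last = token
    return head


def walk(head: Optional[Token]) -> Iterator[Token]:
    """Follow the forward links from head to the end of the chain."""
    token = head
    while token is not None:
        yield token
        token = token.next


def fast_forward(token: Optional[Token], value: str) -> Optional[Token]:
    """Skip ahead along the links until a token with the given text is found."""
    while token is not None and token.value != value:
        token = token.next
    return token


source = "ENTITY  myEntity IS -- keep me\nEND entity;\n"
head = link_tokens(source)

# Nothing is dropped or normalized, so joining the tokens restores the input.
rebuilt = "".join(token.value for token in walk(head))
semicolon = fast_forward(head, ";")
```

Because every character ends up in exactly one token, case, whitespace and comment text survive the round trip, and the links let a later pass jump between tokens without re-reading the text.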

Use Cases

  • generate documentation by using the fast-forward scanner
  • generate a document/language model by using the grouped text-block scanner
  • extract compile orders and other dependency graphs
  • generate syntax highlighting
  • re-associate documentation comments with the objects they describe, for documentation extraction
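
As an illustration of the compile-order use case, the sketch below sorts a hand-written dependency graph topologically. It does not use pyVHDLParser: the `dependencies` dictionary is a hypothetical result that an earlier parsing pass might produce, and `graphlib` is part of the Python standard library only from Python 3.9 onwards.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical output of an earlier pass: each design unit is mapped to the
# set of units that must be compiled before it. The data is hand-written.
dependencies = {
    "entity myEntity": set(),
    "architecture rtl of myEntity": {"entity myEntity"},
    "package pkg0": set(),
    "package body pkg0": {"package pkg0"},
}

# static_order() yields every unit after all of its dependencies.
compile_order = list(TopologicalSorter(dependencies).static_order())

for unit in compile_order:
    print(unit)
```

The relative order of independent units is not fixed by the dependencies; only the "dependency first" constraint is guaranteed.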

Parsing approach

  1. slice an input document into tokens
  2. assemble tokens into categorized text blocks
  3. assemble text blocks into groups for fast-forward scanning
  4. translate groups into a document-object-model (DOM)
  5. provide a generic VHDL language model
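
The first three steps can be pictured as chained generators, where each stage consumes the previous one lazily. The sketch below is an assumption-laden simplification, not the project's code: `tokenize`, `to_blocks` and `to_groups` are hypothetical names, blocks simply end at a semicolon, and a group is just a block labelled with its first keyword.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def tokenize(text: str) -> Iterator[str]:
    """Step 1 (simplified): slice the text into words, whitespace and characters."""
    yield from re.findall(r"\w+|\s+|.", text, re.DOTALL)


def to_blocks(tokens: Iterable[str]) -> Iterator[List[str]]:
    """Step 2 (simplified): collect tokens into blocks that end at ';'."""
    block: List[str] = []
    for token in tokens:
        block.append(token)
        if token == ";":
            yield block
            block = []
    if block:  # trailing tokens after the last ';'
        yield block


def to_groups(blocks: Iterable[List[str]]) -> Iterator[Tuple[str, str]]:
    """Step 3 (simplified): label each block with its first non-whitespace token."""
    for block in blocks:
        words = [token for token in block if not token.isspace()]
        kind = words[0].lower() if words else "whitespace"
        yield kind, "".join(block)


source = "library IEEE;\nuse IEEE.std_logic_1164.all;\n"
groups = list(to_groups(to_blocks(tokenize(source))))

for kind, text in groups:
    print(f"{kind:<10} {text!r}")
```

Each stage only sees the output of the one before it, so a later pass can stop early or skip groups it does not care about without re-tokenizing the input.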

Long-term goals

  • A Sphinx language plugin for VHDL

    TODO: Move the following documentation to ReadTheDocs and replace it with a more lightweight version.

Basic Concept

Example 1

This is an input file:

-- Copryright 2016
library IEEE;
use     IEEE.std_logic_1164.all;

entity myEntity is
  generic (
    BITS : positive := 8
  );
  port (
    Clock   : in   std_logic;
    Output  : out  std_logic_vector(BITS - 1 downto 0)
  );
end entity;

architecture rtl of myEntity is
  constant const0 : integer := 5;
begin
  process(Clock)
  begin
  end process;
end architecture;

library IEEE, PoC;
use     PoC.Utils.all, PoC.Common.all;

package pkg0 is
  function func0(a : integer) return string;
end package;

package body Components is
  function func0(a : integer) return string is
    procedure proc0 is
    begin
    end procedure;
  begin
  end function;
end package body;

Step 1

The input file (a stream of characters) is translated into a stream of basic tokens:

  • StartOfDocumentToken
  • LinebreakToken
  • SpaceToken
    • IndentationToken
  • WordToken
  • CharacterToken
    • FusedCharacterToken
  • CommentToken
    • SingleLineCommentToken
    • MultiLineCommentToken
  • EndOfDocumentToken
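
A rough sketch of how such a token hierarchy and a position-tracking tokenizer could look is given below. It is an illustration under simplifying assumptions, not the library's implementation: the class names are taken from the list above, but the fields (`value`, `row`, `column`), the regular expression and the `tokenize` function are hypothetical, and the start/end-of-document tokens and multi-line comments are left out.

```python
import re
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Token:
    value: str
    row: int      # 1-based line number
    column: int   # 1-based column number


class LinebreakToken(Token): pass
class SpaceToken(Token): pass
class IndentationToken(SpaceToken): pass          # whitespace at line start
class WordToken(Token): pass
class CharacterToken(Token): pass
class FusedCharacterToken(CharacterToken): pass   # e.g. ':=' or '<='
class CommentToken(Token): pass
class SingleLineCommentToken(CommentToken): pass


# Alternatives are tried left to right; the first match wins.
_PATTERN = re.compile(r"""
    (?P<comment>--[^\n]*\n?)        |
    (?P<linebreak>\n)               |
    (?P<space>[ \t]+)               |
    (?P<word>\w+)                   |
    (?P<fused>:=|<=|>=|=>|/=|\*\*)  |
    (?P<char>.)
""", re.VERBOSE)

_CLASSES = {
    "comment": SingleLineCommentToken,
    "linebreak": LinebreakToken,
    "word": WordToken,
    "fused": FusedCharacterToken,
    "char": CharacterToken,
}


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens with their row:column position."""
    row, column, at_line_start = 1, 1, True
    for match in _PATTERN.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "space":
            cls = IndentationToken if at_line_start else SpaceToken
        else:
            cls = _CLASSES[kind]
        yield cls(value, row, column)
        if value.endswith("\n"):
            row, column, at_line_start = row + 1, 1, True
        else:
            column += len(value)
            at_line_start = False


source = "-- header\nuse IEEE.std_logic_1164.all;\n  BITS : positive := 8\n"
tokens = list(tokenize(source))

for token in tokens:
    print(f"<{type(token).__name__:<24} {token.value!r} at {token.row}:{token.column}>")
```

In this sketch a single-line comment swallows its trailing line break and a number such as `8` is emitted as a `WordToken`, matching what the dump in this section shows.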

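For the input file above, the token stream looks like this. The dump appears to abbreviate some class names (SLCommentToken, IndentToken, FusedCharToken):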

<StartOfDocumentToken>
<SLCommentToken '-- Copryright 2016\n'  ................ at 1:1>
<WordToken      'library'  ............................. at 2:1>
<SpaceToken     ' '  ................................... at 2:8>
<WordToken      'IEEE'  ................................ at 2:9>
<CharacterToken ';'  ................................... at 2:13>
<LinebreakToken ---------------------------------------- at 2:14>
<WordToken      'use'  ................................. at 3:1>
<SpaceToken     '     '  ............................... at 3:4>
<WordToken      'IEEE'  ................................ at 3:9>
<CharacterToken '.'  ................................... at 3:13>
<WordToken      'std_logic_1164'  ...................... at 3:14>
<CharacterToken '.'  ................................... at 3:28>
<WordToken      'all'  ................................. at 3:29>
<CharacterToken ';'  ................................... at 3:32>
<LinebreakToken ---------------------------------------- at 3:33>
<LinebreakToken ---------------------------------------- at 4:1>
<WordToken      'entity'  .............................. at 5:1>
<SpaceToken     ' '  ................................... at 5:7>
<WordToken      'myEntity'  ............................ at 5:8>
<SpaceToken     ' '  ................................... at 5:16>
<WordToken      'is'  .................................. at 5:17>
<LinebreakToken ---------------------------------------- at 5:19>
<IndentToken    '\t'  .................................. at 6:1>
<WordToken      'generic'  ............................. at 6:2>
<SpaceToken     ' '  ................................... at 6:9>
<CharacterToken '('  ................................... at 6:10>
<LinebreakToken ---------------------------------------- at 6:11>
<IndentToken    '\t\t'  ................................ at 7:1>
<WordToken      'BITS'  ................................ at 7:3>
<SpaceToken     ' '  ................................... at 7:7>
<CharacterToken ':'  ................................... at 7:8>
<SpaceToken     ' '  ................................... at 7:9>
<WordToken      'positive'  ............................ at 7:10>
<SpaceToken     ' '  ................................... at 7:18>
<FusedCharToken ':='  .................................. at 7:19>
<SpaceToken     ' '  ................................... at 7:21>
<WordToken      '8'  ................................... at 7:22>
<LinebreakToken ---------------------------------------- at 7:23>
<IndentToken    '\t'  .................................. at 8:1>
<CharacterToken ')'  ................................... at 8:2>
<CharacterToken ';'  ................................... at 8:3>
<LinebreakToken ---------------------------------------- at 8:4>
<IndentToken    '\t'  .................................. at 9:1>
<WordToken      'port'  ................................ at 9:2>
<SpaceToken     ' '  ................................... at 9:6>
<CharacterToken '('  ................................... at 9:7>
<LinebreakToken ---------------------------------------- at 9:8>
<IndentToken    '\t\t'  ................................ at 10:1>
<WordToken      'Clock'  ............................... at 10:3>
<SpaceToken     '   '  ................................. at 10:8>
<CharacterToken ':'  ................................... at 10:11>
<SpaceToken     ' '  ................................... at 10:12>
<WordToken      'in'  .................................. at 10:13>
<SpaceToken     '  '  .................................. at 10:15>
<WordToken      'std_logic'  ........................... at 10:17>
<CharacterToken ';'  ................................... at 10:26>
<LinebreakToken ---------------------------------------- at 10:27>
<IndentToken    '\t\t'  ................................ at 11:1>
<WordToken      'Output'  .............................. at 11:3>
<SpaceToken     '       '  ............................. at 11:9>
<CharacterToken ':'  ................................... at 11:10>
<SpaceToken     ' '  ................................... at 11:11>
<WordToken      'out'  ................................. at 11:12>
<SpaceToken     '       '  ............................. at 11:15>
<WordToken      'std_logic_vector'  .................... at 11:16>
<CharacterToken '('  ................................... at 11:32>
<WordToken      'BITS'  ................................ at 11:33>
<SpaceToken     ' '  ................................... at 11:37>
<CharacterToken '-'  ................................... at 11:38>
<SpaceToken     ' '  ................................... at 11:39>
<WordToken      '1'  ................................... at 11:40>
<SpaceToken     ' '  ................................... at 11:41>
<WordToken      'downto'  .............................. at 11:42>
<SpaceToken     ' '  ................................... at 11:48>
<WordT