=============
Grammarinator
=============

ANTLRv4 grammar-based test generator

.. image:: https://img.shields.io/pypi/v/grammarinator?logo=python&logoColor=white
   :target: https://pypi.org/project/grammarinator/
.. image:: https://img.shields.io/pypi/l/grammarinator?logo=open-source-initiative&logoColor=white
   :target: https://pypi.org/project/grammarinator/
.. image:: https://img.shields.io/github/actions/workflow/status/renatahodovan/grammarinator/main.yml?branch=master&logo=github&logoColor=white
   :target: https://github.com/renatahodovan/grammarinator/actions
.. image:: https://img.shields.io/coveralls/github/renatahodovan/grammarinator/master?logo=coveralls&logoColor=white
   :target: https://coveralls.io/github/renatahodovan/grammarinator
.. image:: https://img.shields.io/readthedocs/grammarinator?logo=read-the-docs&logoColor=white
   :target: http://grammarinator.readthedocs.io/en/latest/

.. start included documentation

Grammarinator is a random test generator / fuzzer that creates test cases according to an input ANTLR_ v4 grammar. The motivation behind this grammar-based approach is to leverage the large variety of publicly available `ANTLR v4 grammars`_. It includes both a Python-based and a high-performance C++ backend for generation.

.. _ANTLR: http://www.antlr.org
.. _ANTLR v4 grammars: https://github.com/antlr/grammars-v4
.. _trophy page: https://github.com/renatahodovan/grammarinator/wiki

**TL;DR - KEY FEATURES**

Quick overview of the most important capabilities:

* Generate test cases from scratch based on `ANTLR v4 grammars`_ or
  mutate/recombine existing test cases after they have been parsed.

* Besides black-box test generation, supports guided fuzzing through
  native integration with libFuzzer_ and `AFL++`_.

* The AFL++ integration also enables grammar-aware test case
  minimization via the ``afl-tmin`` utility.

* Grammar-aware mutation and recombination without slowing down the
  fuzzing with parsing (using pre-parsed input seeds).

* Fine-grained probabilistic generation control via inline grammar
  weights or external JSON-based weight configurations (for alternatives
  and quantifiers).

* Support for inline semantic predicates in grammars to dynamically
  enable or disable grammar alternatives during generation.

* Multiple size-control strategies, including maximum recursion depth
  and maximum token count limits.

* Built-in caching to filter out duplicate generated inputs.

* Both grammar-aware and grammar-unaware mutators, with selective
  enablement and disabling support.

* Extensible serialization pipeline with custom serializers for
  formatting tree-based outputs into concrete test inputs.

* Advanced customization hooks:

  * custom models for programmatic decision guidance
  * custom listeners for information collection during generation
  * custom transformers for post-generation tree transformations

.. _libFuzzer: https://llvm.org/docs/LibFuzzer.html
.. _AFL++: https://aflplus.plus

Requirements
============

* Python_ >= 3.10
* Java_ SE >= 11 JRE or JDK (the latter is optional)

Additionally, for the C++ backend:

* C++20 compiler (e.g., GCC >= 11.0, Clang >= 13.0, MSVC >= 2019)
* CMake_ >= 3.10

.. _Python: https://www.python.org
.. _Java: https://www.oracle.com/java/
.. _CMake: https://cmake.org
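
The requirements listed above can be checked up front with a few lines of Python. The following is only a minimal sketch: the helper name ``check_requirements`` is made up for illustration, and it merely tests whether ``java`` and ``cmake`` executables are found on ``PATH`` without comparing their versions.

.. code-block:: python

    import shutil
    import sys

    # Minimum Python version taken from the Requirements list.
    MIN_PYTHON = (3, 10)


    def check_requirements(cxx_backend=False):
        """Return a dict mapping each requirement name to True or False.

        Only the Python version is compared. For Java and CMake this
        sketch just checks that an executable is present on PATH.
        """
        report = {
            'python': sys.version_info[:2] >= MIN_PYTHON,
            'java': shutil.which('java') is not None,
        }
        if cxx_backend:
            # CMake is only needed when building the C++ backend.
            report['cmake'] = shutil.which('cmake') is not None
        return report


    if __name__ == '__main__':
        for name, ok in check_requirements(cxx_backend=True).items():
            print(f'{name}: {"ok" if ok else "missing or too old"}')

A fuller check would also parse the output of ``java -version`` and ``cmake --version``, which is left out here to keep the sketch short.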

Install
=======

To use Grammarinator in another project, it can be added to ``setup.cfg`` as an install requirement (if using setuptools_ with declarative config):

.. code-block:: ini

    [options]
    install_requires =
        grammarinator

To install Grammarinator manually, e.g., into a virtual environment, use pip_::

    pip install grammarinator

The above approaches install the latest release of Grammarinator from PyPI_. Alternatively, for the development version, clone the project and perform a local install::

    pip install .

.. _setuptools: https://github.com/pypa/setuptools
.. _pip: https://pip.pypa.io
.. _PyPI: https://pypi.org/

Usage
=====

As a first step, Grammarinator takes an `ANTLR v4 grammar`_ and creates a test generator script in Python 3 or in C++. Grammarinator supports a subset of the ANTLR grammar features; the supported subset is described in the Grammar overview section of the documentation. The produced generator can be subclassed later to customize it further if needed.

Basic command-line syntax of test generator creation (Python or C++)::

    grammarinator-process <grammar-file(s)> -o <output-directory> --no-actions [--language hpp]

..

    **Notes**

    *Grammarinator* uses the `ANTLR v4 grammar`_ format as its input, which
    makes existing grammars (lexer and parser rules) easily reusable. However,
    because of the inherently different goals of a fuzzer and a parser, inlined
    code (actions and conditions, header and members blocks) is most probably
    not reusable and may even prevent proper execution. For first experiments
    with existing grammar files, ``grammarinator-process`` supports the
    command-line option ``--no-actions``, which skips all such code blocks
    during fuzzer generation. Once inlined code is tuned for fuzzing, that
    option may be omitted.

.. _ANTLR v4 grammar: https://github.com/antlr/grammars-v4
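
When the generator creation step is scripted, the command line shown above can be assembled programmatically. The sketch below only uses the options that appear in the syntax line (``-o``, ``--no-actions``, ``--language``). The helper name ``build_process_command``, the grammar file names, and the output directory are hypothetical.

.. code-block:: python

    import shlex


    def build_process_command(grammars, out_dir, no_actions=True, language=None):
        """Assemble the argument list for ``grammarinator-process``."""
        cmd = ['grammarinator-process', *grammars, '-o', out_dir]
        if no_actions:
            # Skip inlined code blocks, as suggested for first experiments.
            cmd.append('--no-actions')
        if language is not None:
            # For example 'hpp' to request the C++ backend.
            cmd += ['--language', language]
        return cmd


    # Hypothetical grammar files and output directory.
    cmd = build_process_command(['HTMLLexer.g4', 'HTMLParser.g4'], 'fuzzer')
    print(shlex.join(cmd))

The resulting list can be passed to ``subprocess.run(cmd, check=True)`` once Grammarinator is installed.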

Python-based Test Generation
----------------------------

After a fuzzer has been generated and optionally customized, it can be executed with the ``grammarinator-generate`` script (or by instantiating it manually in a custom-written driver).

Basic command-line syntax of ``grammarinator-generate``::

    grammarinator-generate <generator> \
      -r <start-rule> -d <max-depth> \
      -o <output-pattern> -n <number-of-tests> \
      -t <transformer1> -t <transformer2>
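
The same invocation can be built from Python, which makes the repeated ``-t`` option easier to handle. This is a sketch under assumptions: the helper name, the generator reference, the start rule, and the transformer name are all hypothetical, and the ``%d`` placeholder in the output pattern is assumed rather than confirmed by the text above.

.. code-block:: python

    def build_generate_command(generator, start_rule, max_depth, out_pattern,
                               count, transformers=()):
        """Assemble the argument list for ``grammarinator-generate``."""
        cmd = ['grammarinator-generate', generator,
               '-r', start_rule, '-d', str(max_depth),
               '-o', out_pattern, '-n', str(count)]
        # Each transformer is passed with its own -t option.
        for transformer in transformers:
            cmd += ['-t', transformer]
        return cmd


    # All names below are placeholders for illustration only.
    cmd = build_generate_command(
        generator='HTMLGenerator.HTMLGenerator',
        start_rule='htmlDocument',
        max_depth=20,
        out_pattern='tests/test_%d.html',
        count=100,
        transformers=['mypackage.my_transformer'],
    )
    print(' '.join(cmd))

Keeping the arguments as a list avoids shell quoting issues when the command is later handed to ``subprocess.run``.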

C++-based Test Generation
-------------------------

After generating the C++-based fuzzer using ``grammarinator-process`` with the ``--language hpp`` flag, it needs to be built::

    python3 grammarinator-cxx/dev/build.py --clean \
        --generator <generator> \
        --includedir <include-dir> \
        --tools

Once built, the standalone generator can be run as follows::

    grammarinator-cxx/build/Release/bin/grammarinator-generate-<name> \
        -r <start-rule> -d <max-depth> \
        -o <output-pattern> -n <number-of-tests>

Note: The C++ backend can also be used as a custom mutator with libFuzzer. Details about this are provided in the LibFuzzer Integration section of the documentation.

Evolutionary Generation
-----------------------

Besides generating test cases from scratch based on the ANTLR grammar, Grammarinator is also able to recombine existing inputs or mutate only a small portion of them. To use these additional generation approaches, a population of selected test cases has to be prepared. The preparation is done with the ``grammarinator-parse`` tool, which processes the input files with an ANTLR grammar (possibly the same one as the generator grammar) and builds Grammarinator tree representations from them (with ``.grt*`` extension). These files encode the full derivation tree of the input and can be reused across different fuzzing strategies.

Basic command-line syntax of ``grammarinator-parse``::

    grammarinator-parse -g <grammar-file(s)> -r <start-rule> \
      -o <output-directory> <input_file(s)>

Given a population of such ``.grt*`` files, ``grammarinator-generate`` or ``grammarinator-generate-<name>`` can make use of them via the ``--population`` CLI option. If ``--population`` is set (for the Python or C++ generator), Grammarinator chooses a strategy (generation, mutation, or recombination) randomly for each new test case. Any unwanted strategy can be disabled with the ``--no-generate``, ``--no-mutate``, or ``--no-recombine`` option.
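
The per-test-case strategy choice described above can be pictured with a short sketch. It is a conceptual illustration only, not Grammarinator's implementation: the function names are made up, and the uniform random choice among the enabled strategies is an assumption.

.. code-block:: python

    import random


    def enabled_strategies(has_population, no_generate=False, no_mutate=False,
                           no_recombine=False):
        """List the strategies that remain after applying the switches."""
        enabled = []
        if not no_generate:
            enabled.append('generate')
        # Mutation and recombination need a prepared population of trees.
        if has_population and not no_mutate:
            enabled.append('mutate')
        if has_population and not no_recombine:
            enabled.append('recombine')
        return enabled


    def pick_strategy(enabled, rng):
        """Pick one strategy for the next test case (uniform, by assumption)."""
        if not enabled:
            raise ValueError('all strategies are disabled')
        return rng.choice(enabled)


    rng = random.Random(0)
    # Mirrors running with --population and --no-recombine.
    enabled = enabled_strategies(has_population=True, no_recombine=True)
    picks = [pick_strategy(enabled, rng) for _ in range(5)]
    print(picks)

Every entry in ``picks`` is either ``'generate'`` or ``'mutate'``, since recombination was switched off.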

..

    **Notes**

    Real-life grammars often use recursive rules to express certain patterns.
    However, when using such rule(s) for generation, we can easily end up in an
    unexpectedly deep call stack. With the ``--max-depth`` or ``-d`` options,
    this depth - and also the size of the generated test cases - can be
    controlled.
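
The effect of a depth limit on a recursive rule can be illustrated with a toy generator. The sketch below is not Grammarinator code. It hand-writes a generator for the hypothetical rule ``expr : NUMBER | '(' expr '+' expr ')'`` and only offers the recursive alternative while depth budget remains.

.. code-block:: python

    import random


    def gen_expr(max_depth, rng):
        """Generate a string for: expr : NUMBER | '(' expr '+' expr ')'."""
        if max_depth < 1:
            raise ValueError('max_depth must be at least 1')
        # With a budget of 1, only the non-recursive alternative fits.
        if max_depth == 1 or rng.random() < 0.5:
            return str(rng.randint(0, 9))
        left = gen_expr(max_depth - 1, rng)
        right = gen_expr(max_depth - 1, rng)
        return f'({left} + {right})'


    def nesting(text):
        """Return the deepest parenthesis nesting level found in text."""
        depth = deepest = 0
        for char in text:
            if char == '(':
                depth += 1
                deepest = max(deepest, depth)
            elif char == ')':
                depth -= 1
        return deepest


    rng = random.Random(42)
    for limit in (1, 3, 6):
        sample = gen_expr(limit, rng)
        print(limit, nesting(sample), sample)

In this sketch the nesting of a generated expression never exceeds ``max_depth - 1``, so a smaller limit bounds both the call stack and the size of the output.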

    Another specialty of the ANTLR grammars is that they support so-called
    hidden tokens. These rules typi