Tstl

Template Scripting Testing Language tool: automated test generation for Python

TSTL: the Template Scripting Testing Language

TSTL is a domain-specific language (DSL) and set of tools to support automated generation of tests for software. This implementation targets Python. You define (in Python) a set of components used to build up a test, and any properties you want to hold for the tested system, and TSTL generates tests for your system. TSTL supports test replay, test reduction, and code coverage analysis, and includes push-button support for some sophisticated test-generation methods. In other words, TSTL is a property-based testing tool.

What is property-based testing? Property-based testing relies not on developers specifying results for specific inputs or call sequences, but on a more general specification of behavior, combined with automatic generation of many tests to check that the general specification holds. For more on property-based testing see:

https://fsharpforfunandprofit.com/posts/property-based-testing/
https://hypothesis.works/articles/what-is-property-based-testing/
https://github.com/trailofbits/deepstate (a tool mixing symbolic analysis and fuzzing with property-based testing, for C and C++, with design somewhat informed by TSTL)
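
To make the idea concrete, here is a minimal hand-rolled sketch in plain Python. It does not use TSTL; the function under test (dedupe_sorted) and the helper names are hypothetical, chosen only for illustration. Instead of asserting one expected output for one input, it states two general properties and checks them on many randomly generated inputs.

```python
import random


def dedupe_sorted(items):
    """Hypothetical function under test: distinct items in ascending order."""
    return sorted(set(items))


def check_properties(items):
    """General properties that must hold for any input list."""
    result = dedupe_sorted(items)
    # Property 1: the output is strictly increasing (sorted, no duplicates).
    assert all(a < b for a, b in zip(result, result[1:]))
    # Property 2: the output has exactly the distinct elements of the input.
    assert set(result) == set(items)


def run_random_tests(num_tests=1000, seed=0):
    """Generate many random inputs and check the properties on each one."""
    rng = random.Random(seed)
    for _ in range(num_tests):
        size = rng.randint(0, 20)
        items = [rng.randint(-50, 50) for _ in range(size)]
        check_properties(items)
    return num_tests


if __name__ == "__main__":
    print("checked", run_random_tests(), "random inputs")
```

A tool such as TSTL automates the generation, replay, and reduction parts of this loop, so you only write the components and the properties.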

TSTL has been used to find and fix real faults in real code, including ESRI's ArcPy (http://desktop.arcgis.com/en/arcmap/latest/analyze/arcpy/what-is-arcpy-.htm), sortedcontainers (https://github.com/grantjenks/sorted_containers), gmpy2 (https://github.com/aleaxit/gmpy), sympy (http://www.sympy.org/en/index.html), pyfakefs (https://github.com/jmcgeheeiv/pyfakefs), Python itself (https://bugs.python.org/issue27870), the Solidity compiler (https://github.com/ethereum/solidity), a Solidity static analysis tool (https://github.com/crytic/slither), the Vyper compiler (e.g. https://github.com/ethereum/vyper/issues/1658), and even OS X.

Installation

The easiest way to get a recent version of TSTL is with pip: pip install tstl should work fine. If you want something even more recent, install from source:

git clone https://github.com/agroce/tstl.git
cd tstl
python setup.py install

For code coverage, you will also need to install Ned Batchelder's coverage.py tool; pip install coverage is all that is needed.

TSTL in a Nutshell

To get an idea of how TSTL operates, let's try a toy example. We will use TSTL to solve a simple "puzzle": starting from 0, is it possible to produce the integer value 510 with a few lines of Python code drawn from a small set of operations (add 4, subtract 3, multiply by 3, and produce a power of two)?

Create a file called nutshell.tstl with the following content:

@import math

# A line beginning with an @ is just python code.

pool: <int> 5

# A pool is a set of values we'll produce and use in testing.
# We need some integers, and we'll let TSTL produce up to 5 of them.
# The name is a variable name, basically, but often will be like a
# type name, showing how the value is used.

<int> := 0
<int> += 4
<int> -= 3
<int> *= 3
{OverflowError} <int> := int(math.pow(2,<int>))

# These are actions, basically single lines of Python code.
# The big changes from normal Python are:
# 1. := is like Python assignment with =, but also tells TSTL this
# assignment _initializes_ a value.
# 2. <int> is a placeholder meaning _any_ int value in the pool.
# 3. {OverflowError} means that we want to ignore if this line of
# Python produces an uncaught OverflowError exception.

# A test in TSTL is a sequence of actions.  So, given the above, one
# test would be:
#
# int3 = 0
# int4 = 0
# int3 *= 3
# int4 += 4
# int3 = 0
# int2 = int(math.pow(2,int4))
# int2 -= 3

# As you can see, the actions can appear in any order, but every
# pool variable is first initialized by some := assignment.
# Similarly, TSTL may use pool variables in an arbitrary order;
# thus we never see int0 or int1 used, here, by chance.

# The size of the int pool determines how many different ints can
# appear in such a test.  You can think of it as TSTL's "working
# memory."  If you have a pool size of 1, and an action like
# foo(<int>,<int>) you'll always call foo with the same value for both
# parameters -- like foo(int0,int0).  You should always have a pool
# size at least as large as the number of times you use a pool in a
# single action.  More is often better, to give TSTL more ability to
# bring back in earlier computed values.

property: <int> != 510

# property: expresses an invariant of what we are testing.  If the
# boolean expression evaluates to False, the test has failed.

As in normal Python, # indicates a comment. In this file, each block of comment lines follows the TSTL code it describes.

Type tstl nutshell.tstl.
Type tstl_rt --normalize --output nutshell.test.

This should, in a few seconds, find a way to violate the property (produce the value 510), find a maximally simple version of that "failing test", and produce a file nutshell.test that contains the test. If we had omitted the {OverflowError}, TSTL would either have found a way to produce 510 or (less likely) found a way to produce an overflow in the pow call; either would be considered a failure.

Type tstl_replay nutshell.test --verbose.

This will replay the test you just created.

Comment out (using # as usual in Python code) the line <int> -= 3. Recompile with tstl nutshell.tstl so that sut.py reflects the change, then try running tstl_rt again.

The core idea of TSTL is to define a set of possible steps in a test, plus properties describing what can be considered a test failure, and let TSTL find out if there exists a sequence of actions that will produce a test failure. The actions may be function or method calls, or steps that assemble input data (for example, building up a string to pass to a parser), or, really, anything you can do with Python.
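
The loop below is a rough plain-Python sketch of that idea for the nutshell puzzle. It is not how TSTL is implemented: the function names, the way actions are chosen, and the step encoding are all assumptions made for illustration. It keeps a pool of five integers, repeatedly applies a randomly chosen action, and stops as soon as the property (no pool value equals 510) is violated.

```python
import math
import random

POOL_SIZE = 5  # mirrors "pool: <int> 5"


def apply_action(pool, action, i, src=None):
    """Apply one action to pool slot i (src is the operand slot for pow2)."""
    if action == "init":
        pool[i] = 0
    elif action == "add4":
        pool[i] += 4
    elif action == "sub3":
        pool[i] -= 3
    elif action == "mul3":
        pool[i] *= 3
    elif action == "pow2":
        try:
            pool[i] = int(math.pow(2, pool[src]))
        except OverflowError:
            pass  # like {OverflowError}: expected, not a test failure


def random_test(rng, max_steps=100):
    """Run one random test; return its steps if the property fails, else None."""
    pool = [None] * POOL_SIZE  # None means "not yet initialized"
    steps = []
    for _ in range(max_steps):
        i = rng.randrange(POOL_SIZE)
        ready = [j for j, v in enumerate(pool) if v is not None]
        choices = ["init"]  # := 0 is always allowed
        if pool[i] is not None:
            choices += ["add4", "sub3", "mul3"]
        if ready:
            choices.append("pow2")
        action = rng.choice(choices)
        src = rng.choice(ready) if action == "pow2" else None
        steps.append((action, i, src))
        apply_action(pool, action, i, src)
        if any(v == 510 for v in pool):  # property: <int> != 510
            return steps
    return None


def replay(steps):
    """Re-run a saved list of steps and return the final pool."""
    pool = [None] * POOL_SIZE
    for action, i, src in steps:
        apply_action(pool, action, i, src)
    return pool


def find_failure(seed=0, max_tests=2000):
    """Try up to max_tests random tests; return the first failing one, if any."""
    rng = random.Random(seed)
    for _ in range(max_tests):
        steps = random_test(rng)
        if steps is not None:
            return steps
    return None


if __name__ == "__main__":
    failing = find_failure()
    print("no failure found" if failing is None else failing)
```

Because the steps are recorded as data, a failing test can be replayed exactly, which is the same separation TSTL makes between generating a test and replaying it. Whether this sketch finds 510 within its budget depends on the seed.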

Using TSTL

TSTL installs a few standard tools: the TSTL compiler itself, tstl; a random test generator tstl_rt; a tool for producing standalone tests, tstl_standalone; a tool for replaying TSTL test files, tstl_replay; a tool for delta-debugging and normalization of TSTL tests, tstl_reduce; and a tool for running a set of tests as a regression, tstl_regress.

You can do most of what you'll need with just the commands tstl, tstl_rt, tstl_replay, and tstl_reduce.

tstl <filename.tstl> compiles a .tstl file into an sut.py interface for testing
tstl_rt runs random testing on the sut.py in the current directory, and dumps any discovered faults into .test files
tstl_replay <filename.test> runs a saved TSTL test, and tells you if it passes or fails; with --verbose it provides a fairly detailed trace of the test execution
tstl_reduce <filename.test> <newfilename.test> takes <filename.test>, runs reduction and normalization on it to produce a shorter, easier to understand test, and saves the output as <newfilename.test>.

All of these tools offer a large number of configuration options; --help will produce a list of supported options for all TSTL tools.
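
To give a feel for what test reduction does, here is a deliberately simplified sketch in Python. It is not tstl_reduce's algorithm: instead of full delta debugging it removes one step at a time and keeps any shorter test that still fails. The step names and the fails predicate are hypothetical, standing in for "replay the test and see whether it still fails".

```python
def reduce_test(steps, fails):
    """Drop steps one at a time for as long as the shortened test still fails."""
    assert fails(steps), "reduction needs a failing test to start from"
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                # The step at position i was not needed to trigger the failure.
                current = candidate
                changed = True
                break
    return current


def fails(steps):
    """Hypothetical bug: the test fails if insert(3) happens before delete(3)."""
    if "insert(3)" not in steps:
        return False
    first_insert = steps.index("insert(3)")
    return "delete(3)" in steps[first_insert + 1:]


if __name__ == "__main__":
    long_test = ["find(7)", "insert(3)", "insert(9)", "find(3)",
                 "delete(3)", "display()"]
    print(reduce_test(long_test, fails))
```

The result is 1-minimal with respect to step removal: taking out any single remaining step makes the failure disappear.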

Extended Example

The easiest way to understand TSTL may be to examine examples/AVL/avlnew.tstl (https://github.com/agroce/tstl/blob/master/examples/AVL/avlnew.tstl), which is a simple example file in the latest language format.

avlnew.tstl creates a pretty full-featured tester for an AVL tree class. You can write something very quick and fairly effective with just a few lines of code, however:

@import avl
pool: <int> 3
pool: <avl> 2

property: <avl>.check_balanced()

<int> := <[1..20]>
<avl> := avl.AVLTree()

<avl>.insert(<int>)
<avl>.delete(<int>)
<avl>.find(<int>)
<avl>.display()

This says that there are two kinds of "things" involved in our AVL tree implementation testing: int and avl. We define, in Python, how to create these things, and what we can do with these things, and then TSTL produces sequences of actions, that is tests, that match our definition. TSTL also checks that all AVL trees, at all times, are properly balanced. If we wanted, as in avlnew.tstl, we could also make sure that our AVL tree "acts like" a set --- when we insert something, we can find that thing, and when we delete something, we can no longer find it.
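
The "acts like a set" check can be sketched in plain Python as a comparison against a reference set. This is only an illustration of the idea, not the contents of avlnew.tstl: SortedListSet is a hypothetical stand-in for the container under test (the real avl module is not used here, and its find method may not return a boolean).

```python
import bisect
import random


class SortedListSet:
    """Hypothetical stand-in for the container under test (not the avl module)."""

    def __init__(self):
        self._items = []

    def insert(self, value):
        pos = bisect.bisect_left(self._items, value)
        if pos == len(self._items) or self._items[pos] != value:
            self._items.insert(pos, value)

    def delete(self, value):
        pos = bisect.bisect_left(self._items, value)
        if pos < len(self._items) and self._items[pos] == value:
            del self._items[pos]

    def find(self, value):
        pos = bisect.bisect_left(self._items, value)
        return pos < len(self._items) and self._items[pos] == value


def run_model_test(rng, steps=100):
    """Apply random actions to the container and to a reference set in lockstep."""
    sut = SortedListSet()
    model = set()
    for _ in range(steps):
        value = rng.randint(1, 20)  # mirrors <int> := <[1..20]>
        action = rng.choice(["insert", "delete", "find"])
        if action == "insert":
            sut.insert(value)
            model.add(value)
        elif action == "delete":
            sut.delete(value)
            model.discard(value)
        # After every step, membership must agree with the reference set.
        assert sut.find(value) == (value in model), (action, value)
    return True


if __name__ == "__main__":
    print(run_model_test(random.Random(0)))
```

Checking against a trusted reference catches wrong answers, while an invariant such as check_balanced() catches broken internal structure; the two kinds of property complement each other.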

Note that we start with "raw Python" to import the avl module, the SUT. While TSTL supports using from, aliases, and wildcards in imports, you should always import the module(s) under test with a simple import. This allows TSTL to identify the code to be tested and automatically provide coverage, static analysis-aided testing methods, and proper module management. Utility code in the standard library, on the other hand, can be imported any way you wish.

If we test this (or avlnew.tstl) for 30 seconds, something like this will appear:

~/tstl/examples/AVL$ tstl_rt --timeout 30

Random testing using config=Config(swarmSwitch=None, verbose=False, fastQuickAnalysis=False, failedLogging=None, maxtests=-1, greedyStutter=False, exploit=None, seed=None, generalize=False, localize=False, uncaught=False, speed='FAST', internal=False, normalize=False, highLowSwarm=None, replayable=False, essentials=False, quickTests=False, coverfile='coverage.out', uniqueValuesAnalysis=False, swarm=False, ignoreprops=False, total=False, swarmLength=None, noreassign=False, profile=False, full=False, multiple=False, relax=False, swarmP=0.5, stutter=None, running=False, compareFails=False, nocover=False, swarmProbs=None, gendepth=None, quickAnalysis=False, exploitCeiling=0.1, logging=None, html=None, keep=False, depth=100, throughput=False, timeout=30, output=None, markov=None, startExploit=0)
  12 [2:0]
-- < 2 [1:0]
---- < 1 [0:0] L
---- > 5 [0:0] L
-- > 13 [1:-1]
---- > 14 [0:0] L
set([1, 2, 5, 12, 13, 14])
...
  11 [2:0]
-- < 5 [1:0]
---- < 1 [0:0] L
---- > 9 [0:0] L
-- > 14 [1:-1]
---- > 18 [0:0] L
set([1, 5, 9, 11, 14, 18

agroce

GitHub Stars: 111

Category: Development

Updated: 27d ago

Forks: 25

agroce/tstl

Languages

Python

Security Score

85/100

Audited on Feb 26, 2026

No findings