Pydialect: build languages on Python
NOTE April 2021: This project is obsolete. This technology now lives on in mcpyrate, and the example dialects can be found in unpythonic.
```python
from __lang__ import lispython

def fact(n):
    def f(k, acc):
        if k == 1:
            return acc
        f(k - 1, k*acc)
    f(n, acc=1)
assert fact(4) == 24
print(fact(5000))
```
Pydialect makes Python into a language platform, à la Racket. It provides the plumbing that allows you to create, in Python, dialects that compile into Python at import time. Pydialect is geared toward creating languages that look almost like Python, but extend or modify its syntax and/or semantics. Hence, dialects.
As examples, we currently provide the following dialects:
- Lispython: Python with tail-call optimization (TCO), implicit return, multi-expression lambdas
- Pytkell: Python with automatic currying and lazy functions
- LisThEll: Python with prefix syntax and automatic currying
All three dialects support unpythonic's
continuations block macro (to add call/cc to the language), but do not enable it automatically.
Lispython aims at production quality; the others are intended just for testing.
Pydialect itself is only a lightweight infrastructure hook that makes
it convenient to define and use dialects. To implement the actual semantics
for your dialect (which is where all the interesting things happen), you may
want to look at MacroPy. Examples can be
found in unpythonic; see especially
the macros. On packaging a set of semantics into a dialect, look at the example
dialects; all three are thin wrappers around unpythonic.
Note that what Pydialect does is similar to the rejected PEP 511, but it works via import hooks, as indeed suggested in the rejection notice. Thus, besides dialects proper,
it is possible to use Pydialect to hook in a custom AST optimizer, by defining a dialect whose
ast_transformer is actually an optimizer. For ideas, see here.
Some possibilities are e.g. constant folding, hoisting, if optimization and loop unrolling.
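As an illustrative sketch (not part of Pydialect itself), a whole-module AST transformer that performs constant folding on binary operations could look like this, using only the stdlib `ast` module. The `ast_transformer` signature (list of AST nodes in, list out) follows the shape described below in "Defining a dialect":

```python
import ast
import operator

# Map AST operator node types to Python-level operations (a small subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

class ConstantFolder(ast.NodeTransformer):
    """Fold binary operations whose operands are literal constants."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first, bottom-up
        op = _OPS.get(type(node.op))
        if (op is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = op(node.left.value, node.right.value)
            except Exception:
                return node  # e.g. division by zero: leave it for run time
            return ast.copy_location(ast.Constant(value=value), node)
        return node

def ast_transformer(body):
    """Whole-module AST transformer in the shape Pydialect expects:
    a list of AST nodes in, a list of AST nodes out."""
    module = ast.Module(body=body, type_ignores=[])
    module = ast.fix_missing_locations(ConstantFolder().visit(module))
    return module.body

tree = ast.parse("x = 2 * 3 + 4")
tree.body = ast_transformer(tree.body)
print(ast.unparse(tree))  # → x = 10
```

(`ast.unparse` requires Python 3.9+; it is used here only to display the folded result.)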
Why dialects?
An extension to the Python language doesn't need to make it into the Python core, or even be desirable for inclusion into the Python core, in order to be useful.
Building on functions and syntactic macros, customization of the language itself is one more tool for the programmer to extract patterns at a higher level. Hence, besides language experimentation, such extensions can serve as a framework for shorter and/or more readable programs.
Pydialect places language-creation power in the hands of its users, without the need to go to extreme lengths to hack CPython itself or implement from scratch a custom language that compiles to Python AST or bytecode.
Pydialect dialects compile to Python and are implemented in Python, allowing the rest of the user program to benefit from new versions of Python, mostly orthogonally to the development of any dialect.
At its simplest, a custom dialect can alleviate the need to spam a combination
of block macros in every module of a project that uses a macro-based language
extension, such as unpythonic.syntax. Being named as a dialect, a particular
combination of macros becomes instantly recognizable,
and DRY:
the dialect definition becomes the only place in the codebase that defines
the macro combination to be used by each module in the project.
The same argument applies to custom builtins: any functions or macros that feel like they "should be" part of the language layer, so that they won't have to be explicitly imported in each module where they are used.
Using dialects
Place a lang-import at the start of your module that uses a dialect:
```python
from __lang__ import piethon
```
Run your program (in this example written in the piethon dialect)
through the pydialect bootstrapper instead of python3 directly,
so that the main program gets imported instead of run directly, to trigger
the import hook that performs the dialect processing. (Installing Pydialect
will install the bootstrapper.)
Any imported module that has a lang-import will be detected, and the appropriate dialect module (if and when found) will be invoked. The result is then sent to the macro expander (if MacroPy is installed and the code uses macros at that point), after which the final result is imported normally.
The lang-import must appear as the first statement of the module; only the module docstring is allowed to appear before it. This is to make it explicit that a dialect applies to the whole module. (Local changes to semantics are better represented as a block macro.)
At import time, the dialect importer replaces the lang-import with an
assignment that sets the module's __lang__ attribute to the dialect name,
for introspection. If a module does not have a __lang__ attribute at
run time, then it was not compiled by Pydialect. Note that just like with
MacroPy, at run time the code is pure Python.
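For example, the introspection check reduces to a `getattr` on the module object (shown here on a stdlib module, which of course was not compiled by Pydialect):

```python
# Any module lacking a __lang__ attribute was not compiled by Pydialect.
import math  # a stdlib module, compiled by standard Python

dialect = getattr(math, "__lang__", None)
assert dialect is None  # not a Pydialect-compiled module
print("dialect:", dialect)
```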
The lang-import is a construct specific to Pydialect. This ensures that the
module will immediately fail if run under standard Python, because there is
no actual module named __lang__.
If you use MacroPy, the Pydialect import hook must be installed at index 0
in sys.meta_path, so that the dialect importer triggers before MacroPy's
standard macro expander. The pydialect bootstrapper takes care of this.
If you need to enable Pydialect manually for some reason, the incantation
to install the hook is import dialects.activate.
The lang-import syntax was chosen as a close pythonic equivalent to Racket's
#lang foo.
Defining a dialect
In Pydialect, a dialect is any module that provides one or both of the following callables:

- `source_transformer`: source text -> source text

  The full source code of the module being imported (including the lang-import) is sent to the source transformer. The data type is whatever the loader's `get_source` returns, usually `str`.

  Source transformers are useful e.g. for defining custom infix operators. For example, the monadic bind syntax `a >>= b` could be made to transform into the syntax `a.__mbind__(b)`.

  Although the input is text, in practice a token-based approach is recommended; see stdlib's `tokenize` module as a base to work from. (Be sure to untokenize when done, because the next stage expects text.)

  After the source transformer, the source text must be valid surface syntax for standard Python, i.e. valid input for `ast.parse`.

- `ast_transformer`: `list` of AST nodes -> `list` of AST nodes

  After the source transformer, but before macro expansion, the full AST of the module being imported (minus the module docstring and the lang-import) is sent to this whole-module AST transformer. This allows injecting implicit imports to create builtins for the dialect, as well as e.g. lifting the whole module (except the docstring and the code to set `__lang__`) into a `with` block to apply some MacroPy block macro(s) to the whole module.

After the AST transformer, the module is sent to MacroPy for macro expansion (if MacroPy is installed, and the module has macros at that point), and after that, the result is finally imported normally.

The AST transformer can use MacroPy if it wants, but doesn't have to; this decision is left up to each developer implementing a dialect.
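Putting the two hooks together, a minimal dialect module might look like the following sketch. The regex-based rewrite of `a >>= b` and the injected `sqrt` builtin are toy assumptions chosen purely for illustration; as noted above, a real source transformer should work on a token stream instead:

```python
# mydialect.py -- a minimal, illustrative dialect module.
import ast
import re

def source_transformer(text):
    """Source text in, source text out.
    Toy example: rewrite the infix syntax 'a >>= b' into 'a.__mbind__(b)'.
    NOTE: a real implementation should work on a token stream (stdlib
    `tokenize`); this regex will also match inside strings and comments."""
    return re.sub(r"(\w+)\s*>>=\s*(\w+)", r"\1.__mbind__(\2)", text)

def ast_transformer(body):
    """List of AST nodes in, list of AST nodes out.
    Inject an implicit import, making `sqrt` a builtin of the dialect.
    (The choice of `sqrt` is arbitrary, for illustration only.)"""
    implicit = ast.parse("from math import sqrt").body
    return implicit + body
```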
If you make an AST transformer, and have MacroPy, then see dialects.util,
which can help with the boilerplate task of pasting in the code from the
user module (while handling macro-imports correctly in both the dialect
template and in the user module).
The name of a dialect is simply the name of the module or package that implements the dialect. In other words, it's the name that needs to be imported to find the transformer functions.
Note that a dotted name in place of the xxx in from __lang__ import xxx
is not valid Python syntax, so (currently) a dialect should be defined in a
top-level module (no dots in the name). Strictly, the dialect finder doesn't
need to care about this (though it currently does), but IDEs and tools in
general are much happier with code that does not contain syntax errors.
(This allows using standard Python tools with dialects that do not introduce
any new surface syntax.)
A dialect can be implemented using another dialect, as long as there are no dependency loops. Whenever a lang-import is detected, the dialect importer is invoked (especially, also during the import of a module that defines a new dialect). This allows creating a tower of languages.
Combining existing dialects
Dangerous things should be difficult to do by accident. --John Shutt
Due to the potentially unlimited complexity of interactions between language features defined by different dialects, there is by design no automation for combining dialects. In the general case, this is something that requires human intervention.
If you know (or at least suspect) that two or more dialects are compatible, you can define a new dialect whose source_transformer and ast_transformer simply chain those of the existing dialects (in the desired order; consider how the macros expand), and then use that new dialect.
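A sketch of such chaining, with two stand-in stub dialects (in practice, you would import the actual dialect modules instead of defining stubs):

```python
from types import SimpleNamespace

# Stand-in stubs for two existing dialects; replace with real imports.
dialect_a = SimpleNamespace(
    source_transformer=lambda text: text.replace("≤", "<="),
    ast_transformer=lambda body: body,
)
dialect_b = SimpleNamespace(
    source_transformer=lambda text: text.replace("≥", ">="),
    ast_transformer=lambda body: body,
)

def source_transformer(text):
    # Chain in the desired order: dialect A's surface syntax first, then B's.
    return dialect_b.source_transformer(dialect_a.source_transformer(text))

def ast_transformer(body):
    return dialect_b.ast_transformer(dialect_a.ast_transformer(body))

print(source_transformer("assert a ≤ b ≥ c"))  # → assert a <= b >= c
```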