Pydialect: build languages on Python
NOTE April 2021: This project is obsolete. This technology now lives on in mcpyrate, and the example dialects can be found in unpythonic.
```python
from __lang__ import lispython

def fact(n):
    def f(k, acc):
        if k == 1:
            return acc
        f(k - 1, k*acc)
    f(n, acc=1)
assert fact(4) == 24
print(fact(5000))
```
Pydialect makes Python into a language platform, à la Racket. It provides the plumbing that allows you to create, in Python, dialects that compile into Python at import time. Pydialect is geared toward creating languages that look almost like Python, but extend or modify its syntax and/or semantics. Hence, dialects.
As examples, we currently provide the following dialects:
- Lispython: Python with tail-call optimization (TCO), implicit return, multi-expression lambdas
- Pytkell: Python with automatic currying and lazy functions
- LisThEll: Python with prefix syntax and automatic currying
All three dialects support unpythonic's
continuations block macro (to add call/cc to the language), but do not enable it automatically.
Lispython aims at production quality; the others are intended just for testing.
Pydialect itself is only a lightweight infrastructure hook that makes
it convenient to define and use dialects. To implement the actual semantics
for your dialect (which is where all the interesting things happen), you may
want to look at MacroPy. Examples can be
found in unpythonic; see especially
the macros. On packaging a set of semantics into a dialect, look at the example
dialects; all three are thin wrappers around unpythonic.
Note that what Pydialect does is similar to the rejected PEP 511, but it works via import hooks, as indeed suggested in the rejection notice. Thus, besides dialects proper,
it is possible to use Pydialect to hook in a custom AST optimizer, by defining a dialect whose
ast_transformer is actually an optimizer. For ideas, see here.
Some possibilities are e.g. constant folding, hoisting, if optimization and loop unrolling.
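As an illustrative sketch (not part of Pydialect itself), a whole-module AST transformer that performs constant folding on binary operations could look like this, using only the stdlib `ast` module. The `ast_transformer` signature (list of AST nodes in, list out) follows the shape described below in "Defining a dialect":

```python
import ast
import operator

# Map AST operator node types to Python-level operations (a small subset).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

class ConstantFolder(ast.NodeTransformer):
    """Fold binary operations whose operands are literal constants."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first, bottom-up
        op = _OPS.get(type(node.op))
        if (op is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            try:
                value = op(node.left.value, node.right.value)
            except Exception:
                return node  # e.g. division by zero: leave it for run time
            return ast.copy_location(ast.Constant(value=value), node)
        return node

def ast_transformer(body):
    """Whole-module AST transformer in the shape Pydialect expects:
    a list of AST nodes in, a list of AST nodes out."""
    module = ast.Module(body=body, type_ignores=[])
    module = ast.fix_missing_locations(ConstantFolder().visit(module))
    return module.body

tree = ast.parse("x = 2 * 3 + 4")
tree.body = ast_transformer(tree.body)
print(ast.unparse(tree))  # → x = 10
```

(`ast.unparse` requires Python 3.9+; it is used here only to display the folded result.)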
Why dialects?
An extension to the Python language doesn't need to make it into the Python core, or even be desirable for inclusion into the Python core, in order to be useful.
Building on functions and syntactic macros, customization of the language itself is one more tool for the programmer to extract patterns at a higher level. Hence, besides language experimentation, such extensions can serve as a framework for shorter and/or more readable programs.
Pydialect places language-creation power in the hands of its users, without the need to go to extreme lengths to hack CPython itself or implement from scratch a custom language that compiles to Python AST or bytecode.
Pydialect dialects compile to Python and are implemented in Python, allowing the rest of the user program to benefit from new versions of Python, mostly orthogonally to the development of any dialect.
At its simplest, a custom dialect can alleviate the need to spam a combination
of block macros in every module of a project that uses a macro-based language
extension, such as unpythonic.syntax. Being named as a dialect, a particular
combination of macros becomes instantly recognizable,
and DRY:
the dialect definition becomes the only place in the codebase that defines
the macro combination to be used by each module in the project.
The same argument applies to custom builtins: any functions or macros that feel like they "should be" part of the language layer, so that they won't have to be explicitly imported in each module where they are used.
Using dialects
Place a lang-import at the start of your module that uses a dialect:
```python
from __lang__ import piethon
```
Run your program (in this example written in the piethon dialect)
through the pydialect bootstrapper instead of python3 directly,
so that the main program gets imported instead of run directly, to trigger
the import hook that performs the dialect processing. (Installing Pydialect
will install the bootstrapper.)
Any imported module that has a lang-import will be detected, and the appropriate dialect module (if and when found) will be invoked. The result is then sent to the macro expander (if MacroPy is installed and the code uses macros at that point), after which the final result is imported normally.
The lang-import must appear as the first statement of the module; only the module docstring is allowed to appear before it. This is to make it explicit that a dialect applies to the whole module. (Local changes to semantics are better represented as a block macro.)
At import time, the dialect importer replaces the lang-import with an
assignment that sets the module's __lang__ attribute to the dialect name,
for introspection. If a module does not have a __lang__ attribute at
run time, then it was not compiled by Pydialect. Note that just like with
MacroPy, at run time the code is pure Python.
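For example, the introspection check reduces to a `getattr` on the module object (shown here on a stdlib module, which of course was not compiled by Pydialect):

```python
# Any module lacking a __lang__ attribute was not compiled by Pydialect.
import math  # a stdlib module, compiled by standard Python

dialect = getattr(math, "__lang__", None)
assert dialect is None  # not a Pydialect-compiled module
print("dialect:", dialect)
```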
The lang-import is a construct specific to Pydialect. This ensures that the
module will immediately fail if run under standard Python, because there is
no actual module named __lang__.
If you use MacroPy, the Pydialect import hook must be installed at index 0
in sys.meta_path, so that the dialect importer triggers before MacroPy's
standard macro expander. The pydialect bootstrapper takes care of this.
If you need to enable Pydialect manually for some reason, the incantation
to install the hook is import dialects.activate.
The lang-import syntax was chosen as a close pythonic equivalent to Racket's
#lang foo.
Defining a dialect
In Pydialect, a dialect is any module that provides one or both of the following callables:

- `source_transformer`: source text -> source text

  The full source code of the module being imported (including the lang-import) is sent to the source transformer. The data type is whatever the loader's `get_source` returns, usually `str`.

  Source transformers are useful e.g. for defining custom infix operators. For example, the monadic bind syntax `a >>= b` could be made to transform into the syntax `a.__mbind__(b)`.

  Although the input is text, in practice a token-based approach is recommended; see stdlib's `tokenize` module as a base to work from. (Be sure to untokenize when done, because the next stage expects text.)

  After the source transformer, the source text must be valid surface syntax for standard Python, i.e. valid input for `ast.parse`.

- `ast_transformer`: `list` of AST nodes -> `list` of AST nodes

  After the source transformer, but before macro expansion, the full AST of the module being imported (minus the module docstring and the lang-import) is sent to this whole-module AST transformer. This allows injecting implicit imports to create builtins for the dialect, as well as e.g. lifting the whole module (except the docstring and the code to set `__lang__`) into a `with` block to apply some MacroPy block macro(s) to the whole module.

After the AST transformer, the module is sent to MacroPy for macro expansion (if MacroPy is installed, and the module has macros at that point), and after that, the result is finally imported normally.

The AST transformer can use MacroPy if it wants, but doesn't have to; this decision is left up to each developer implementing a dialect.
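Putting the two hooks together, a minimal dialect module might look like the following sketch. The regex-based rewrite of `a >>= b` and the injected `sqrt` builtin are toy assumptions chosen purely for illustration; as noted above, a real source transformer should work on a token stream instead:

```python
# mydialect.py -- a minimal, illustrative dialect module.
import ast
import re

def source_transformer(text):
    """Source text in, source text out.
    Toy example: rewrite the infix syntax 'a >>= b' into 'a.__mbind__(b)'.
    NOTE: a real implementation should work on a token stream (stdlib
    `tokenize`); this regex will also match inside strings and comments."""
    return re.sub(r"(\w+)\s*>>=\s*(\w+)", r"\1.__mbind__(\2)", text)

def ast_transformer(body):
    """List of AST nodes in, list of AST nodes out.
    Inject an implicit import, making `sqrt` a builtin of the dialect.
    (The choice of `sqrt` is arbitrary, for illustration only.)"""
    implicit = ast.parse("from math import sqrt").body
    return implicit + body
```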
If you make an AST transformer, and have MacroPy, then see dialects.util,
which can help with the boilerplate task of pasting in the code from the
user module (while handling macro-imports correctly in both the dialect
template and in the user module).
The name of a dialect is simply the name of the module or package that implements the dialect. In other words, it's the name that needs to be imported to find the transformer functions.
Note that a dotted name in place of the xxx in from __lang__ import xxx
is not valid Python syntax, so (currently) a dialect should be defined in a
top-level module (no dots in the name). Strictly, the dialect finder doesn't
need to care about this (though it currently does), but IDEs and tools in
general are much happier with code that does not contain syntax errors.
(This allows using standard Python tools with dialects that do not introduce
any new surface syntax.)
A dialect can be implemented using another dialect, as long as there are no dependency loops. Whenever a lang-import is detected, the dialect importer is invoked (especially, also during the import of a module that defines a new dialect). This allows creating a tower of languages.
Combining existing dialects
Dangerous things should be difficult to do by accident. --John Shutt
Due to the potentially unlimited complexity of interactions between language features defined by different dialects, there is by design no automation for combining dialects. In the general case, this is something that requires human intervention.
If you know (or at least suspect) that two or more dialects are compatible, you can define a new dialect whose source_transformer and ast_transformer simply chain those of the existing dialects (in the desired order; consider how the macros expand), and then use that new dialect.
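A sketch of such chaining, with two stand-in stub dialects (in practice, you would import the actual dialect modules instead of defining stubs):

```python
from types import SimpleNamespace

# Stand-in stubs for two existing dialects; replace with real imports.
dialect_a = SimpleNamespace(
    source_transformer=lambda text: text.replace("≤", "<="),
    ast_transformer=lambda body: body,
)
dialect_b = SimpleNamespace(
    source_transformer=lambda text: text.replace("≥", ">="),
    ast_transformer=lambda body: body,
)

def source_transformer(text):
    # Chain in the desired order: dialect A's surface syntax first, then B's.
    return dialect_b.source_transformer(dialect_a.source_transformer(text))

def ast_transformer(body):
    return dialect_b.ast_transformer(dialect_a.ast_transformer(body))

print(source_transformer("assert a ≤ b ≥ c"))  # → assert a <= b >= c
```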