Peepmatic

A DSL and compiler for generating peephole optimizers for Cranelift

Generate Convert Improve

Install / Use

/learn @fitzgen/Peepmatic

About this skill

Quality Score

0/100

README

<div align="center"> <h1> <a href="https://github.com/bytecodealliance/wasmtime/pull/1647"><code>peepmatic</code> has moved into Cranelift!</a> </h1> <p>This repository is archived now.</p> </div>

<div align="center"> <h1><code>peepmatic</code></h1> <p> <b> <code>peepmatic</code> is a DSL and compiler for peephole optimizers for <a href="https://github.com/bytecodealliance/wasmtime/tree/master/cranelift#readme">Cranelift</a>. </b> </p> <img src="https://github.com/fitzgen/peepmatic/workflows/Rust/badge.svg"/> </div>

About
Example
A DSL for Optimizations
Implementation
References
Acknowledgments

About

Peepmatic is a DSL for peephole optimizations and compiler for generating peephole optimizers from them. The user writes a set of optimizations in the DSL, and then peepmatic compiles the set of optimizations into an efficient peephole optimizer:

DSL ----peepmatic----> Peephole Optimizer

The generated peephole optimizer has all of its optimizations' left-hand sides collapsed into a compact automata that makes matching candidate instruction sequences fast.

The DSL's optimizations may be written by hand or discovered mechanically with a superoptimizer like Souper. Eventually, peepmatic should have a verifier that ensures that the DSL's optimizations are sound, similar to what Alive does for LLVM optimizations.

Currently, peepmatic is targeting peephole optimizers that operate on Cranelift's clif intermediate representation. The intended next target is Cranelift's new backend's "vcode" intermediate representation. Supporting non-Cranelift targets is not a goal.

Example

This snippet of our DSL describes optimizations for removing redundant bitwise-or instructions that are no-ops:

(=> (bor $x (bor $x $y))
    (bor $x $y))

(=> (bor $y (bor $x $y))
    (bor $x $y))

(=> (bor (bor $x $y) $x)
    (bor $x $y))

(=> (bor (bor $x $y) $y)
    (bor $x $y))

When compiled into a peephole optimizer automaton, they look like this:

A DSL for Optimizations

A single peephole optimization has two parts:

A left-hand side that describes candidate instruction sequences that the optimization applies to.
A right-hand side that contains the new instruction sequence that replaces old instruction sequences that the left-hand side matched.

A left-hand side may bind sub-expressions to variables and the right-hand side may contain those bound variables to reuse the sub-expressions. The operations inside the left-hand and right-hand sides are a subset of clif operations.

Let's take a look at an example:

(=> (imul $x 2)
    (ishl $x 1))

As you can see, the DSL uses S-expressions. (S-expressions are easy to parse and we also have a bunch of nice parsing infrastructure for S-expressions already for our wat and wast crates.)

The left-hand side of this optimization is (imul $x 2). It matches integer multiplication operations where a value is multiplied by the constant two. The value multiplied by two is bound to the variable $x.

The right-hand side of this optimization is (ishl $x 1). It reuses the $x variable that was bound in the left-hand side.

This optimization replaces expressions of the form x * 2 with x << 1. This is sound because multiplication by two is the same as shifting left by one for binary integers, and it is desirable because a shift-left instruction executes in fewer cycles than a multiplication.

The general form of an optimization is:

(=> <left-hand-side> <right-hand-side>)

Variables

Variables begin with a dollar sign and are followed by lowercase letters, numbers, hyphens, and underscores: $x, $y, $my-var, $operand2.

Left-hand side patterns may contain variables that match any kind of sub-expression and give it a name so that it may be reused in the right-hand side.

;; Replace `x + 0` with simply `x`.
(=> (iadd $x 0)
    $x)

Within a pattern, every occurrence of a variable with the same name must match the same value. That is (iadd $x $x) matches (iadd 1 1) but does not match (iadd 1 2). This lets us write optimizations such as this:

;; Xor'ing a value with itself is always zero.
(=> (bxor $x $x)
    (iconst 0))

Constants

We've already seen specific integer literals and wildcard variables in patterns, but we can also match any constant. These are written similar to variables, but use uppercase letters rather than lowercase: $C, $MY-CONST, and $OPERAND2.

For example, we can use constant patterns to combine an iconst and iadd into a single iadd_imm instruction:

(=> (iadd (iconst $C) $x)
    (iadd_imm $C $x))

Nested Patterns

Patterns can also match nested operations with their own nesting:

(=> (bor $x (bor $x $y))
    (bor $x $y))

Preconditions and Unquoting

Let's reconsider our first example optimization:

(=> (imul $x 2)
    (ishl $x 1))

This optimization is a little too specific. Here is another version of this optimization that we'd like to support:

(=> (imul $x 4)
    (ishl $x 2))

We don't want to have to write out all instances of this general class of optimizations! That would be a lot of repetition and could also bloat the size of our generated peephole optimizer's matching automata.

Instead, we can generalize this optimization by matching any multiplication by a power of two constant C and replacing it with a shift left of log2(C).

First, rather than match 2 directly, we want to match any constant variable C:

(imul $x $C)

Note that variables begin with lowercase letters, while constants begin with uppercase letters. Both the constant pattern $C and variable pattern $x will match 5, but only the variable pattern $x will match a whole sub-expression like (iadd 1 2). The constant pattern $C only matches constant values.

Next, we augment our left-hand side's pattern with a precondition that the constant $C must be a power of two. Preconditions are introduced by wrapping a pattern in the when form:

;; Our new left-hand side, augmenting a pattern with a precondition!
(when
  ;; The pattern matching multiplication by a constant value.
  (imul $x $C)

  ;; The precondition that $C must be a power of two.
  (is-power-of-two $C))

In the right-hand side, we use unquoting to perform compile-time evaluation of log2($C). Unquoting is done with the $(...) form:

;; Our new right-hand side, using unqouting to do compile-time evaluation of
;; constants that were matched and bound in the left-hand side!
(ishl $x $(log2 $C))

Finally, here is the general optimization putting our new left-hand and right-hand sides together:

(=> (when (imul $x $C)
          (is-power-of-two $C))
    (ishl $x $(log2 $C)))

Bit Widths

Similar to how Cranelift's instructions are bit-width polymorphic, peepmatic optimizations are also bit-width polymorphic. Unless otherwise specified, a pattern will match expressions manipulating i32s just the same as expressions manipulating i64s, etc... An optimization that doesn't constrain its pattern's bit widths must be valid for all bit widths:

To constrain an optimization to only match i32s, for example, you can use the bit-width precondition:

(=> (when (iadd $x $y)
          (bit-width $x 32)
          (bit-width $y 32))
    ...)

Alternatively, you can ascribe a type to an operation by putting the type inside curly brackets after the operator, like this:

(=> (when (sextend{i64} (ireduce{i32} $x))
          (bit-width $x 64))
    (sshr (ishl $x 32) 32))

Implementation

Peepmatic has roughly four phases:

Parsing
Type Checking
Linearization
Automatization

(I say "roughly" because there are a couple micro-passes that happen after linearization and before automatization. But those are the four main phases.)

Parsing

Parsing transforms the DSL source text into an abstract syntax tree (AST).

We use the wast crate. It gives us nicely formatted errors with source context, as well as some other generally nice-to-have parsing infrastructure.

Relevant source files:

src/parser.rs
src/ast.rs

Type Checking

Type checking operates on the AST. It checks that types and bit widths in the optimizations are all valid. For example, it ensures that the type and bit width of an optimization's left-hand side is the same as its right-hand side, because it doesn't make sense to replace an integer expression with a boolean expression.

After type checking is complete, certain AST nodes are assigned a type and bit width, that are later used in linearization and when matching and applying optimizations.

We walk the AST and gather type constraints. Every constraint is assoc

Related Skills

node-connect

346.4k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

107.2k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

346.4k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

346.4k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。