# MarkovJunior
Probabilistic language based on pattern matching and constraint propagation, 153 examples
MarkovJunior is a probabilistic programming language where programs are combinations of rewrite rules and inference is performed via constraint propagation. MarkovJunior is named after mathematician Andrey Andreyevich Markov, who defined and studied what is now called Markov algorithms.
<p align="center"> <img src="images/top-iso.gif"/> <img src="images/top-mv.gif"/> </p>

In its basic form, a MarkovJunior program is an ordered list of rewrite rules. For example, MazeBacktracker (animation on the left below) is a list of 2 rewrite rules:
- `RBB=GGR` or "replace red-black-black with green-green-red".
- `RGG=WWR` or "replace red-green-green with white-white-red".
On each execution step, the MJ interpreter finds the first rule in the list that has a match on the grid, collects all matches for that rule and applies the rule at a random match. In the maze backtracker example, the interpreter first applies a bunch of RBB=GGR rules. But eventually the green self-avoiding walk gets stuck. At this point the first rule has no matches, so the interpreter applies the second rule RGG=WWR until the walk gets unstuck. Then it can apply the first rule again, and so on. The interpreter stops when no rule has a match.
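This execution loop can be sketched in Python. The following is a simplified reimplementation for illustration, not the actual interpreter: it matches only one-dimensional patterns along the four grid axes and ignores MJ's symmetry settings and other node types; the names `find_matches`, `step` and `run` are invented for this sketch.

```python
import random

def find_matches(grid, pattern):
    # All placements of a 1xN pattern along the four axis directions.
    h, w = len(grid), len(grid[0])
    n = len(pattern)
    matches = []
    for y in range(h):
        for x in range(w):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ex, ey = x + (n - 1) * dx, y + (n - 1) * dy
                if 0 <= ex < w and 0 <= ey < h and all(
                        grid[y + i * dy][x + i * dx] == pattern[i]
                        for i in range(n)):
                    matches.append((x, y, dx, dy))
    return matches

def step(grid, rules, rng):
    # Find the first rule with any match; apply it at one random match.
    for inp, out in rules:
        ms = find_matches(grid, inp)
        if ms:
            x, y, dx, dy = rng.choice(ms)
            for i, c in enumerate(out):
                grid[y + i * dy][x + i * dx] = c
            return True
    return False  # no rule matches: the program halts

def run(grid, rules, seed=0):
    rng = random.Random(seed)
    while step(grid, rules, rng):
        pass
    return grid
```

Running `run` on an all-black grid with a single red seed cell and the rules `RBB=GGR`, `RGG=WWR` plays out the maze backtracker described above.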
Probabilistic inference in MarkovJunior makes it possible to impose constraints on the future state and to generate only those runs that lead to the constrained future. For example, inference in the Sokoban rules {RWB=BRW RB=BR} makes a group of (red) agents organize (white) crates into specified shapes.
Using these ideas, we construct many probabilistic generators of dungeons, architecture, puzzles and fun simulations.
<p align="center"><a href="images/top-1764.png"><img src="images/top-882.png"/></a></p>

Additional materials:
- XML syntax overview.
- Higher resolution screenshots and more seeds: ModernHouse, SeaVilla, Apartemazements, CarmaTower, Escheresque, PillarsOfEternity, Surface, Knots.
- Unofficial technical notes by Dan Ogles and code documentation by Andrew Kay.
## Markov algorithms
A Markov algorithm over an alphabet A is an ordered list of rules. Each rule is a string of the form x=y, where x and y are words in A, and some rules may be marked as halt rules. Application of a Markov algorithm to a word w proceeds as follows:
1. Find the first rule `x=y` where `x` is a substring of `w`. If there are no such rules, then halt.
2. Replace the leftmost `x` in `w` by `y`.
3. If the found rule was a halt rule, then halt. Otherwise, go to step 1.
For example, consider this Markov algorithm in the alphabet {0, 1, x} (ε is the empty word):
```
1=0x
x0=0xx
0=ε
```
If we apply it to the string 110 we get this sequence of strings:
110 -> 0x10 -> 0x0x0 -> 00xxx0 -> 00xx0xx -> 00x0xxxx -> 000xxxxxx -> 00xxxxxx -> 0xxxxxx -> xxxxxx
In general, this algorithm converts a binary representation of a number into its unary representation.
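This rewriting procedure is straightforward to implement. Here is a minimal Python interpreter for classical one-dimensional Markov algorithms; the name `run_markov` and the rule encoding are choices made for this sketch.

```python
def run_markov(word, rules):
    """Repeatedly find the first rule whose left side occurs in the word
    and replace its leftmost occurrence; halt when no rule applies."""
    while True:
        for lhs, rhs, is_halt in rules:
            i = word.find(lhs)
            if i >= 0:
                word = word[:i] + rhs + word[i + len(lhs):]
                if is_halt:
                    return word
                break
        else:  # no rule had a match
            return word

# the binary-to-unary algorithm above (ε is the empty string)
binary_to_unary = [('1', '0x', False), ('x0', '0xx', False), ('0', '', False)]
print(run_markov('110', binary_to_unary))  # xxxxxx
```

Running it on `110` reproduces the sequence of strings shown above and ends with six tally marks, the unary representation of 6.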
Markov's student Vilnis Detlovs proved that for any Turing machine there exists a Markov algorithm that computes the same function. In comparison, grammars are unordered sets of rewrite rules, and L-systems are rewrite rules that are applied in parallel. For more interesting examples of Markov algorithms, check Markov's book, or see the greatest common divisor example in the comment section or the multiplication example on Wikipedia.
How would one generalize Markov algorithms to multiple dimensions? First, in multiple dimensions there are no natural ways to insert a string into another string, so the lefts and rights of our rewrite rules should have the same size. Second, there are no natural ways to choose the leftmost match. Possible options are:
1. Choose a random match. This is what MJ's `(exists)` nodes do.
2. Choose all matches. There is a problem with this option, however, because different matches can overlap and conflict. Possible solutions are:
   1. Greedily choose a maximal subset of non-conflicting matches. This is what MJ's `{forall}` nodes do.
   2. Consider all matches in superposition. That is, instead of separate values, keep waves in each grid cell: boolean vectors that tell which spacetime patterns are forbidden and which are not. This is how MJ performs inference.
We lose Turing completeness because our new procedure is not deterministic, but practice shows that this formalism can still describe a huge range of interesting random processes.
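The greedy strategy behind `{forall}` nodes can be sketched in one dimension; the name `forall_apply` is invented for this sketch, and the real nodes work on multidimensional grids.

```python
import random

def forall_apply(word, lhs, rhs, rng):
    """Greedily pick a maximal subset of non-conflicting (non-overlapping)
    matches in random order and rewrite them all at once.
    As in MarkovJunior, lhs and rhs have equal length."""
    positions = [i for i in range(len(word) - len(lhs) + 1)
                 if word[i:i + len(lhs)] == lhs]
    rng.shuffle(positions)
    chosen, used = [], set()
    for p in positions:
        span = set(range(p, p + len(lhs)))
        if used.isdisjoint(span):  # no conflict with matches taken so far
            chosen.append(p)
            used |= span
    cells = list(word)
    for p in chosen:
        cells[p:p + len(rhs)] = rhs
    return ''.join(cells)
```

For example, applying `AB=BA` to `ABABAB` rewrites all three disjoint matches at once, while applying `AA=BB` to `AAA` rewrites only one of the two overlapping matches.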
## Rewrite rules
The simplest MarkovJunior program is probably (B=W). It contains just a single rule B=W. On each turn, this program converts a random black square into a white square.
Growth model (WB=WW) is more interesting. On each turn it replaces a black-white pair of adjacent cells BW with a white-white pair WW. In other words, on each turn it picks a random black cell adjacent to some white cell and colors it white. This model is almost identical to the Eden growth model: on each turn both models choose among the same set of black cells. They differ only in probability distributions: a uniform distribution over black cells adjacent to white cells is not the same as a uniform distribution over pairs of adjacent black and white cells.
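The difference between the two distributions can be made concrete in Python. This is an illustrative sketch, not MJ code, and `grow`, `mj_pick` and `eden_pick` are invented names; the point is that a black cell touching two white cells is twice as likely to be chosen under (WB=WW), but not under the Eden model.

```python
import random

def neighbors(cell):
    x, y = cell
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def grow(size, pick, seed=0):
    """Grow a white region from the center of a size x size black grid,
    using `pick` to choose which black frontier cell to whiten next."""
    rng = random.Random(seed)
    white = {(size // 2, size // 2)}
    black = {(x, y) for x in range(size) for y in range(size)} - white
    while True:
        pairs = [(w, b) for w in sorted(white) for b in neighbors(w) if b in black]
        if not pairs:
            return white
        b = pick(pairs, rng)
        white.add(b)
        black.remove(b)

def mj_pick(pairs, rng):
    # (WB=WW): uniform over adjacent white/black *pairs*
    return rng.choice(pairs)[1]

def eden_pick(pairs, rng):
    # Eden model: uniform over frontier *cells*, each counted once
    return rng.choice(sorted({b for _, b in pairs}))
```

Both variants eventually whiten the whole grid; only the order in which frontier cells are chosen, and hence the typical shape of the growing cluster, differs.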
Model (WBB=WAW) generates a maze, with a single line of code! Compare it with an implementation in a conventional language. Any MarkovJunior model can be run in any number of dimensions without changes. On the right you can see the end result of MazeGrowth in 3d, rendered in MagicaVoxel. By default, we use the PICO-8 palette.
Model (RBB=WWR) is a self-avoiding random walk. Note that self-avoiding walks in 3d are longer on average than in 2d. In general, comparing the behaviors of similar random processes in different dimensions is a fascinating topic. A classic result of George Pólya says that a random walk in 2d returns to its initial position with probability one, while in 3d this is no longer the case.
We can put several rules into one rulenode. For example, (RBB=WWR RBW=GWP PWG=PBU UWW=BBU UWP=BBR) is a loop-erased random walk. Trail model (RB=WR RW=WR) generates decent connected caves.
Model (RBB=WWR R*W=W*R) is known as the Aldous-Broder maze generation algorithm. The wildcard symbol * in the input means that any color is allowed to be in the square. The wildcard symbol in the output means that the color doesn't change after the application of the rule. The Aldous-Broder algorithm takes many more turns on average to generate a maze than MazeGrowth, for example, but it has a nice property that MazeGrowth doesn't have: each maze has the same probability of being generated. In other words, MazeTrail is an unbiased maze generation algorithm: it samples mazes (or spanning trees) from the uniform distribution. Wilson's algorithm is a more efficient unbiased maze generation algorithm. Compare its MarkovJunior implementation with an implementation in a conventional language!
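The wildcard semantics are easy to state in code. Here is a one-dimensional sketch; the helper name `try_wildcard_rule` is invented for this illustration.

```python
def try_wildcard_rule(cells, i, inp, out):
    """Try rule inp=out at position i of a 1D row of cells:
    '*' in the input matches any color, '*' in the output keeps
    the old color. Returns True if the rule matched and was applied."""
    if i + len(inp) > len(cells):
        return False
    if any(p != '*' and cells[i + k] != p for k, p in enumerate(inp)):
        return False
    for k, c in enumerate(out):
        if c != '*':
            cells[i + k] = c
    return True
```

Applying `R*W=W*R` to the row RAW yields WAR: the red and white cells swap while the middle cell keeps its color.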
## Combining rulenodes
We can put several rulenodes into a sequence node, to be run one after the other. In the River model we first construct a stochastic Voronoi diagram with 2 sources, and use the boundary between the formed regions as a base for a river. Then we spawn a couple more Voronoi seeds to grow forests and simultaneously grow grass from the river. As a result, we get random river valleys!
<p align="center"> <a h