STOKE
A stochastic superoptimizer and program synthesizer
Install / Use
/learn @wuiw/STOKEREADME
STOKE
STOKE is a stochastic optimizer for the x86-64 instruction set. STOKE uses random search to explore the extremely high-dimensional space of all possible program transformations. Although any one random transformation is unlikely to produce a code sequence that is both correct and an improvement over the original, the repeated application of millions of transformations is sufficient to produce novel and non-obvious code sequences that have been shown to outperform the code produced by general-purpose and domain-specific compilers, and in some cases expert hand-written code.
STOKE has appeared in a number of publications. For a thorough introduction to the design of STOKE, see:
- Stochastic Superoptimization -- ASPLOS 2013:
- Data-Driven Equivalence Checking -- OOPSLA 2013:
- Stochastic Optimization of Floating-Point Programs with Tunable Precision -- PLDI 2014:
- Conditionally Correct Superoptimization -- OOPSLA 2015:
Table of Contents
- Prerequisites
- Choosing a STOKE version
- Building STOKE
- Using the formal validator
- Using STOKE
- Additional Features
- Extending STOKE
- Code Organization
- Gadgets
- Initial Search State
- Search Transformations
- Cost Function
- Live-out Error
- Verification Strategy
- Command Line Args
- Contact
Prerequisites
STOKE will run on modern 64-bit x86 processors. We officially support Haswell
processors with AVX2 extensions. STOKE should also run on Sandy Bridge
systems (with AVX, but not AVX2). It might run on newer architectures, but we currently don't test these.
Running ./configure.sh as described next will automatically configure the build for the correct architecture.
STOKE is supported on the latest Ubuntu LTS release; in practice, it will also run on Ubuntu 13.10+ and on debian testing. If you're trying to get STOKE to work on another linux distribution, having the right version of g++ is key. STOKE is supported on 4.9 only. It should also work on later versions, but it will not work with g++ 4.8.x (missing regular expression support), and g++ 4.7.x and older definitely will not work.
Most of STOKE's software dependencies are available through apt. These can be satisfied by typing:
$ sudo apt-get install bison ccache cmake doxygen exuberant-ctags flex g++-4.9 g++-multilib gcc-4.9 ghc git libantlr3c-dev libboost-dev libboost-filesystem-dev libboost-thread-dev libcln-dev libghc-regex-compat-dev libghc-regex-tdfa-dev libghc-split-dev libjsoncpp-dev python subversion libiml-dev libgmp-dev
Note that your distribution might not have g++-4.9 by default. You may consider installing a PPA as described here: https://askubuntu.com/questions/466651/how-do-i-use-the-latest-gcc-4-9-on-ubuntu-14-04
The rest of the dependencies will be fetched automatically as part of the build process.
Downloading STOKE
The entire STOKE code base is available on GitHub under the Apache Software License version 2.0 at github.com/StanfordPL/stoke.
To check it out, type:
$ git clone https://github.com/StanfordPL/stoke
This will check out the default develop branch. Unless you are looking for a specific version or modification of STOKE, this is the branch to use. It contains all the latest changes and is reasonably stable. This branch is supposed to always pass all tests.
Building STOKE
See the previous sections on how to download STOKE, a list of dependencies, and to check your hardware support level. The remainder of STOKE's software dependencies are available on GitHub and will be downloaded automatically the first time that STOKE is built. When you build STOKE the first time, run
$ ./configure.sh
This will figure out the correct build parameters (such as the platform). To build STOKE, run
$ make
To add STOKE and its related components to your path, type:
$ export PATH=$PATH:/<path_to_stoke>/bin
To run the tests, choose the appropriate command:
$ make test
The files generated during the build process can be deleted by typing:
$ make clean
To delete STOKE's dependencies as well (this is useful if an error occurs during the first build), type:
$ make dist_clean
Using STOKE
The following toy example shows a typical workflow for using STOKE. All of the
following code can be found in the examples/tutorial/ directory. As this
code is tested using our continuous integration system, the code there will
always be up-to-date, but this README can fall behind.
Consider a
C++ program that repeatedly counts the number of bits (population count) in the
64-bit representation of an integer. (Keeping track of a running sum prevents
g++ from eliminating the calls to popcnt() altogether.)
// main.cc
#include <cstdlib>
#include <stddef.h>
#include <stdint.h>
using namespace std;
size_t popcnt(uint64_t x) {
int res = 0;
for ( ; x > 0; x >>= 1 ) {
res += x & 0x1ull;
}
return res;
}
int main(int argc, char** argv) {
const auto itr = atoi(argv[1]);
auto ret = 0;
for ( auto i = 0; i < itr; ++i ) {
ret += popcnt(i);
}
return ret;
STOKE is a compiler and programming language agnostic optimization tool. It can be applied to any x86-64 ELF binary. Although this example uses the GNU toolchain, nothing prevents the use of other tools. To build this code with full optimizations, type:
$ g++ -std=c++11 -O3 -fno-inline main.cc
To measure runtime, type:
$ time ./a.out 100000000
real 0m1.046s
user 0m1.047s
sys 0m0.000s
A profiler will reveal that the runtime of ./a.out is dominated by calls to
the popcnt() function. STOKE can be used to improve the implementation of
this function as follows. The first step is to disassemble the program by
typing:
$ stoke extract -i ./a.out -o bins
This will produce a directory named bins that contains the text of every
function contained in the binary ./a.out.
Help for stoke or any of its subcommands can be obtained by typing:
$ stoke -h
$ stoke <subcommand> -h
STOKE can accept arguments either through the command line or through a
configuration file. The invocation of stoke extract shown above is equivalent
to the following:
$ stoke extract --config extract.conf
Where extract.conf contains:
##### stoke extract config file
-i ./a.out # Path to the elf binary to disassemble
-o bins # Path to the directory to store disassembled text in
Every STOKE subcommand can be used to generate example configuration files by typing:
$ stoke <subcommand> --example_config <path/to/file.conf>
Because main.cc was compiled using g++, the text of the popcnt() function
will appear under the mangled name _Z6popcntm in bins/_Z6popcntm.s.
.text
.globl _Z6popcntm
.type _Z6popcntm, @function
_Z6popcntm:
xorl %eax,%eax
testq %rdi,%rdi
je .L_4005b0
nop
.L_4005a0:
movq %rdi,%rdx
andl $0x1,%edx
addq %rdx,%rax
shrq $0x1,%rdi
jne .L_4005a0
retq
.L_4005b0:
retq
nop
nop
.size _Z6popcntm, .-_Z6popcntm
The next step is to generate a set of testcases for guiding STOKE's search procedure. These can be obtained by typing:
$ stoke testcase --config testcase.conf
where testcase.conf contains:
##### stoke testcase config file
--bin ./a.out # The name of the binary to use to generate testcases
--args 10000000 # Command line arguments that should be passed to ./a.out
--functions bins # Disassembly directory created by stoke extract
-o popcnt.tc # Path to file to write testcases to
--fxn _Z6popcntm # The name of the function to generate testcases for
--max_testcases 1024 # The maximum number of testcases to generate.
The resulting file will contain 1024 entires, all of the form:
Testcase 0:
%rax 00 00 00 00 00 98 96 80
%rcx 00 00 00 00 00 00 00 00
%rdx 00 00 00 00 00 00 00 0a
%rbx 00 00 00 00 00 00 00 01
%rsp 00 00 7f ff 97 44 36 28
%rbp 00 00 00 00 00 00 00 00
%rsi 19 99 99 99 99 99 99 99
%rdi 00 00 00 00 00 00 00 00
%r8 00 00 2a c9 68 1a 50 40
%r9 00 00 7f ff 97 44 46 01
%r10 00 00 00 00 00 98 96 80
%r11 00 00 00 00 00 00 00 0a
%r12 00 00 00 00 00 98 96 80
%r13 00 00 7f ff 97 44 37 20
%r14 00 00 00 00 00 00 00 00
%r15 00 00 00 00 00 00 00 00
%ymm0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00
%ymm1 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f
%ymm2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 ff
%ymm4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm5 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm6 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm7 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm8 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Related Skills
node-connect
344.4kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
99.2kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
344.4kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
344.4kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
