SkillAgentSearch skills...

STOKE

A stochastic superoptimizer and program synthesizer

Install / Use

/learn @wuiw/STOKE
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

STOKE

STOKE is a stochastic optimizer for the x86-64 instruction set. STOKE uses random search to explore the extremely high-dimensional space of all possible program transformations. Although any one random transformation is unlikely to produce a code sequence that is both correct and an improvement over the original, the repeated application of millions of transformations is sufficient to produce novel and non-obvious code sequences that have been shown to outperform the code produced by general-purpose and domain-specific compilers, and in some cases expert hand-written code.

STOKE has appeared in a number of publications. For a thorough introduction to the design of STOKE, see:

Table of Contents

  1. Prerequisites
  2. Choosing a STOKE version
  3. Building STOKE
  4. Using the formal validator
  5. Using STOKE
  6. Additional Features
  7. Extending STOKE
  8. Code Organization
  9. Gadgets
  10. Initial Search State
  11. Search Transformations
  12. Cost Function
  13. Live-out Error
  14. Verification Strategy
  15. Command Line Args
  16. Contact

Prerequisites

STOKE will run on modern 64-bit x86 processors. We officially support Haswell processors with AVX2 extensions. STOKE should also run on Sandy Bridge systems (with AVX, but not AVX2). It might run on newer architectures, but we currently don't test these. Running ./configure.sh as described next will automatically configure the build for the correct architecture.

STOKE is supported on the latest Ubuntu LTS release; in practice, it will also run on Ubuntu 13.10+ and on debian testing. If you're trying to get STOKE to work on another linux distribution, having the right version of g++ is key. STOKE is supported on 4.9 only. It should also work on later versions, but it will not work with g++ 4.8.x (missing regular expression support), and g++ 4.7.x and older definitely will not work.

Most of STOKE's software dependencies are available through apt. These can be satisfied by typing:

$ sudo apt-get install bison ccache cmake doxygen exuberant-ctags flex g++-4.9 g++-multilib gcc-4.9 ghc git libantlr3c-dev libboost-dev libboost-filesystem-dev libboost-thread-dev libcln-dev libghc-regex-compat-dev libghc-regex-tdfa-dev libghc-split-dev libjsoncpp-dev python subversion libiml-dev libgmp-dev

Note that your distribution might not have g++-4.9 by default. You may consider installing a PPA as described here: https://askubuntu.com/questions/466651/how-do-i-use-the-latest-gcc-4-9-on-ubuntu-14-04

The rest of the dependencies will be fetched automatically as part of the build process.

Downloading STOKE

The entire STOKE code base is available on GitHub under the Apache Software License version 2.0 at github.com/StanfordPL/stoke.

To check it out, type:

$ git clone https://github.com/StanfordPL/stoke

This will check out the default develop branch. Unless you are looking for a specific version or modification of STOKE, this is the branch to use. It contains all the latest changes and is reasonably stable. This branch is supposed to always pass all tests.

Building STOKE

See the previous sections on how to download STOKE, a list of dependencies, and to check your hardware support level. The remainder of STOKE's software dependencies are available on GitHub and will be downloaded automatically the first time that STOKE is built. When you build STOKE the first time, run

$ ./configure.sh

This will figure out the correct build parameters (such as the platform). To build STOKE, run

$ make

To add STOKE and its related components to your path, type:

$ export PATH=$PATH:/<path_to_stoke>/bin

To run the tests, choose the appropriate command:

$ make test

The files generated during the build process can be deleted by typing:

$ make clean

To delete STOKE's dependencies as well (this is useful if an error occurs during the first build), type:

$ make dist_clean

Using STOKE

The following toy example shows a typical workflow for using STOKE. All of the following code can be found in the examples/tutorial/ directory. As this code is tested using our continuous integration system, the code there will always be up-to-date, but this README can fall behind.

Consider a C++ program that repeatedly counts the number of bits (population count) in the 64-bit representation of an integer. (Keeping track of a running sum prevents g++ from eliminating the calls to popcnt() altogether.)

// main.cc

#include <cstdlib>
#include <stddef.h>
#include <stdint.h>

using namespace std;

size_t popcnt(uint64_t x) {
  int res = 0;
  for ( ; x > 0; x >>= 1 ) {
    res += x & 0x1ull;
  }
  return res;
}

int main(int argc, char** argv) {
  const auto itr = atoi(argv[1]);

  auto ret = 0;
  for ( auto i = 0; i < itr; ++i ) {
    ret += popcnt(i);
  }

  return ret;

STOKE is a compiler and programming language agnostic optimization tool. It can be applied to any x86-64 ELF binary. Although this example uses the GNU toolchain, nothing prevents the use of other tools. To build this code with full optimizations, type:

$ g++ -std=c++11 -O3 -fno-inline main.cc

To measure runtime, type:

$ time ./a.out 100000000

real  0m1.046s
user  0m1.047s
sys   0m0.000s

A profiler will reveal that the runtime of ./a.out is dominated by calls to the popcnt() function. STOKE can be used to improve the implementation of this function as follows. The first step is to disassemble the program by typing:

$ stoke extract -i ./a.out -o bins

This will produce a directory named bins that contains the text of every function contained in the binary ./a.out.

Help for stoke or any of its subcommands can be obtained by typing:

$ stoke -h
$ stoke <subcommand> -h

STOKE can accept arguments either through the command line or through a configuration file. The invocation of stoke extract shown above is equivalent to the following:

$ stoke extract --config extract.conf

Where extract.conf contains:

##### stoke extract config file

-i ./a.out # Path to the elf binary to disassemble
-o bins # Path to the directory to store disassembled text in

Every STOKE subcommand can be used to generate example configuration files by typing:

$ stoke <subcommand> --example_config <path/to/file.conf>

Because main.cc was compiled using g++, the text of the popcnt() function will appear under the mangled name _Z6popcntm in bins/_Z6popcntm.s.

  .text
  .globl _Z6popcntm
  .type _Z6popcntm, @function
_Z6popcntm:
  xorl   %eax,%eax
  testq  %rdi,%rdi
  je     .L_4005b0
  nop
.L_4005a0:
  movq   %rdi,%rdx
  andl   $0x1,%edx
  addq   %rdx,%rax
  shrq   $0x1,%rdi
  jne    .L_4005a0
  retq
.L_4005b0:
  retq
  nop
  nop
  .size _Z6popcntm, .-_Z6popcntm

The next step is to generate a set of testcases for guiding STOKE's search procedure. These can be obtained by typing:

$ stoke testcase --config testcase.conf

where testcase.conf contains:

##### stoke testcase config file

--bin ./a.out # The name of the binary to use to generate testcases 
--args 10000000 # Command line arguments that should be passed to ./a.out
--functions bins # Disassembly directory created by stoke extract

-o popcnt.tc # Path to file to write testcases to

--fxn _Z6popcntm # The name of the function to generate testcases for
--max_testcases 1024 # The maximum number of testcases to generate. 

The resulting file will contain 1024 entires, all of the form:

Testcase 0:

%rax     00 00 00 00 00 98 96 80
%rcx     00 00 00 00 00 00 00 00
%rdx     00 00 00 00 00 00 00 0a
%rbx     00 00 00 00 00 00 00 01
%rsp     00 00 7f ff 97 44 36 28
%rbp     00 00 00 00 00 00 00 00
%rsi     19 99 99 99 99 99 99 99
%rdi     00 00 00 00 00 00 00 00
%r8      00 00 2a c9 68 1a 50 40
%r9      00 00 7f ff 97 44 46 01
%r10     00 00 00 00 00 98 96 80
%r11     00 00 00 00 00 00 00 0a
%r12     00 00 00 00 00 98 96 80
%r13     00 00 7f ff 97 44 37 20
%r14     00 00 00 00 00 00 00 00
%r15     00 00 00 00 00 00 00 00

%ymm0    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00
%ymm1    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f 2f
%ymm2    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm3    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 ff
%ymm4    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm5    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm6    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm7    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
%ymm8    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

Related Skills

View on GitHub
GitHub Stars8
CategoryDevelopment
Updated2y ago
Forks1

Languages

C++

Security Score

70/100

Audited on Nov 18, 2023

No findings