
StaticTools


Tools to enable StaticCompiler.jl-based static compilation of Julia code (or more accurately, a subset of Julia which we might call "unsafe Julia") to standalone native binaries by avoiding GC allocations and llvmcall-ing all the things! (Experimental! 🐛)

This package currently requires Julia 1.8 or greater for best results (if in doubt, check which versions are passing CI). Integration tests against StaticCompiler.jl and LoopVectorization.jl are currently run with Julia 1.8 and 1.9 on x86-64 linux and mac; other platforms and versions may or may not work but will depend on StaticCompiler.jl support.

While we'll do our best to keep things working, this package should still be considered experimental at present, and necessarily involves a lot of juggling of pointers and such (i.e., "unsafe Julia"). If there are errors in any of the llvmcalls (which we have to use instead of simpler ccalls for things to statically compile smoothly), there could be serious bugs or even undefined behavior. Please report any unexpected bugs you find, and PRs are welcome!

In addition to the exported names, Julia Base functions extended for StaticTools types (i.e., StaticString/MallocString and StackArray/MallocArray) include:

  • print, println, error,
  • parse, read, write
  • rand/rand! (when using an rng initialized with static_rng, SplitMix64, or Xoshiro256✴︎✴︎)
  • randn/randn! (when using an rng initialized with MarsagliaPolar, BoxMuller, or Ziggurat)
  • and much or all of the AbstractArray and AbstractString interfaces where relevant.

StaticTools.jl provides a large zoo of types and functions for dealing with arrays and strings, but the main way we recommend you create and manipulate arrays is by using the MallocSlabBuffer exported by StaticTools together with @no_escape, @alloc, and @alloc_ptr which are also exported from StaticTools, but defined in Bumper.jl.

Beyond that, this package provides the stack-allocated, statically-sized StaticStrings and StackArrays, which are heavily inspired by the techniques used in JuliaSIMD/ManualMemory.jl; you can use that package via StrideArraysCore.jl or StrideArrays.jl to obtain fast stack-allocated statically-sized arrays which should also be StaticCompiler-friendly, up to the stack size limit. For larger arrays where you want direct control over memory management, space may be allocated with malloc, as in MallocArrays. However, as in any other language, any memory you malloc must be freed once and only once. By contrast, using MallocSlabBuffer together with the Bumper.jl interface will deal with allocation and freeing of memory for you.

[Figure: Mandelbrot set rendered in the terminal with compiled Julia (printmandel.jl)]

Limitations:

In order to be standalone-compileable without linking to libjulia, you need to avoid (among probably other things):

  • GC allocations. Manual heap-allocation (malloc, calloc) and stack allocation (by convincing the Julia compiler to use alloca and put your object on the stack) are all fine though.
  • Non-constant global variables
  • Type instability.
  • Anything that could cause an InexactError or OverflowError -- so x % Int32 may work in some cases when Int32(x) may not.
  • Anything that could cause a BoundsError -- so @inbounds (or else julia --check-bounds=no) is mandatory. Consequently, @inbounds is always on for MallocArrays and StackArrays; be sure to treat them accordingly when indexing!
  • Functions that don't want to inline (can cause sneaky allocations due to boxing) -- feel free to use @inline liberally to avoid.
  • Multithreading
  • Microsoft Windows, except via WSL (StaticCompiler supports Windows by now; the remaining minor limitations are in this package). E.g., read "works" on Windows (and WSL), but its result can't be printed, likely because of UTF-16, or because some functions assume UTF-8 and others do not.
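The difference between the two integer conversions in the list above can be seen in plain Julia, without StaticTools or StaticCompiler. This is a minimal sketch; the variable names are purely illustrative, and it only demonstrates the runtime behavior, not a static compilation.

```julia
# Checked vs. wrapping integer conversion (plain Julia, no packages needed).
x = 3_000_000_000            # fits in Int64, but not in Int32

# `Int32(x)` is a checked conversion: it carries an error path that
# throws an InexactError when the value is out of range.
threw = try
    Int32(x)
    false
catch err
    err isa InexactError
end

# `x % Int32` keeps only the low 32 bits, so it has no error path.
wrapped = x % Int32

println(threw)     # true
println(wrapped)   # -1294967296
```

The wrapping form silently changes the value when it does not fit, so it is only a drop-in replacement when you know the input is in range.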

This package can help you with avoiding some of the above, but you'll still need to be quite careful in how you write your code! I'd recommend starting small and adding features slowly.

On the other hand, a surprising range of higher-order language features will work (e.g., multiple dispatch, metaprogramming) as long as they can happen before compile-time.
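As a rough illustration of language features that are resolved before compile-time, the sketch below generates methods with `@eval` and selects among them by dispatch on concrete argument types. The function names (`double`, `triple`, `scale`) are hypothetical, and the example runs in ordinary Julia; it does not itself invoke StaticCompiler.

```julia
# Generate a small family of methods with `@eval`.
# The code generation runs when these definitions are loaded,
# i.e. before any static compilation would take place.
for (name, factor) in ((:double, 2), (:triple, 3))
    @eval $name(x::Int64) = $factor * x
end

# Multiple dispatch, resolvable from the concrete argument types alone:
scale(x::Int64) = double(x)
scale(x::Float64) = 1.5 * x

println(double(21))   # 42
println(triple(5))    # 15
println(scale(4))     # 8
println(scale(2.0))   # 3.0
```

Because every call here has concrete argument types, the compiler can pick each method statically, which is the situation the paragraph above describes.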

While, as noted above, manually allocating your own memory on the heap with malloc or calloc and operating on that memory via pointers will work just fine (as is done in MallocArrays and MallocStrings), by doing this we have effectively stepped into a subset of Julia which we might call "unsafe Julia" -- the same subset you step into when you interact with C objects in Julia, but also one which means you're dealing with objects that don't follow the normal Julia object model. 👻
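The allocate, use, and free-exactly-once discipline can be sketched in plain Julia with the `Libc` wrappers from Base. This is only an illustration of the pattern: it is not how StaticTools implements its types (the README notes that the package relies on llvmcall rather than ccall), and `sum_of_squares` is a hypothetical helper.

```julia
# Manual heap allocation through a raw pointer (plain Julia, Base.Libc only).
function sum_of_squares(n::Int)
    # Uninitialized memory for n Float64 values; nothing here is GC-tracked.
    p = Ptr{Float64}(Libc.malloc(n * sizeof(Float64)))
    p == C_NULL && return NaN          # allocation failed (or n == 0)

    # Write and read through the pointer: 1-based offsets, no bounds checks.
    for i in 1:n
        unsafe_store!(p, Float64(i)^2, i)
    end
    total = 0.0
    for i in 1:n
        total += unsafe_load(p, i)
    end

    Libc.free(p)                       # free exactly once, then never touch p again
    return total
end

println(sum_of_squares(4))   # 30.0
```

Forgetting the `Libc.free` call leaks the memory, and calling it twice or reading through `p` afterwards is undefined behavior, which is why this style is described as "unsafe Julia".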

Fortunately, going to all this trouble does have some side benefits besides compileability:

  • Type instability is one of the biggest sources of unnecessarily bad performance in naive Julia code, especially when you're new to multiple dispatch -- well, you won't be able to make that mistake by accident here!
  • No GC means no GC pauses
  • Since we're only including what we need, binaries can be quite small (e.g. 8.4K for Hello World)
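One way to check a function for type instability before trying to compile it is to look at its inferred return type. The sketch below uses `Base.return_types` from plain Julia; the two functions are hypothetical examples, and interactively `@code_warntype` gives a more detailed view of the same information.

```julia
# Inspecting inferred return types (plain Julia, no packages needed).
unstable(x) = x > 0 ? 1 : 1.0      # returns an Int or a Float64 depending on the value
stable(x)   = x > 0 ? 1.0 : -1.0   # always returns a Float64

rt_unstable = only(Base.return_types(unstable, (Int,)))
rt_stable   = only(Base.return_types(stable, (Int,)))

# A concrete inferred return type is a necessary sign of type stability.
println(isconcretetype(rt_unstable))  # false
println(isconcretetype(rt_stable))    # true
```

A concrete return type does not prove that every intermediate value inside the function is type-stable, so treat this as a quick first check rather than a guarantee.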

Utilities

The utilities static_type and static_type_contents help convert an object to something similar whose fields and type parameters are amenable to static compilation.

static_type is mainly useful for converting objects that are heavily parameterized, which are common in the SciML infrastructure. A main object such as a DiffEq.Integrator has many type parameters, and by default some of them are not amenable to static compilation. static_type can be used to convert such objects to forms that allow numerical code to be statically compiled.

For the default rules, Arrays are converted to MallocArrays, and Strings are converted to MallocStrings. The default rules can be extended or redefined by using multiple dispatch and a context variable. Note however that these MallocArrays and MallocStrings must be freed when you are done with them.

Examples

Compiled command-line executables

Simple command-line executable with variable arguments:

# This is all StaticCompiler-friendly
using StaticTools

function print_args(argc::Int, argv::Ptr{Ptr{UInt8}})
    # c"..." lets you construct statically-sized, stack allocated `StaticString`s
    # We also have m"..." and MallocString if you want the same thing but on the heap
    printf(c"Argument count is %d:\n", argc)
    for i=1:argc
        # iᵗʰ input argument string
        pᵢ = unsafe_load(argv, i) # Get pointer
        strᵢ = MallocString(pᵢ) # Can wrap to get high-level interface
        println(strᵢ)
        # No need to `free` since we didn't allocate this memory
    end
    println(c"That was fun, see you next time!")
    return 0
end

# Compile executable
using StaticCompiler
filepath = compile_executable(print_args, (Int64, Ptr{Ptr{UInt8}}), "./")

and...

shell> ./print_args 1 2 3 4 5.0 foo
Argument count is 7:
./print_args
1
2
3
4
5.0
foo
That was fun, see you next time!

shell> hyperfine './print_args hello there'
Benchmark 1: ./print_args hello there
  Time (mean ± σ):       2.6 ms ±   0.5 ms    [User: 0.9 ms, System: 0.0 ms]
  Range (min … max):     1.8 ms …   5.9 ms    542 runs

  Warning: Command took less than 5 ms to complete. Results might be inaccurate.

shell> ls -lh $filepath
  -rwxr-xr-x  1 user  staff   8.4K May 22 13:58 print_args

Note that the resulting executable is only 8.4 kilobytes in size!

Using MallocSlabBuffer for memory management

using StaticTools
function times_table(argc::Int, argv::Ptr{Ptr{UInt8}})
    argc == 3 || return printf(c"Incorrect number of command-line arguments\n")
    rows = argparse(Int64, argv, 2)            # First command-line argument
    cols = argparse(Int64, argv, 3)            # Second command-line argument

    buf = MallocSlabBuffer()
    @no_escape buf begin
        M = @alloc(Int, rows, cols)
        for i=1:rows
            for j=1:cols
                M[i,j] = i*j
            end
        end
        printf(M)
    end
    free(buf)
end

using StaticCompiler
filepath = compile_executable(times_table, (Int64, Ptr{Ptr{UInt8}}), "./")

giving

shell> ls -lh $filepath
-rwxr-xr-x 1 mason mason 16K Nov 15 19:10 times_table

shell> ./times_table 12, 7
1   2   3   4   5   6   7
2   4   6   8   10  12  14
3   6   9   12  15  18  21
4   8   12  16  20  24  28
5   10  15  20  25  30  35
6   12  18  24  30  36  42
7   14  21  28  35  42  49
8   16  24  32  40  48  56
9   18  27  36  45  54  63
10  20  30  40  50  60  70
11  22  33  44  55  66  77
12  24  36  48  60  72  84

MallocArrays with size determined at runtime:

If we want to have dynamically-sized arrays, we'll have to allocate them ourselves. The MallocArray type is one way to do that.

using StaticTools
function times_table_malloc(argc::Int, argv::Ptr{Ptr{UInt8}})
    argc == 3 || return printf(c"Incorrect number of command-line arguments\n")
    rows = argparse(Int64, argv, 2)            # First command-line argument
    cols = argparse(