# piccolo

An experimental stackless Lua VM implemented in pure Rust
(After four years, now UN-paused!)
## Project Goals, in roughly descending priority:
- Be an arguably working, useful Lua interpreter.
- Be an easy way to confidently sandbox untrusted Lua scripts.
- Be resilient against DoS from untrusted scripts (scripts should not be able to cause the interpreter to panic or use an unbounded amount of memory and should be guaranteed to return control to the caller in some bounded amount of time).
- Be an easy way to bind Rust APIs to Lua safely, with a bindings system that is resilient against weirdness and edge cases, and with user types that can safely participate in runtime garbage collection.
- Be pragmatically compatible with some version(s) of PUC-Rio Lua.
- Don't be obnoxiously slow (for example, avoid abstractions that would make the interpreter fundamentally slower than PUC-Rio Lua).
You can read more about the design of piccolo (and try out a live REPL!) in
this blog post.
## API Instability
Expect frequent pre-1.0 API breakage; this crate is still very experimental. All API-incompatible changes will be accompanied by minor version bumps, but these will be very common.
## Safety
The goal with piccolo is to have the majority of it written in safe Rust.
Currently, there are a few sources of unsafety, but crucially these sources
of unsafety are isolated. piccolo will avoid at all costs relying on
abstractions which leak unsafety; it should always be possible to interact
with even the low-level details of piccolo without using `unsafe`.
The current primary sources of unsafety:
- The particularly weird requirements of Lua tables require using hashbrown's low-level `RawTable` API.
- Userdata requires unsafety to allow for downcasting non-'static userdata with a safe interface.
- The implementation of async `Sequence`s requires unsafety to "tunnel" the normal `Sequence` method parameters into the future (this is completely hidden from the user behind a safe interface).
- Unsafe code is required to avoid fat pointers in several Lua types, to keep `Value` as small as possible and allow potential future smaller `Value` representations.
(piccolo makes no attempt yet to guard against side channel attacks like
Spectre, so even if the VM is memory safe, running untrusted scripts may carry
additional risk. With no JIT or callback API to accurately measure time, such
attacks might be practically impossible anyway.)
## A unique system for Rust <-> GC interaction
The garbage collector system for piccolo is now in its own repo, and also on crates.io. See the README in the
linked repo for more detail about the GC design.
piccolo has a real, cycle-detecting, incremental garbage collector with
zero-cost `Gc` pointers (they are machine pointer sized and implement `Copy`)
that are usable from safe Rust. It achieves this by combining two things:
- An unsafe `Collect` trait which allows tracing through garbage collected types and which, despite being unsafe, can be implemented safely using procedural macros.
- Branding `Gc` pointers by unique, invariant "generative" lifetimes to ensure that such pointers are isolated to a single root object, and to guarantee that, outside an active call to `mutate`, all such pointers are either reachable from the root object or are safe to collect.
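The lifetime-branding idea can be sketched in plain Rust, independent of the real crate's API (the `Branded` and `Arena` names here are invented for illustration): a higher-ranked closure bound gives every `mutate` call a fresh, invariant lifetime, so branded values cannot escape it.

```rust
use std::marker::PhantomData;

// A value branded with the invariant lifetime 'gc; it cannot outlive
// the mutate callback that produced it. (Illustrative only.)
struct Branded<'gc>(i32, PhantomData<*mut &'gc ()>);

struct Arena;

impl Arena {
    // The `for<'gc>` bound means 'gc is fresh and unique for every call,
    // and the return type R cannot mention it, so Branded values cannot
    // escape the callback.
    fn mutate<R>(&self, f: impl for<'gc> FnOnce(Branded<'gc>) -> R) -> R {
        f(Branded(7, PhantomData))
    }
}

fn main() {
    let arena = Arena;
    // We can compute with branded values inside mutate, but returning
    // the Branded value itself would not compile; it is tied to this call.
    let n = arena.mutate(|b| b.0 * 2);
    println!("{}", n);
}
```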
## Stackless VM
The `mutate`-based GC API means that long-running calls to `mutate` can be
problematic. No garbage collection can take place during a call to `mutate`, so
we have to make sure to regularly return from the `mutate` call to allow garbage
collection to take place.
The VM in piccolo is thus written in what is sometimes called "stackless"
or "trampoline" style. It does not rely on the Rust stack for Lua -> Rust and
Rust -> Lua nesting; instead, callbacks can either have some kind of immediate
result (return values, yield values from a coroutine, resume a thread, error),
or they can produce a `Sequence`. A `Sequence` is a bit like a `Future` in
that it is a multi-step operation that the parent `Executor` will drive to
completion. `Executor` will repeatedly call `Sequence::poll` until the sequence
is complete, and the `Sequence` can yield values and call arbitrary Lua
functions while it is being polled.
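In rough sketch form, the poll-driven protocol looks something like this (all names are invented for illustration, not piccolo's real `Sequence`/`Executor` API):

```rust
/// What one poll of a multi-step operation can produce.
/// (Illustrative stand-in for a richer SequencePoll-style enum.)
enum SeqPoll {
    /// Not finished; the executor should poll again later.
    Pending,
    /// Finished with a final value.
    Return(i64),
}

/// A multi-step operation driven to completion by the executor.
trait Seq {
    fn poll(&mut self) -> SeqPoll;
}

/// A countdown that needs several polls before it finishes.
struct Countdown(u32);

impl Seq for Countdown {
    fn poll(&mut self) -> SeqPoll {
        if self.0 == 0 {
            SeqPoll::Return(42)
        } else {
            self.0 -= 1;
            SeqPoll::Pending
        }
    }
}

/// The "executor": repeatedly polls the sequence until it completes,
/// regaining control between every poll.
fn drive(seq: &mut dyn Seq) -> i64 {
    loop {
        match seq.poll() {
            SeqPoll::Pending => continue, // a real executor could pause here
            SeqPoll::Return(v) => return v,
        }
    }
}

fn main() {
    let mut c = Countdown(3);
    println!("{}", drive(&mut c));
}
```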
As an example, it is of course possible for Lua to call a Rust callback, which
then in turn creates a new Lua coroutine and runs it. In order to do so, a
callback would take a Lua function as a parameter, create a new coroutine
`Thread` from it, and return `SequencePoll::Resume` to run it. The outer main
`Executor` will run the created `Thread`, and when it is finished it will
"return" via `Sequence::poll` (or `Sequence::error`). This is exactly how the
`coroutine.resume` Lua stdlib function is implemented.
As another example, `pcall` is easy to implement here: a callback can call the
provided function with a `Sequence` underneath it, and the sequence can catch
the error and return the error status.
Yet another example: imagine Rust code calling a Lua coroutine thread which
calls a Rust `Sequence` which calls yet more Lua code which then yields. Our
stack will look something like this:

[Rust] -> [Lua Coroutine] -> [Rust Sequence] -> [Lua code that yields]

This is no problem with this VM style: the inner Rust callback is paused as a
`Sequence`, and the inner yield will return the value all the way to the top-level
Rust code. When the coroutine thread is resumed and eventually returns,
the Rust `Sequence` will be resumed.
With any number of nested Lua threads and `Sequence`s, control will always
continuously return outside the GC arena to the outer Rust code driving
everything. This is the "trampoline": when using this interpreter,
somewhere there is a loop that is continuously calling `Arena::mutate` and
`Executor::step`, and it can stop, pause, or change tasks at any time without
needing to unwind the Rust stack.
This "stackless" style has many benefits: it allows for concurrency patterns that are difficult in some other VMs (like tasklets), and it makes the VM much more resilient against untrusted-script DoS.
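The tasklet-style concurrency this enables can be illustrated with a toy round-robin scheduler: because every task returns control after one unit of work, an outer loop can interleave any number of them. This is a standalone sketch, not piccolo code.

```rust
// Toy round-robin "tasklet" scheduler. Each tasklet is (name, steps left);
// one unit of its work is logged per scheduling round.
fn run_tasklets() -> Vec<char> {
    let mut log = Vec::new();
    let mut tasks: Vec<(char, u32)> = vec![('a', 2), ('b', 3)];
    while !tasks.is_empty() {
        let mut i = 0;
        while i < tasks.len() {
            let (name, steps_left) = &mut tasks[i];
            log.push(*name); // one unit of work for this tasklet
            *steps_left -= 1;
            if *steps_left == 0 {
                tasks.remove(i); // tasklet finished
            } else {
                i += 1;
            }
        }
    }
    log
}

fn main() {
    // Work from both tasklets is interleaved, one step at a time.
    println!("{:?}", run_tasklets());
}
```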
## Async Sequences
The downside of the "stackless" style is that writing things as a `Sequence`
implementation is much more difficult than writing in normal, straight control
flow. This is identical to the problem Rust had before proper async support,
where it required implementing `Future` manually or using difficult-to-use
combinators. Ideally, if we could somehow implement `Collect` for the
state machine generated for a Rust `async` block, then we could use Rust async (or more
directly, unstable Rust coroutines) to implement our `Sequence` state machines.
Unfortunately, implementing a trait like this for a Rust async (coroutine) state
machine is not currently possible. HOWEVER, piccolo is currently still able to
provide a safe way to implement `Sequence` using `async` blocks by using a clever
trick: a shadow stack.
The `async_sequence` function can create a `Sequence` impl from an `async`
block, and the generated `Future` tells the outer sequence what actions to
take on its behalf. Since the Rust future cannot (safely) hold GC pointers
(it cannot possibly implement `Collect` in today's Rust), we instead
allow it to hold proxy "stashed" values, and these "stashed" values point to
a "shadow stack" held inside the outer sequence, which allows them to be traced
and collected properly! We provide a `Locals` object inside async sequences,
and this is the future's "shadow stack"; it can be used to stash / fetch any
GC value, and any values stashed using this object are treated as owned by the
outer `Sequence`. In this way, we end up with a Rust future that can store GC
values safely, both in the sense of being sound and not leading to dangling
`Gc` pointers, but also in a way that cannot possibly lead to things like
uncollectable cycles. It is slightly more inconvenient than if Rust `async` blocks
could implement `Collect` directly (it requires entering and exiting the GC
context manually and stashing / unstashing GC values), but it is MUCH easier
than manually implementing a custom `Sequence` state machine!
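The shadow-stack trick can be sketched in miniature (the `Locals`/`Stashed` names mirror the description above, but this is an illustrative stand-in, not piccolo's real API): the state machine holds only small `Copy` handles, while the values themselves live in storage owned by the outer sequence, where a collector could trace them.

```rust
/// A small Copy handle that a future could hold across suspension
/// points in place of a real GC pointer.
#[derive(Clone, Copy)]
struct Stashed(usize);

/// Stand-in for the "shadow stack": values stashed here are owned by
/// the outer sequence, so they stay traceable and collectable.
struct Locals {
    slots: Vec<String>, // String stands in for an arbitrary GC value
}

impl Locals {
    fn stash(&mut self, v: String) -> Stashed {
        self.slots.push(v);
        Stashed(self.slots.len() - 1)
    }
    fn fetch(&self, s: Stashed) -> &str {
        &self.slots[s.0]
    }
}

fn main() {
    let mut locals = Locals { slots: Vec::new() };
    // The "future" keeps only the Copy handle while suspended...
    let handle = locals.stash("a gc value".to_string());
    // ...and fetches the real value through the shadow stack on resume.
    println!("{}", locals.fetch(handle));
}
```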
Using this, it is easy to write very complex Rust callbacks that can themselves
call into Lua or resume threads or yield values back to Lua (or simply return
control to the outermost Rust code), while also maintaining complex internal
state. In addition, these running callbacks are themselves proper garbage
collected values, and all of the GC values they hold will be collected if they
are (for example) forgotten as part of a suspended Lua coroutine. Without async
sequences, this would require writing complex state machines by hand, so this is
critical for very complex uses of piccolo.
## Executor "fuel" and VM memory tracking
The stackless VM style "periodically" returns control to the outer Rust code driving everything, and how often this happens can be controlled using the "fuel" system.
Lua and Lua-driven callback code always runs within some call to
`Executor::step`. This method takes a `fuel` parameter which controls how long
the VM should run before pausing, with fuel measured (roughly) in units of VM
instructions.
Different amounts of fuel provided to `Executor::step` bound the amount of Lua
execution that can occur, bounding both the CPU time used and also the amount of
memory that can be allocated.
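As a toy model of the fuel mechanism (invented names, not piccolo's actual API), an interpreter loop can decrement a fuel counter per instruction and hand control back to the caller when it runs out:

```rust
// Toy fuel-bounded VM: runs at most `fuel` instructions per step,
// then returns control to the caller.
struct Vm {
    pc: usize,          // next instruction to execute
    program_len: usize, // total instructions in the program
}

impl Vm {
    /// Execute at most `fuel` instructions; returns the leftover fuel
    /// and whether the program has run to completion.
    fn step(&mut self, mut fuel: u32) -> (u32, bool) {
        while fuel > 0 && self.pc < self.program_len {
            self.pc += 1; // stand-in for executing one VM instruction
            fuel -= 1;
        }
        (fuel, self.pc >= self.program_len)
    }
}

fn main() {
    let mut vm = Vm { pc: 0, program_len: 10 };
    // With only 4 fuel the VM pauses long before the program ends...
    assert_eq!(vm.step(4), (0, false));
    // ...and a later step resumes exactly where it left off.
    assert_eq!(vm.step(100), (94, true));
}
```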