rftrace - Rust Function Tracer

rftrace is a Rust-based function tracer. It provides both a backend, which does the actual tracing, and a frontend, which writes the traces to disk. The backend is designed to be standalone and not interact with the system. As such, it can be used to partially trace a kernel like the Hermit kernel through OS, interrupts, stdlib and application. Multiple threads are supported. It also works for normal Rust and C applications, though better tools exist for that use case.

Requires a recent nightly Rust compiler (as of 28-6-2020).

Design

I was in need of a function tracer, which works in both kernel and userspace to trace a Hermit application. Preferably without manually annotating source code, as a plug and play solution. Since Hermit also has a gcc toolchain, it should work with applications instrumented with both rustc and gcc.

The best way to do this is to use the function instrumentation provided by the compilers, where they insert mcount() calls in each function prologue. This is possible in gcc with the -pg flag, and in rustc with the newly added -Z instrument-mcount flag. The same mechanism is used successfully by e.g. uftrace, which already provides Rust support.

This tracer is split into two parts: a backend and a frontend.

The backend is a static library which provides said mcount() call and is responsible for logging every function entry and exit into a buffer. It is written in Rust, but is no_std and does not even need an allocator. Unlike uftrace, it does not rely on any communication with external software (such as the OS for e.g. thread IDs). It does require thread-local storage though.
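To illustrate how entries and exits can be logged without an allocator, here is a minimal sketch of a fixed-capacity ring buffer that overwrites the oldest events once it is full. The names `Event`, `EventKind` and `EventBuffer` are hypothetical and are not rftrace's actual types; the real backend writes into a buffer handed over by the frontend.

```rust
/// Kind of trace event (hypothetical, not rftrace's real type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EventKind {
    Empty,
    Entry,
    Exit,
}

/// One logged event (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Event {
    pub kind: EventKind,
    pub time: u64, // raw timestamp, e.g. TSC ticks
    pub addr: u64, // address of the traced function
    pub tid: u64,  // tracer-assigned thread id
}

const EMPTY: Event = Event { kind: EventKind::Empty, time: 0, addr: 0, tid: 0 };

/// Fixed-capacity ring buffer that only needs `core`, no heap.
/// N must be greater than zero.
pub struct EventBuffer<const N: usize> {
    events: [Event; N],
    next: usize, // total number of events written so far
}

impl<const N: usize> EventBuffer<N> {
    pub const fn new() -> Self {
        Self { events: [EMPTY; N], next: 0 }
    }

    /// Store an event, overwriting the oldest one when the buffer is full.
    pub fn push(&mut self, ev: Event) {
        self.events[self.next % N] = ev;
        self.next += 1;
    }

    /// Number of valid events currently stored.
    pub fn len(&self) -> usize {
        core::cmp::min(self.next, N)
    }

    /// Iterate over the stored events from oldest to newest.
    pub fn iter(&self) -> impl Iterator<Item = &Event> + '_ {
        let len = self.len();
        let start = if self.next > N { self.next % N } else { 0 };
        (0..len).map(move |i| &self.events[(start + i) % N])
    }
}
```

A fixed array keeps the hot path to one store and one increment, which matters because this code would run on every function entry and exit.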

Since it is compiled separately as a static library, we can even use a different target architecture. This is needed to easily embed the library into our application, which is for example allowed to use SSE registers. These will cause an abort when used in the wrong situations in the kernel though! By compiling the staticlib against a kernel target, we avoid this issue and can trace kernel and userspace simultaneously. Another reason for this sub-compilation is that, unlike gcc, Rust does not yet provide a mechanism to selectively disable instrumentation. We cannot instrument the mcount function itself, or we would get infinite recursion.

The frontend interfaces with the backend via a few function calls. It provides the backend with an event buffer (needed since the backend does not allocate), and is responsible for saving the traces once done. In theory it is easily replaceable with your own, but the API is not yet fleshed out.

Dependencies

The function-prologues of the traced application have to be instrumented with mcount. This can be done with the rustc compiler option -Z instrument-mcount, or gcc's -pg flag.

The backend implicitly assumes a System V ABI. This affects which registers need to be saved and restored on each function entry and exit, and how function-exit hooking is done. If you use a different convention, check whether mcount() and mcount_return_trampoline() handle the correct registers.

For the logging of callsites and function exits, frame pointers are needed, so make sure your compiler does not omit them as an optimization.

For tracing kernel+application in one trace, a single-address-space OS like HermitCore is needed. Not all functions can currently be hooked. Naked functions are somewhat broken. Hooking interrupts is broken as well and will lead to intermittent crashes. Unfortunately, the Rust compiler has no mechanism to opt out of mcount instrumentation for specific functions, so you have to take care to only enable rftrace in allowed contexts. rftrace currently only runs cleanly if exactly one CPU core is available.

There are no other dependencies required for recording a trace. The output format is the same as the one used by uftrace, so you will need uftrace to view and convert it. There are (currently out-of-date) scripts in /tools which can merge traces from multiple different sources. These need python3.

When tracing a custom kernel, it needs to provide the capability to write files into a directory, otherwise we cannot save the trace. It also needs to support thread-local storage, since we use it for the shadow return stack and for thread-ID allocation.
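The role of thread-local storage as a shadow return stack can be sketched as follows. On function entry the hook remembers the real return address (the real backend would then redirect the return to its trampoline); on exit the saved address is popped again. This is a hosted simulation using std's `thread_local!`, and the names `ShadowStack`, `push_return`, `pop_return` and the depth limit `MAX_DEPTH` are hypothetical, not taken from rftrace.

```rust
use std::cell::RefCell;

/// Maximum nesting depth tracked per thread (assumed value for this sketch).
const MAX_DEPTH: usize = 1024;

/// Per-thread stack of saved return addresses (hypothetical layout).
struct ShadowStack {
    addrs: [usize; MAX_DEPTH],
    depth: usize,
}

thread_local! {
    static SHADOW: RefCell<ShadowStack> =
        RefCell::new(ShadowStack { addrs: [0; MAX_DEPTH], depth: 0 });
}

/// Entry hook: save the address the traced function should return to.
/// Returns false if the stack is full, meaning the exit cannot be hooked.
fn push_return(addr: usize) -> bool {
    SHADOW.with(|s| {
        let mut s = s.borrow_mut();
        if s.depth == MAX_DEPTH {
            return false;
        }
        let d = s.depth;
        s.addrs[d] = addr;
        s.depth += 1;
        true
    })
}

/// Exit hook: recover the most recently saved return address, if any.
fn pop_return() -> Option<usize> {
    SHADOW.with(|s| {
        let mut s = s.borrow_mut();
        if s.depth == 0 {
            return None;
        }
        s.depth -= 1;
        Some(s.addrs[s.depth])
    })
}
```

Because every thread gets its own stack, nested calls on one thread never mix with calls on another, which is why the traced kernel has to support thread-local storage.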

Usage

There are 4 usage examples in /examples: Rust and C, both on normal Linux x64 and Hermit. These are the only tested architectures.

Adding rftrace to your application

Linux Rust application

To use rftrace, add both the backend and a frontend to your dependencies.

[dependencies]
rftrace = "0.3"
rftrace-frontend = "0.3"

Ensure that frame pointers are generated! Debug builds always seem to have them enabled.

Enable -Z instrument-mcount, either by setting the environment variable RUSTFLAGS="-Z instrument-mcount", or by including the following in .cargo/config:

[build]
rustflags=["-Z", "instrument-mcount"]

When using vscode, the first can easily be done by modifying your compile task to include

"options": {
    "env": {
        "RUSTFLAGS": "-Z instrument-mcount",
    }
},

To actually do the tracing, you have to also add some code to your crate, similar to the following

fn main() {
    let events = rftrace::init(1000000, true);
    rftrace::enable();

    run_tests();

    rftrace::dump_full_uftrace(events, "/trace", "binaryname", false)
        .expect("Saving trace failed");
}

Hermit

When tracing Hermit, the backend is linked directly to the kernel. This is enabled with the instrument feature of the hermit crate. Therefore we only need the frontend in our application. By using the instrument feature, the kernel is always instrumented. To additionally log function calls of your application, set the instrument-mcount rustflag as seen above.

I further suggest using at least opt-level 2, otherwise a lot of useless clutter will be created by the stdlib. (We are building it ourselves here with -Z build-std=std,... so it is affected by the instrument rustflag!)

An example with a Makefile, which does all the needed trace gathering, timing conversions and kvm-event merging to get a nice trace, is provided in /examples/hermitrust. It can be compiled and run with make runkvm.

[dependencies]
hermit = { version = "0.8", default-features = false, features = ["instrument"] }
rftrace-frontend = "0.3"

Any other kernel

Backend features which might be of interest are:

  • interruptsafe - will save and restore more registers on function exits, to ensure interrupts do not clobber them. Probably only needed when interrupts are instrumented. Can be disabled for performance reasons.

Output Format

The frontend outputs a trace folder compatible with uftrace. See uftrace's Data Format.

Note that the time will be WRONG, since we output it in raw TSC counts, and not nanoseconds. You could convert this by determining the TSC frequency and using merge.py. Also see: Time alignment Guest <-> Host.
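The conversion itself is simple arithmetic once the TSC frequency is known. The following is a minimal sketch of it, not the code used by merge.py; the function name `tsc_to_ns` is hypothetical, and how you determine the frequency is left open.

```rust
/// Convert raw TSC ticks to nanoseconds for a known TSC frequency in Hz.
/// Returns None if the frequency is zero or the result does not fit in u64.
pub fn tsc_to_ns(ticks: u64, tsc_hz: u64) -> Option<u64> {
    if tsc_hz == 0 {
        return None;
    }
    // Widen to u128 so that ticks * 1e9 cannot overflow before the division.
    let ns = (ticks as u128) * 1_000_000_000u128 / (tsc_hz as u128);
    if ns > u64::MAX as u128 {
        None
    } else {
        Some(ns as u64)
    }
}
```

For example, with an assumed TSC frequency of 2.4 GHz, 2400 ticks correspond to 1000 ns.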

Also note that TIDs are not the ones assigned by the host. The backend, having no dependencies at all, does not query TIDs, but assigns its own. The first thread it sees will get TID 1, the second TID 2, and so on.
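This numbering scheme can be sketched with a global atomic counter plus a thread-local slot that caches the assigned ID. It is a hosted illustration using std; the names `NEXT_TID`, `MY_TID` and `current_tid` are hypothetical and the backend's actual implementation may differ.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU64, Ordering};

/// Next TID to hand out; the first thread seen gets 1.
static NEXT_TID: AtomicU64 = AtomicU64::new(1);

thread_local! {
    /// TID of the current thread; 0 means "not assigned yet".
    static MY_TID: Cell<u64> = Cell::new(0);
}

/// Return the tracer-assigned TID of the calling thread,
/// assigning a fresh one the first time a thread calls this.
fn current_tid() -> u64 {
    MY_TID.with(|tid| {
        if tid.get() == 0 {
            tid.set(NEXT_TID.fetch_add(1, Ordering::Relaxed));
        }
        tid.get()
    })
}
```

No call into the OS is needed, which fits the backend's goal of having no external dependencies, at the cost of TIDs that do not match what the host reports.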

The full trace consists of 5+ files, 4 for metadata plus 1 per TID which contains the actual trace:

  • /<TID>.dat: contains trace of thread TID. Might be multiple if multithreaded
  • /info: general info about cpu, mem, cmdline, version
  • /task.txt: contains PID, TID, SID<->exename mapping
  • /sid-<SID>.map: contains mapping of addr to exename. By default, the memory map is faked. You can enable linux-mode, in which case /proc/self/maps is copied.
  • /<exename>.sym: contains symbols of exe, like output of nm --demangle -n (has to be sorted!). Symbols are never generated and always have to be done by hand.
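Since the symbol file has to be sorted by address, here is a small sketch that renders a symbol list in an nm-like layout (address, type letter, name). The `Symbol` struct and `render_sym_file` function are hypothetical, and the exact column layout is an assumption modeled on typical nm output; in practice you would generate the file by running nm on your binary.

```rust
/// One symbol entry: start address, nm-style type letter, demangled name.
pub struct Symbol {
    pub addr: u64,
    pub kind: char,
    pub name: String,
}

/// Render symbols as one line per symbol, sorted by ascending address.
pub fn render_sym_file(mut symbols: Vec<Symbol>) -> String {
    // The viewer expects ascending addresses, so sort before writing.
    symbols.sort_by_key(|s| s.addr);
    let mut out = String::new();
    for s in &symbols {
        // 16 hex digits, zero padded, as nm prints for 64-bit binaries.
        out.push_str(&format!("{:016x} {} {}\n", s.addr, s.kind, s.name));
    }
    out
}
```

Passing the symbols in any order yields the same file, so sources that are not already sorted can be fed in directly.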

Chrome trace viewer

A very nice way to visualize the trace is using the Chrome trace viewer. It can show custom JSON traces, similar to a flamegraph but interactive. uftrace can convert to this format with uftrace dump --chrome > trace.json

  • 'Legacy' Interface: open Chrome, go to chrome://tracing. This opens an interface called catapult.
  • 'Modern' Interface: Perfetto. Looks nicer, but has a limited zoom level.
  • For both, I su