Timep
`timep` is an efficient and accurate state-of-the-art trap-based profiler and flamegraph generator for bash code. `timep` does much more than "providing per-command execution times" -- it maps the full call-stack tree for the bash code being profiled, and (optionally) uses that call-stack tree to generate a FlameGraph of the profiled bash commands!
timep
timep is an efficient, state-of-the-art trap-based time profiler for bash code. It generates a per-command execution time profile for the bash code being profiled. As it builds this profile, timep logs command runtimes and metadata hierarchically, based on both function and subshell nesting depth, mapping and recreating the complete call-stack tree of the profiled code.
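To make "trap-based" concrete, here is a toy sketch of the general technique: a DEBUG trap that timestamps each command and records the function chain and subshell depth alongside it. This is an illustration only and is not timep's implementation. The names toy_debug_trap, toy_logfile, inner and outer are made up for the example, and it assumes bash 5.0+ (for EPOCHREALTIME). It makes no attempt to handle subshells, background jobs, or its own overhead.

```shell
#!/usr/bin/env bash
# Toy DEBUG-trap timer (illustration only; NOT how timep is implemented).
# Assumes bash >= 5.0 for EPOCHREALTIME.

set -o functrace              # let the DEBUG trap fire inside functions too

toy_logfile=$(mktemp)         # hypothetical log location for this sketch
toy_prev_us=''                # start time (microseconds) of the previous command
toy_prev_desc=''              # metadata describing the previous command

toy_debug_trap() {
    # Strip the decimal separator to get an integer number of microseconds.
    local now_us=${EPOCHREALTIME/[.,]/}
    # The previous command ended (roughly) when this one starts: log its duration.
    if [[ -n $toy_prev_us ]]; then
        printf '%8d us  %s\n' "$(( now_us - toy_prev_us ))" "$toy_prev_desc" >>"$toy_logfile"
    fi
    toy_prev_us=$now_us
    # FUNCNAME[0] is this handler, so skip it when recording the function chain.
    toy_prev_desc="line=$1 subshell_depth=$BASH_SUBSHELL funcs=[${FUNCNAME[*]:1}] cmd=$2"
}

inner() { sleep 0.05; }
outer() { echo hello; inner; echo bye; }

trap 'toy_debug_trap "$LINENO" "$BASH_COMMAND"' DEBUG
outer
trap - DEBUG                  # this command also triggers the trap once, flushing the last entry

toy_report=$(<"$toy_logfile")
rm -f "$toy_logfile"
printf '%s\n' "$toy_report"
```

Even this small version shows why a real profiler needs more bookkeeping: the handler's own cost is counted in every measurement, and nothing here tracks commands that run in forked subshells.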
CURRENT TIMEP VERSION: timep v1.10.1
The timep v1.10 release is a smaller "quality of life" release that incorporates the following changes:
- /dev/shm is no longer a hard dependency. The loadable builtin timep.so file and the flamegraph generation perl script now follow the same logic used to choose the timep tmpdir (/dev/shm is preferred, but if it is unavailable, $TMPDIR, /tmp, and $PWD are tried in decreasing order of preference).
- The flamegraph generation workflow has been parallelized. After the parallel primary log processing finishes, flamegraph generation runs in parallel with final output profile generation (resulting in a much shorter time until the output profile is printed to the screen). Additionally, when the dual-stack and quad-stack flamegraphs are created, the 4x dual-stack ones are made in parallel and the 2x quad-stack ones are also made in parallel.
- The way timep aggregates the combined time totals (shown at the bottom of the profiles) has been overhauled so that they more accurately describe the actual runtime (without instrumentation overhead). Three times are now shown:
- "SELF RUN TIME": the "wall-clock" time that it actually took the command to run (this is new)
- "TOTAL RUN TIME": the "wall-clock time" from all parallel branches of the code summed tog
timep v1.10.1: added a guard for BASH_ENV and a new stress test
See CHANGELOG.md for the changes introduced in previous timep updates. To use one of the older versions of timep, download its release or use it via its tag.
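The tmpdir preference order from the v1.10 notes (/dev/shm, then $TMPDIR, /tmp, and $PWD) can be sketched as a simple "first usable directory wins" loop. The helper name pick_first_writable_dir and the exact usability test (exists and is writable) are assumptions made for this example; timep's own checks may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of timep): print the first argument that is
# an existing, writable directory. Returns 1 if none qualifies.
pick_first_writable_dir() {
    local dir
    for dir in "$@"; do
        # Skip empty entries (e.g. an unset TMPDIR) and unusable paths.
        if [[ -n $dir && -d $dir && -w $dir ]]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Candidates in decreasing order of preference, as listed in the release notes.
if chosen=$(pick_first_writable_dir /dev/shm "${TMPDIR:-}" /tmp "$PWD"); then
    echo "would place temporary files under: $chosen"
else
    echo "no writable directory found" >&2
fi
```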
BUILTIN FLAMEGRAPH GENERATOR
One standout feature of timep is that, in addition to the time profile, it generates call-stack traces that can be used directly with timep_flamegraph.pl (included in this repo; a modified version of flamegraph.pl from Brendan Gregg's FlameGraph repo with a new --color=timep option for use with timep). If you pass timep the --flame flag, timep will automatically download (if needed) a copy of flamegraph.pl and use it to generate both "full" and "folded" flamegraph SVG images. However, unlike typical flamegraphs (which are built from stack traces), these flamegraphs are built from bash commands and their associated runtimes, and the different levels represent combined function+subshell nesting depth. Additionally, these flamegraphs use a custom 'timep' coloring scheme, which colors each command by the time it took to run and uses a perceptually and spatially equalized color mapping to produce flamegraphs that are easy to interpret.
Note: set timep_GENERATE_FLAMEGRAPHS_BY_DEFAULT at the top of the code to control whether timep generates flamegraphs by default (without requiring a flag). The current default is to generate them automatically.
USING TIMEP
USAGE: . /path/to/timep.bash; timep [-s|-f|-c] [-k] [-t] [-F|--flame] [-o <type>] [--] << SCRIPT/FUNCTION/COMMAND TO PROFILE >>
In other words, source timep.bash and then simply add timep before the function/script/commands you want to profile! The code being profiled needs ZERO changes to work with timep...timep handles everything for you! (including automatically redirecting stdin to the stdin of whatever is being profiled, when needed).
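A minimal session following the usage line above might look like the sketch below. TIMEP_BASH and count_lines are placeholders invented for this example (timep does not read a TIMEP_BASH variable), and the script falls back to running the function unprofiled when timep.bash is not found at that path.

```shell
#!/usr/bin/env bash
# TIMEP_BASH is a placeholder for this example, not something timep itself reads.
TIMEP_BASH="${TIMEP_BASH:-/path/to/timep.bash}"

# Any ordinary function works; it does not need to know about timep.
count_lines() {
    local n=0 line
    while IFS= read -r line; do
        n=$(( n + 1 ))
    done
    echo "read $n lines"
}

if [[ -r $TIMEP_BASH ]]; then
    . "$TIMEP_BASH"
    # Profile the function (fed from stdin) and also request flamegraphs.
    printf '%s\n' a b c | timep --flame count_lines
    # Outputs are reachable through the symlink timep leaves in the current directory.
    ls ./timep.profiles/
else
    echo "timep.bash not found at $TIMEP_BASH; running without profiling" >&2
    printf '%s\n' a b c | count_lines
fi
```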
OUTPUTS: timep generates 2 time profiles and, if -F or --flame is passed, several flamegraph SVG images plus 2 stack traces (the flamegraph inputs). These outputs are always saved to disk in the "profiles" directory in the timep tmpdir (by default: /dev/shm/.timep/timep-XXXXXXXX). Upon finishing, timep creates a symlink in your PWD at ./timep.profiles that points to the "profiles" dir containing all the timep outputs.
DETAILS ON OUTPUTS:
2 time profiles: "out.profile.full" and "out.profile"
- out.profile.full: contains all individual commands and metadata info like the chain of FUNCNAME's and the chain of subshell PIDs
- out.profile: commands repeated by loops have been collapsed into combined entries that show the number of times the command was repeated and the total run time from all of them. By default this is printed to the screen upon completion.
if --flame is passed as a flag:
2 stack traces (intended to be passed to "timep_flamegraph.pl"): "out.flamegraph.full" and "out.flamegraph"
2 flamegraphs: out.flamegraph.ALL.svg and out.flamegraph.ALL.R.svg. These are both "quad-stack" (4-in-1) flamegraphs. They contain the same info, but that info is grouped differently.
Several flamegraph .svg files are generated from the above two "out.flamegraph" files and saved in the "flamegraphs" subdirectory of the profile dir. There are 4 "base" SVGs that show wall-clock time and CPU time for the full and the folded stack traces. These 4 SVGs are then combined (vertically stacked) in various combinations to produce extremely informative dual- and quad-stacked flamegraphs. The quad-stacked flamegraph.ALL.svg and flamegraph.ALL.R.svg flamegraphs both contain all 4 "base" flamegraphs (grouped in different ways) and are probably the ones you want to use.
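After a run, a quick way to see which of the outputs listed above are present is to walk the ./timep.profiles symlink. The function name list_timep_outputs is made up for this sketch; the file names are the ones given above, and the SVGs are simply listed by extension rather than by assumed names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which documented timep outputs exist in a profiles dir.
list_timep_outputs() {
    local dir=${1:-./timep.profiles} name
    # The two profiles, plus the two stack traces that only exist with --flame.
    for name in out.profile out.profile.full out.flamegraph out.flamegraph.full; do
        if [[ -f $dir/$name ]]; then
            printf 'found    %s\n' "$name"
        else
            printf 'missing  %s\n' "$name"
        fi
    done
    # SVGs are described as living in the "flamegraphs" subdirectory.
    if [[ -d $dir/flamegraphs ]]; then
        find "$dir/flamegraphs" -name '*.svg' | sort
    fi
}

list_timep_outputs "${1:-./timep.profiles}"
```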
NOTE ON INTERPRETING THE TOTAL RUNTIMES IN THE PROFILE:
- The "SELF RUN TIME" is the "wall-clock time" that the command actually took to run, i.e., how long you had to wait from starting the code until it finished.
- The "TOTAL RUN TIME" is the combined sum of the "wall-clock time" from the main process being profiled plus all of its bash descendant processes. If it has no descendants (i.e., it never forks a background process that runs asynchronously), this is just the standard "wall-clock time". For code that runs several processes in parallel, it is similar to the "total CPU time (sys+user)", except that it sums the wall-clock time that each process ran for.
- The "TOTAL CPU TIME" is equivalent to the combined sys+user time reported by other timing tools.
- NOTE: timep's overhead has been removed/corrected for in all 3 of these times. Each should be very close to the time you would have gotten if you ran the command without using timep.
The big difference between the two "TOTAL" times is that:
- TOTAL RUN TIME includes time spent idling and waiting (via wait, a blocking read, waiting on I/O, etc.), when CPU usage was basically zero but the process was still running, and
- if you call a binary (not a shell script) that is inherently multithreaded, TOTAL RUN TIME adds the time spent waiting for the binary to finish, while TOTAL CPU TIME adds the total CPU time the binary used.
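The distinction between elapsed wall-clock time and wall-clock time summed over parallel branches can be reproduced in plain bash, without timep. The sketch below runs two sleeps in parallel and compares the two figures; now_us and worker are names invented for the example, it assumes bash 5.0+ (for EPOCHREALTIME), and it is not how timep computes its totals.

```shell
#!/usr/bin/env bash
# Plain-bash illustration (no timep involved). Assumes bash >= 5.0.

# Current time as an integer number of microseconds.
now_us() { local t=$EPOCHREALTIME; echo "${t/[.,]/}"; }

# Sleep for $1 seconds and print how long this branch ran, in microseconds.
worker() {
    local start end
    start=$(now_us)
    sleep "$1"
    end=$(now_us)
    echo $(( end - start ))
}

tmp=$(mktemp -d)
start=$(now_us)
worker 0.3 >"$tmp/a" &        # two branches running at the same time
worker 0.3 >"$tmp/b" &
wait
end=$(now_us)

elapsed_us=$(( end - start ))                      # what you actually waited
summed_us=$(( $(<"$tmp/a") + $(<"$tmp/b") ))       # every branch's wall-clock time added up
rm -rf "$tmp"

printf 'elapsed wall-clock:        %d.%06d s\n' $(( elapsed_us / 1000000 )) $(( elapsed_us % 1000000 ))
printf 'summed over both branches: %d.%06d s\n' $(( summed_us / 1000000 )) $(( summed_us % 1000000 ))
```

The elapsed figure should come out at roughly one sleep's duration and the summed figure at roughly two, while the CPU time stays near zero because sleep spends its time idle. That is the same pattern the notes above describe for the run-time totals versus TOTAL CPU TIME.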
EXAMPLE
testfunc() {
    trap 'echo RETURN' RETURN;
    f() { echo "f: $*"; }
    g() ( trap 'echo EXIT' EXIT; echo "g: $*"; )
    h() {
        echo "h: $*";
        f "$@";
        g "$@";
    }

    echo 0
    { echo 1; }
    ( echo 2 )
    echo 3 &
    { echo 4; } &
    echo 5 | cat | tee

    for (( kk=6; kk<10; kk++ )); do
        echo $kk
        h $kk
        for jj in {1..3}; do
            f $kk $jj
            g $kk $jj
        done
    done
}
timep testfunc
gives
LINE DEPTH CMD   COMBINED WALL-CLOCK TIME         COMBINED CPU TIME                COMMAND
line.depth.cmd:  ( time | total % | cur depth % ) ( time | total % | cur depth % ) (count) <command>
_______________  __________________________________________________________________  ____________________________________
1.0.0:           ( 0.565324s |100.00% )           ( 0.572306s |100.00% )           (1x)  << (FUNCTION): main.testfunc "${@}" >>
1.1.0:           ( 0.000070s |  0.01% )           ( 0.000093s |  0.01% )           (1x)  testfunc "${@}"
2.1.0:           ( 0.022847s |  4.04% )           ( 0.022775s |  3.97% )           (1x)  trap 'echo RETURN' RETURN
5.1.0:           ( 0.002009s |  0.35% )           ( 0.002016s |  0.35% )           (1x)  TRAP (RETURN): echo RETURN
11.1.0:          ( 0.000095s |  0.01% )           ( 0.000108s |  0.01% )           (1x)  echo 0
12.1.0:          ( 0.000770s |  0.13% )           ( 0.000670s |  0.11% )           (1x)  echo 1
13.1.0:          ( 0.000176s |  0.03% )           ( 0.000203s |  0.03% )           (1x)  << (SUBSHELL) >>
13.2.0:          ( 0.000176s |  0.03% |100.00% )  ( 0.000203s |  0.03% |100.00% )  (1x)  └─echo 2
14.1.0:          ( 0.000480s |  0.08% )           ( 0.000512s |  0.08% )           (1x)  echo 3 (&)
15.1.0:          ( 0.000162s |  0.02% )           ( 0.000188s |  0.03% )           (1x)  << (BACKGROUND FORK) >>
15.2.0:          ( 0.000162s |  0.02% |100.00% )  ( 0.000188s |  0.03% |100.00% )  (1x)  └─echo 4
16.1.0:          ( 0.004038s |  0.71% )           ( 0.013263s |  2.31% )           (1x)  echo 5 | cat | tee
18.1.0:          ( 0.000070s |  0.01% )           ( 0.000084s |  0.01% )           (1x)  ((kk=6))
18.1.0:          ( 0.000282s |  0.04% |  0.01% )  ( 0.000327s |  0.05% |  0.01% )  (4x)  ((kk++ ))
18.1.1:          ( 0.000365s |  0.06% |  0.01% )  ( 0.000434s |  0.07% |  0.01% )  (5x)  ((kk<10))
19.1.0:          ( 0.000289s |  0.05% |  0.01% )  ( 0.000346s |  0.06% |  0.01% )  (4x)  echo $kk
20.1.0:          ( 0.144182s | 25.50% |  6.37% )  ( 0.143671s | 25.10% |  6.27% )  (4x)  << (FUNCTION): main.testfunc.h $kk >>
1.2.0: ( 0.0003
