Cupti
Profile how CUDA applications create and modify data in memory.
Setup
Install some dependencies
sudo apt install libnuma-dev libboost-all-dev
Install CUDA and cuDNN.
...
Modify env.sh to point to the right libraries.
Build the profiling library (prof.so).
make
Run on a CUDA application
Make sure your CUDA application does not link the CUDA runtime statically. Static linking is the default when you build your own CUDA code with nvcc, so pass --cudart shared to nvcc to link the runtime dynamically.
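One way to check the linking before profiling is to look at what `ldd` reports for the binary. The sketch below assumes the shared runtime appears under a name starting with `libcudart.so`; `uses_shared_cudart` is a hypothetical helper, not something shipped with this repository.

```shell
# Sketch: decide from `ldd` output whether a binary loads libcudart dynamically.
# uses_shared_cudart is a hypothetical helper, not part of this repository.
uses_shared_cudart() {
    # $1: the text printed by `ldd <binary>`
    printf '%s\n' "$1" | grep -q 'libcudart\.so'
}

# Usage (assumes ./my_app is your CUDA binary):
#   if uses_shared_cudart "$(ldd ./my_app)"; then
#       echo "shared CUDA runtime found"
#   else
#       echo "no shared libcudart; try rebuilding with --cudart shared"
#   fi
```

If the check fails, the preloaded library most likely cannot intercept the runtime calls, because they are resolved inside the binary rather than through the dynamic linker.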
The profiler records data by appending to an output.cprof file, so you will usually want to remove that file before each run. ./env.sh sets up the LD_PRELOAD environment and then invokes your app.
rm -f output.cprof
./env.sh <your app>
Do something with the result:
cprof2<something>.py
Other info
env.sh sets LD_PRELOAD to load the profiling library and its dependencies.
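For readers unfamiliar with LD_PRELOAD, a wrapper of this kind can be sketched roughly as follows. This is not the contents of env.sh: `prepend_preload` and the `$PWD/prof.so` path are assumptions for illustration, and the real script may set additional library paths.

```shell
# Sketch of an env.sh-style wrapper. prepend_preload and the prof.so path
# are hypothetical; the real env.sh may differ.
prepend_preload() {
    # $1: library to preload, $2: existing LD_PRELOAD value (may be empty)
    if [ -n "$2" ]; then
        # Keep whatever was already preloaded, with the new library first.
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Usage inside a wrapper script:
#   LD_PRELOAD="$(prepend_preload "$PWD/prof.so" "${LD_PRELOAD:-}")"
#   export LD_PRELOAD
#   exec "$@"
```

Prepending rather than overwriting keeps any libraries the caller already preloaded, and `exec "$@"` replaces the wrapper with the application so it inherits the exported variable.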
