NVTX (NVIDIA Tools Extension Library)

NVTX is a cross-platform API for annotating source code to provide contextual information to developer tools.

The NVTX API is written in C, with wrappers provided for C++ and Python.

| C Docs | C++ Docs | Python Docs |
| --- | --- | --- |

What does NVTX do?

By default, NVTX API calls do nothing. When you launch a program from a developer tool, NVTX calls in that program are redirected to functions in the tool. Developer tools are free to implement NVTX API calls however they wish.

Here are some examples of what a tool might do with NVTX calls:

- Print a message to the console
- Record a trace of when NVTX calls occur, and display them on a timeline
- Build a statistical profile of NVTX calls, or time spent in ranges between calls
- Enable/disable tool features in ranges bounded by NVTX calls matching some criteria
- Forward the data to other logging APIs or event systems

Example: Visualize loop iterations on a timeline

This C++ example annotates some_function with an NVTX range using the function's name. This range begins at the top of the function body, and automatically ends when the function returns. The function performs a loop, sleeping for one second in each iteration. A local nvtx3::scoped_range annotates the scope of the loop body. The loop iteration ranges are nested within the function range.

#include <chrono>
#include <thread>

#include <nvtx3/nvtx3.hpp>

void some_function()
{
    NVTX3_FUNC_RANGE();  // Range around the whole function

    for (int i = 0; i < 6; ++i) {
        nvtx3::scoped_range loop{"loop range"};  // Range for iteration

        // Make each iteration last for one second
        std::this_thread::sleep_for(std::chrono::seconds{1});
    }
}

Normally, this program waits for 6 seconds, and does nothing else.

Launch it from NVIDIA Nsight Systems, and you'll see this execution on a timeline:

[Image: Example NVTX Ranges in Nsight Systems]

The NVTX row shows the function's name "some_function" in the top-level range and the "loop range" message in the nested ranges. The loop iterations each last for the expected one second.

What kinds of annotation does NVTX provide?

Markers

Markers annotate a specific point in a program's execution with a message. Optional extra fields may be provided: a category, a color, and a payload value.

Ranges

Ranges annotate a range between two points in a program's execution, like a related pair of markers. There are two types of ranges:

- Push/Pop ranges, which can be nested to form a stack
  - The Pop call is automatically associated with a prior Push call on the same thread
- Start/End ranges, which may overlap with other ranges arbitrarily
  - The Start call returns a handle which must be passed to the End call
  - These ranges can start and end on different threads

The C++ and Python interfaces provide objects and decorators for automatically managing the lifetimes of ranges.

Resource naming/tracking

Resource naming associates a displayable name string with an object. For example, naming CPU threads allows a tool that displays thread activity on a timeline to have more meaningful labels for its rows than a numeric thread ID.

Resource tracking extends the idea of naming to include object lifetime tracking, as well as important usage of the object. For example, a mutex provided by a platform API (e.g. pthread_mutex, CriticalSection) can be tracked by a tool that intercepts its lock/unlock API calls, so using NVTX to name these mutex objects is sufficient to see the names of mutexes being locked/unlocked on a timeline. However, manually implemented spin-locks may not have an interceptible API, so tools can't automatically detect when they are used. Use NVTX to annotate these kinds of locks where they are locked/unlocked so that tools can track them just like standard platform API mutexes.

How do I use NVTX in my code?

C and C++

For C and C++, NVTX is a header-only library with no dependencies. Simply #include the header(s) you want to use, and call NVTX functions! NVTX initializes automatically during the first call to any NVTX function.

It is not necessary to link against a binary library. On POSIX platforms, adding the -ldl option to the linker command-line is required.

NOTE: Older versions of NVTX did require linking against a dynamic library. NVTX version 3 provides the same API, but removes the need to link with any library. Ensure you are including NVTX v3 by using the nvtx3 directory as a prefix in your #includes:

C:

#include <nvtx3/nvToolsExt.h>

void example()
{
    nvtxMark("Hello world!");
}

C++:

#include <nvtx3/nvtx3.hpp>

void example()
{
    nvtx3::mark("Hello world!");
}

On Windows, be aware that NVTX includes windows.h. Without intervention, this will define tokens in the global namespace such as min, max, and small. Take care to #define any desired macros such as WIN32_LEAN_AND_MEAN, NOMINMAX, etc. before including any NVTX headers.

The NVTX C++ API is a set of wrappers around the C API, so the C API functions are usable from C++ as well.

Since the C and C++ APIs are header-only, dependency-free, and don't require explicit initialization, they are suitable for annotating other header-only libraries.

See more details in the c directory of this repo, and in the API reference guides.

CMake

For projects that use CMake, the CMake scripts included with NVTX provide targets nvtx3-c and nvtx3-cpp. Use target_link_libraries to make any CMake target use nvtx3-c for the C API only and nvtx3-cpp for both the C and C++ APIs. Since NVTX is a header-only library, these targets simply add the include search path for the NVTX headers and add the -ldl linker option where required. Example usage:

# Example C program
add_executable(some_c_program main.c)
target_link_libraries(some_c_program PRIVATE nvtx3-c)
# main.c can now do #include <nvtx3/nvToolsExt.h>

# Example C++ program
add_executable(some_cpp_program main.cpp)
target_link_libraries(some_cpp_program PRIVATE nvtx3-cpp)
# main.cpp can now do #include <nvtx3/nvtx3.hpp>

NVTX provides two different ways to define the CMake targets:

Normal CMake targets (non-IMPORTED)

Non-IMPORTED targets are global to the entire build. In a typical CMake codebase, add_subdirectory is used to include every directory in a source tree, where each contains a CMakeLists.txt file that defines targets usable anywhere in the build. The NVTX CMakeLists.txt file defines normal (non-IMPORTED) targets when add_subdirectory is called on that directory.

This example code layout has a few imported third-party libraries and a separate directory for its own source. It shows that adding the NVTX directory to CMake allows the nvtx3-cpp target to be used elsewhere in the source tree:

- CMakeLists.txt
  ```
  add_subdirectory(Imports)
  add_subdirectory(Source)
  ```
- Imports/
  - CMakeLists.txt
    ```
    add_subdirectory(SomeLibrary)
    add_subdirectory(NVTX)
    add_subdirectory(SomeOtherLibrary)
    ```
  - SomeLibrary/
  - NVTX/ (this is the downloaded copy of NVTX)
    - CMakeLists.txt (defines nvtx3-c and nvtx3-cpp targets)
    - nvtxImportedTargets.cmake (helper script)
    - include/
      - nvtx3/ (all NVTX headers)
  - SomeOtherLibrary/
- Source/
  - CMakeLists.txt
    ```
    add_executable(my_program main.cpp)
    target_link_libraries(my_program PRIVATE nvtx3-cpp)
    ```
  - main.cpp (does #include <nvtx3/nvtx3.hpp>)

Another example is when the NVTX directory must be added with a relative path that is not a subdirectory. In this case, CMake requires a second parameter to add_subdirectory to give a unique name for the directory where build output goes:

- Utils/
  - SomeLibrary/
  - NVTX/ (this is the downloaded copy of NVTX)
    - CMakeLists.txt (defines nvtx3-c and nvtx3-cpp targets)
    - nvtxImportedTargets.cmake (helper script)
    - include/
      - nvtx3/ (all NVTX headers)
  - SomeOtherLibrary/
- Project1/
- Project2/
- Project3/
  - CMakeLists.txt
    ```
    add_subdirectory("${CMAKE_CURRENT_LIST_DIR}/../Utils/NVTX" "ImportNVTX")

    add_executable(my_program main.cpp)
    target_link_libraries(my_program PRIVATE nvtx3-cpp)
    ```
  - main.cpp (does #include <nvtx3/nvtx3.hpp>)

When defining normal (non-IMPORTED) targets, the NVTX CMake scripts avoid target-already-defined errors by checking if the targets exist before attempting to define them. This enables the following scenarios:

- The same NVTX directory can be added more than once
- Multiple directories with copies of the same NVTX version can be added
- Multiple directories with different versions of NVTX can be added
  - If the newest version is added first, everything should work:
    - The nvtx3-c/nvtx3-cpp targets will point to the newest version
  - If a newer version is added after an older version:
    - The nvtx3-c/nvtx3-cpp targets will point to the older version
    - If features of the newer version are used, compilation will fail
    - The NVTX CMake scripts print a warning for this case

Normal (non-IMPORTED) targets will be defined when using CPM (CMake Package Manager) to fetch NVTX directly from the internet. Thus, NVTX targets defined via CPM follow the behavior described above.
