TensorRTx

TensorRTx aims to implement popular deep learning networks with the TensorRT network definition API.

Why build a network from scratch with the more complex network definition API instead of using a parser (ONNX parser, UFF parser, Caffe parser, etc.)? I have summarized the advantages as follows.

Flexible: easy to modify the network, add/delete a layer or input/output tensor, replace a layer, merge layers, integrate preprocessing and postprocessing into the network, etc.
Debuggable: the entire network is constructed incrementally, so intermediate layer results are easy to get.
Educational: you learn the network structure during development rather than treating everything as a black box.

The basic workflow of TensorRTx is:

Get the trained models from PyTorch, MXNet, TensorFlow, etc. Some PyTorch models can be found in my repo pytorchx; the rest come from popular open-source repos.
Export the weights to a plain text file (a .wts file).
Load the weights in TensorRT, define the network, and build a TensorRT engine.
Load the TensorRT engine and run inference.
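
The weight export step can be sketched in plain Python. This is a minimal illustration, not the project's own script. It assumes a .wts layout of one count line followed by one line per tensor (name, element count, then each value as a big-endian float32 hex word); check the export script of the model you use for the exact format. The helper names `write_wts` and `read_wts` are hypothetical, and the weights are passed as already-flattened float lists so the sketch needs no deep learning framework.

```python
import struct
from typing import Dict, List


def write_wts(weights: Dict[str, List[float]], path: str) -> None:
    """Write flattened tensors to a plain text file (assumed .wts layout)."""
    with open(path, "w") as f:
        # First line: number of tensors in the file.
        f.write(f"{len(weights)}\n")
        for name, values in weights.items():
            # Each value becomes 8 hex digits: a big-endian float32.
            words = " ".join(struct.pack(">f", float(v)).hex() for v in values)
            f.write(f"{name} {len(values)} {words}\n")


def read_wts(path: str) -> Dict[str, List[float]]:
    """Read the file back, mirroring what a loader would do before building layers."""
    weights: Dict[str, List[float]] = {}
    with open(path) as f:
        count = int(f.readline())
        for _ in range(count):
            parts = f.readline().split()
            name, size = parts[0], int(parts[1])
            values = [struct.unpack(">f", bytes.fromhex(h))[0] for h in parts[2:]]
            if len(values) != size:
                raise ValueError(f"{name}: expected {size} values, got {len(values)}")
            weights[name] = values
    return weights
```

With a real model, the dictionary would be filled from the framework's parameters (for example, each tensor flattened to a list of floats) before calling `write_wts`. Storing the raw float32 bits as hex keeps the file human-readable while avoiding decimal rounding.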

News

3 Mar 2026. zgjja: Add Vision Transformer
2 Feb 2026. fazligorkembal: Yolo26-Det, Yolo26-Obb, Yolo26-Cls
15 Jan 2026. zgjja: Refactor multiple old CV models to support TensorRT SDK versions 7 through 10.
8 Jan 2026. ydk61: YOLOv13
10 May 2025. pranavm-nvidia: YOLO11 written in Tripy.
2 May 2025. fazligorkembal: YOLO12
12 Apr 2025. pranavm-nvidia: First Lenet example written in Tripy.
11 Apr 2025. mpj1234: YOLO11-obb
22 Oct 2024. lindsayshuo: YOLOv8-obb
18 Oct 2024. zgjja: Refactor docker image.
11 Oct 2024. mpj1234: YOLO11
9 Oct 2024. Phoenix8215: GhostNet V1 and V2.
21 Aug 2024. Lemonononon: real-esrgan-general-x4v3
29 Jul 2024. mpj1234: Check the YOLOv5, YOLOv8 & YOLOv10 in TensorRT 10.x API, branch → trt10
29 Jul 2024. mpj1234: YOLOv10
21 Jun 2024. WuxinrongY: YOLOv9-T, YOLOv9-S, YOLOv9-M
28 Apr 2024. lindsayshuo: YOLOv8-pose
22 Apr 2024. B1SH0PP: EfficientAd: Accurate Visual Anomaly Detection at Millisecond-Level Latencies.
18 Apr 2024. lindsayshuo: YOLOv8-p2

Tutorials

Test Environment

(NOT recommended) TensorRT 7.x
(Recommended) TensorRT 8.x
(NOT recommended) TensorRT 10.x

Note

For historical reasons, some models are limited to specific TensorRT versions; please check the README.md or code for the model you want to use.
Currently, TensorRT 8.x has the best compatibility and supports the most features.

How to run

Note: each network can be built with the CMakeLists.txt in its subfolder, or all of them can be built together with the CMakeLists.txt at the top of this project.

General procedures before building and running:

# 1. generate xxx.wts from https://github.com/wang-xinyu/pytorchx/tree/master/lenet
# ...

# 2. put xxx.wts on top of this folder
# ...

(Option 1) To build a single subproject in this project, do:

## enter the subfolder
cd tensorrtx/xxx

## configure & build
cmake -S . -B build
make -C build

(Option 2) To build many subprojects, first comment out, in the top CMakeLists.txt, the projects you don't want to build or that are not supported by your TensorRT version, e.g., you cannot build subprojects in ${TensorRT_8_Targets} if your TensorRT is 7.x. Then:

## enter the top of this project
cd tensorrtx

## configure & build
# you may use "Ninja" rather than "make" to significantly boost the build speed
cmake -G Ninja -S . -B build
ninja -C build

WARNING: This part is still under development; most subprojects are not adapted yet.

Run the generated executable, e.g.:

# serialize model to plan file i.e. 'xxx.engine'
build/xxx -s

# deserialize plan file and run inference
build/xxx -d

# (Optional) check if the output is the same as pytorchx/lenet
# ...

# (Optional) customize the project
# ...

For more details, see the README.md that each subfolder may contain.

Models

The following models are implemented.

| Name | Description |
| --- | --- |
| mlp | the very basic model for starters, properly documented |
| lenet | the simplest, as a "hello world" of this project |
| alexnet | easy to implement, all layers are supported in tensorrt |
| googlenet | GoogLeNet (Inception v1) |
| inception | Inception v3, v4 |
| mnasnet | MNASNet with depth multiplier of 0.5 from the paper |
| mobilenet | MobileNet v2, v3-small, v3-large |
| resnet | resnet-18, resnet-50 and resnext50-32x4d are implemented |
| senet | se-resnet50 |
| shufflenet | ShuffleNet v2 with 0.5x output channels |
| squeezenet | SqueezeNet 1.1 model |
