# VisionSense

<p align="center"> <img src="assets/Logo_Symbol_Dark.png" alt="VisionSense Logo" width="150"/> </p>
<p align="center"> <b>Advanced Autonomous Vehicle Perception System</b><br> Real-time perception powered by TensorRT on NVIDIA Jetson </p>
<p align="center"> <a href="#features">Features</a> • <a href="#system-architecture">Architecture</a> • <a href="#installation">Installation</a> • <a href="#usage">Usage</a> • <a href="#nodes">Nodes</a> </p>

https://github.com/user-attachments/assets/8b5bfc2b-9bf6-4562-895b-04ba0c5b41e3
## Overview
VisionSense is a comprehensive ROS2-based computer vision system designed for autonomous vehicles running on NVIDIA Jetson platforms with JetPack 6.2. It provides a complete perception pipeline with real-time object detection, lane detection, traffic sign recognition, stereo depth estimation, and driver monitoring capabilities.
## Features

| Feature | Description | Model/Method |
|---------|-------------|--------------|
| Object Detection | Detect vehicles, pedestrians, cyclists, traffic signs/lights | YOLOv8 + TensorRT |
| Multi-Object Tracking | Track objects across frames with unique IDs | BYTE Tracker + Kalman Filter |
| Lane Detection | Segment and detect lane lines | Neural Network + TensorRT |
| Traffic Sign Recognition | Classify 50+ traffic sign types | YOLOv8 Classifier + TensorRT |
| Stereo Depth Estimation | Dense depth maps from stereo camera | LightStereo + TensorRT |
| Driver Monitoring | Face detection and gaze estimation | YOLOv11 + ResNet18 + TensorRT |
| Data Fusion GUI | Real-time visualization of all perception data | OpenCV + X11 |
| Web Dashboard | Remote monitoring interface | HTTP Server |
## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ VisionSense Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Mono Camera │ │Stereo Camera │ │ IMU/GPS │ │
│ │ (CSI/USB) │ │ (Arducam) │ │ Module │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ camera │ │ camera_stereo│ │ imu_gps │ │
│ │ node │ │ node │ │ node │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ├────────┬───────────┘ │
│ ┌──────────────┐ │ │ │
│ │ driver │ ▼ ▼ │
│ │ monitor │ ┌─────────┐ ┌─────────┐ │
│ └──────┬───────┘ │ detect │ │ stereo │ │
│ │ │ node │ │ depth │ │
│ │ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ │ ┌────┴────┐ │ │
│ │ ▼ ▼ │ │
│ │ ┌─────────┐ ┌─────────┐ │ │
│ │ │classify │ │ lanedet │ │ │
│ │ │ node │ │ node │ │ │
│ │ └────┬────┘ └────┬────┘ │ │
│ │ │ │ │ │
│ │ └─────┬─────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────┐ │ │
│ │ │ adas │ │ │
│ │ │ node │ │ │
│ │ └────┬─────┘ │ │
│ │ │ │ │
│ └───────────────┼──────────────┘ │
│ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ GUI │ │ Dashboard │ │
│ │ (Display) │ │ (Web) │ │
│ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## System Requirements

| Component | Requirement |
|-----------|-------------|
| Hardware | NVIDIA Jetson Orin Nano/NX/AGX |
| OS | Ubuntu 22.04 (JetPack 6.2) |
| ROS2 | Humble Hawksbill |
| CUDA | 12.6+ |
| TensorRT | 10.x |
| OpenCV | 4.x with CUDA support |
## Nodes
### 1. Camera Node (`camera`)
Captures video from mono cameras (CSI or USB) for driver monitoring.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| resource | string | csi://0 | Camera source URI |
| width | int | 1280 | Frame width |
| height | int | 720 | Frame height |
Topics Published:
- `/camera/raw` (`sensor_msgs/Image`) - Raw camera frames

Supported Sources:
- CSI Camera: `csi://0`
- USB Camera: `v4l2:///dev/video0`
- Video File: `file:///path/to/video.mp4`
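The three URI schemes above can be told apart with a small dispatcher; a minimal sketch using only the standard library (the `parse_camera_resource` helper is ours for illustration, not part of VisionSense):

```python
from urllib.parse import urlparse

def parse_camera_resource(resource: str) -> tuple[str, str]:
    """Classify a camera resource URI into (kind, target).

    Mirrors the three schemes the camera node accepts:
    csi://<index>, v4l2://<device path>, file://<video path>.
    """
    parsed = urlparse(resource)
    if parsed.scheme == "csi":
        return ("csi", parsed.netloc)   # sensor index, e.g. "0"
    if parsed.scheme == "v4l2":
        return ("v4l2", parsed.path)    # device node, e.g. "/dev/video0"
    if parsed.scheme == "file":
        return ("file", parsed.path)    # video file path
    raise ValueError(f"unsupported camera resource: {resource}")
```

For example, `parse_camera_resource("v4l2:///dev/video0")` returns `("v4l2", "/dev/video0")`.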
### 2. Stereo Camera Node (`camera_stereo`)
Handles Arducam stereo camera with synchronized left/right image capture and CUDA-accelerated rotation.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| resource | string | /dev/video1 | V4L2 device path |
| width | int | 3840 | Full stereo width (1920×2) |
| height | int | 1200 | Stereo height |
| framerate | int | 30 | Capture framerate |
| rotated_lenses | bool | false | Apply 90° rotation to each eye |
| cuda_flip | string | rotate-180 | CUDA flip mode: rotate-180, vertical-flip, horizontal-flip, or empty for none |
Topics Published:
- `/camera_stereo/left/image_raw` (`sensor_msgs/Image`) - Left camera (1200×1200)
- `/camera_stereo/right/image_raw` (`sensor_msgs/Image`) - Right camera (1200×1200)
CUDA Kernels:
- Left eye: 90° counter-clockwise rotation
- Right eye: 90° clockwise rotation
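The geometry of the split-and-rotate step can be sketched on the CPU with NumPy; `np.rot90` stands in for the node's CUDA kernels, and the shapes assume the 3840×1200 side-by-side default (1920×1200 per eye). Any crop or resize down to the published 1200×1200 frames is not shown here:

```python
import numpy as np

def split_and_rotate(frame: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a side-by-side stereo frame and undo the 90° lens rotation.

    CPU sketch of what the node's CUDA kernels do:
    left eye rotated 90° counter-clockwise, right eye 90° clockwise.
    """
    h, w = frame.shape[:2]
    left, right = frame[:, : w // 2], frame[:, w // 2 :]
    left = np.rot90(left, k=1)     # 90° counter-clockwise
    right = np.rot90(right, k=-1)  # 90° clockwise
    return left, right
```

A 3840×1200 input yields two 1200×1920 eye images after rotation.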
### 3. Stereo Depth Node (`stereo_depth`)
Computes dense depth maps using LightStereo neural network with TensorRT acceleration.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model | string | LightStereo-S-KITTI.engine | TensorRT engine path |
| max_disparity | float | 192.0 | Maximum disparity value |
| warmup_iterations | int | 5 | Model warmup runs |
Topics Subscribed:
- `left/image_raw` (`sensor_msgs/Image`) - Left stereo image
- `right/image_raw` (`sensor_msgs/Image`) - Right stereo image

Topics Published:
- `/stereo_depth/disparity` (`sensor_msgs/Image`) - Normalized disparity (mono8, 0-255)
Model Specifications:
- Input: Stereo pair (preprocessed with aspect-preserving resize and RightTopPad)
- Output: Dense disparity map (resized back to input dimensions)
- Architecture: LightStereo-S (KITTI trained)
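Since the published disparity is normalized to mono8, a consumer must undo the normalization (scaling by `max_disparity`) before converting to metric depth via `depth = focal × baseline / disparity`. A sketch, where the focal length and baseline are placeholder values for illustration (VisionSense does not publish them; use your rig's calibration):

```python
import numpy as np

# Placeholder rig intrinsics -- assumed values, not from VisionSense.
FOCAL_PX = 1000.0      # focal length in pixels
BASELINE_M = 0.06      # stereo baseline in meters
MAX_DISPARITY = 192.0  # matches the node's max_disparity default

def disparity_to_depth(mono8: np.ndarray) -> np.ndarray:
    """Undo the mono8 normalization, then convert disparity to depth.

    depth = focal * baseline / disparity; zero-disparity pixels map to inf.
    """
    disp = mono8.astype(np.float32) / 255.0 * MAX_DISPARITY
    with np.errstate(divide="ignore"):
        return FOCAL_PX * BASELINE_M / disp
```

With these placeholders, a saturated pixel (255 → 192 px of disparity) maps to 1000 × 0.06 / 192 ≈ 0.31 m.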
### 4. Object Detection Node (`detect`)
Real-time object detection using YOLOv8 with TensorRT and multi-object tracking.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model | string | detect.engine | TensorRT engine path |
| labels | string | labels_detect.txt | Class labels file |
| thresholds | float[] | [0.40, 0.45, ...] | Per-class confidence thresholds |
| track_frame_rate | int | 30 | Tracking frame rate |
| track_buffer | int | 30 | Lost track buffer size |
Detected Classes:

| ID | Class | Threshold |
|----|-------|-----------|
| 0 | Pedestrian | 0.45 |
| 1 | Cyclist | 0.45 |
| 2 | Vehicle-Car | 0.60 |
| 3 | Vehicle-Bus | 0.45 |
| 4 | Vehicle-Truck | 0.45 |
| 5 | Train | 0.50 |
| 6 | Traffic Light | 0.40 |
| 7 | Traffic Sign | 0.55 |
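Per-class thresholds let frequent classes (e.g. cars at 0.60) demand higher confidence than rarer ones (traffic lights at 0.40). A sketch of how the `thresholds` parameter plausibly gates raw detections (the function and the `(class_id, confidence)` tuples are ours for illustration):

```python
# Per-class confidence thresholds, indexed by class ID (values from the table above).
THRESHOLDS = [0.45, 0.45, 0.60, 0.45, 0.45, 0.50, 0.40, 0.55]

def filter_detections(detections: list[tuple[int, float]]) -> list[tuple[int, float]]:
    """Keep only (class_id, confidence) pairs at or above their class threshold."""
    return [(cid, conf) for cid, conf in detections if conf >= THRESHOLDS[cid]]
```

A car detection at 0.55 confidence is dropped (threshold 0.60), while a pedestrian at 0.50 passes (threshold 0.45).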
Topics Subscribed:
- `/detect/image_in` (`sensor_msgs/Image`) - Input image

Topics Published:
- `/detect/detections` (`visionconnect/Detect`) - Detection results with tracking
- `/detect/signs` (`visionconnect/Signs`) - Cropped traffic signs for classification
Tracking Features:
- BYTE tracker with Kalman filter prediction
- Unique ID assignment per tracked object
- ID format: `{ClassName}_{ID}` (e.g., `Car_001`, `Pedestrian_003`)
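The ID format above is easy to reproduce when labeling tracks downstream; a one-line sketch (the helper name is ours, and we assume the three-digit zero padding implied by the examples):

```python
def track_label(class_name: str, track_id: int) -> str:
    """Format a track label like Car_001, matching the node's ID scheme."""
    return f"{class_name}_{track_id:03d}"
```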
### 5. Traffic Sign Classification Node (`classify`)
Classifies detected traffic signs and lights into 50+ categories.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model | string | classify.engine | TensorRT engine path |
| labels | string | labels_classify.txt | Class labels file |
| thresholds | float[] | [0.30, 0.75] | Traffic light/sign thresholds |
Supported Sign Categories:
- Traffic Lights: Red, Yellow, Green
- Regulatory Signs: Stop, Yield, Speed Limits (15-70 mph), No Entry, No