PotCrackSeg

[TIV 2024] Segmentation of Road Negative Obstacles Based on Dual Semantic-Feature Complementary Fusion for Autonomous Driving

The official PyTorch implementation of Segmentation of Road Negative Obstacles Based on Dual Semantic-Feature Complementary Fusion for Autonomous Driving (IEEE TIV 2024).

We tested our code with Python 3.8, CUDA 11.3, cuDNN 8, and PyTorch 1.12.1. We provide a Dockerfile to build the Docker image we used. You can modify the Dockerfile as needed.
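If you run outside the provided Docker image, a quick check like the following can show whether your environment matches the tested versions. This is a minimal sketch, not part of the repository: the names `TESTED`, `matches`, and `detect_versions` are hypothetical, and other versions may well work even though they are untested.

```python
import importlib
import sys

# Versions the README reports testing with; other versions are simply untested.
TESTED = {"python": "3.8", "torch": "1.12.1", "cuda": "11.3"}


def matches(found, tested):
    """Return True when `found` starts with the components of `tested`."""
    if found is None:
        return False
    # Drop local build suffixes such as "+cu113" before comparing.
    found_parts = found.split("+")[0].split(".")
    tested_parts = tested.split(".")
    return found_parts[:len(tested_parts)] == tested_parts


def detect_versions():
    """Collect installed versions; torch and cuda stay None if torch is missing."""
    found = {"python": "%d.%d" % sys.version_info[:2], "torch": None, "cuda": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return found
    found["torch"] = torch.__version__
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


if __name__ == "__main__":
    for name, version in detect_versions().items():
        status = "ok" if matches(version, TESTED[name]) else "differs from tested"
        print(f"{name}: {version} ({status}; tested with {TESTED[name]})")
```

A mismatch is only a hint, not an error; the Docker image remains the reference environment.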

Demo

The accompanying video can be found at:

Introduction

PotCrackSeg is an RGB-Depth fusion network with a dual semantic-feature complementary fusion module for the segmentation of potholes and cracks in traffic scenes.

Dataset

The NPO++ dataset is upgraded from the existing NPO dataset by re-labeling potholes and cracks. You can download the NPO++ dataset from here.

Pretrained weights

The pretrained weights of PotCrackSeg can be downloaded from here.

Usage

Clone this repo

$ git clone https://github.com/lab-sun/PotCrackSeg.git

Build docker image

$ cd ~/PotCrackSeg
$ docker build -t docker_image_potcrackseg .

Download the dataset

$ (You should be in the PotCrackSeg folder)
$ mkdir ./NPO++
$ cd ./NPO++
$ (download our preprocessed NPO++.zip in this folder)
$ unzip -d . NPO++.zip
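Before starting the container, it can help to confirm that the dataset folder actually exists and is not empty. The helper below is a hedged sketch; `missing_folders` is a hypothetical name, and it only checks for a non-empty folder, not for the internal layout of NPO++, which this README does not describe.

```python
from pathlib import Path


def missing_folders(repo_root, required=("NPO++",)):
    """Return the required folders that are absent or empty under repo_root."""
    root = Path(repo_root).expanduser()
    missing = []
    for name in required:
        folder = root / name
        # A folder counts as ready only if it exists and contains something.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "~/PotCrackSeg" is the clone location used in the commands above.
    print(missing_folders("~/PotCrackSeg"))
```

An empty list means every listed folder was found. The same call can take `required=("NPO++", "weights_backup")` once the pretrained weights have been unpacked.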

To reproduce our results, you need to download our pretrained weights.

$ (You should be in the PotCrackSeg folder)
$ mkdir ./weights_backup
$ cd ./weights_backup
$ (download our weights_backup.zip into this folder)
$ unzip -d . weights_backup.zip
$ docker run -it --shm-size 8G -p 1234:6006 --name docker_container_potcrackseg --gpus all -v ~/PotCrackSeg:/workspace docker_image_potcrackseg
$ (currently, you should be in the docker)
$ cd /workspace
$ (To reproduce the results)
$ python3 run_demo.py

The results will be saved in the ./runs folder. By default, the demo reproduces the PotCrackSeg-4B results. To reproduce the PotCrackSeg-2B results, change PotCrackSeg-4B to PotCrackSeg-2B in run_demo.py.

To train PotCrackSeg

$ (You should be in the PotCrackSeg folder)
$ docker run -it --shm-size 8G -p 1234:6006 --name docker_container_potcrackseg --gpus all -v ~/PotCrackSeg:/workspace docker_image_potcrackseg
$ (currently, you should be in the docker)
$ cd /workspace
$ python3 train.py

To see the training process

$ (fire up another terminal)
$ docker exec -it docker_container_potcrackseg /bin/bash
$ cd /workspace
$ tensorboard --bind_all --logdir=./runs/tensorboard_log/
$ (fire up your favorite browser with http://localhost:1234, you will see the tensorboard)
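The `-p 1234:6006` option in the `docker run` command maps host port 1234 to port 6006 in the container, which is TensorBoard's default port. If the page does not load, a small reachability check can tell you whether anything is listening on the host side. This is a sketch with a hypothetical helper name, `port_open`; it tests only that a TCP connection succeeds, not that TensorBoard itself is healthy.

```python
import socket


def port_open(host="localhost", port=1234, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("TensorBoard port reachable:", port_open())
```

If this reports False while TensorBoard is running, check that the container was started with the port mapping and that TensorBoard was launched with `--bind_all`.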

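The results will be saved in the ./runs folder.

Note: Please set the smoothing factor on the TensorBoard webpage to 0.999; otherwise, the trends may be hard to see in the noisy plots.

If you get the error docker: Error response from daemon: could not select device driver, please install the NVIDIA Container Toolkit on your computer first. If docker run reports that the container name docker_container_potcrackseg is already in use (for example, after running the demo above), remove the old container with docker rm docker_container_potcrackseg or re-enter it with docker start -ai docker_container_potcrackseg.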
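To see why a smoothing factor as high as 0.999 helps, the sketch below applies a debiased exponential moving average to a noisy series. It assumes TensorBoard's smoothing slider behaves like this kind of average; the function name `smooth` and the synthetic loss curve are illustrative only and are not taken from the repository.

```python
import random


def smooth(values, weight=0.999):
    """Debiased exponential moving average of a scalar series."""
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Divide by (1 - weight**step) so early points are not pulled toward 0.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic loss: a slow downward trend buried in per-step noise.
    noisy = [1.0 - 0.0001 * i + rng.uniform(-0.5, 0.5) for i in range(5000)]
    trend = smooth(noisy)
    print("raw last value:     ", noisy[-1])
    print("smoothed last value:", trend[-1])
```

With a weight near 1, each plotted point averages over roughly the last thousand steps, so the per-step noise cancels out and the slow trend becomes visible.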

Citation

If you use PotCrackSeg in your work, please cite:

@ARTICLE{feng2024segmentation,
  author={Zhen Feng and Yanning Guo and Yuxiang Sun},
  journal={IEEE Transactions on Intelligent Vehicles}, 
  title={Segmentation of Road Negative Obstacles Based on Dual Semantic-Feature Complementary Fusion for Autonomous Driving}, 
  year={2024},
  volume={9},
  number={4},
  pages={4687-4697},
  doi={10.1109/TIV.2024.3376534}}

Acknowledgement

Some of the code is borrowed from IGFNet.
