HybridNets: End-to-End Perception Network
<div align="center">
HybridNets network architecture.
</div>
<!-- TABLE OF CONTENTS --> <details> <summary>Table of Contents</summary> <ol> <li> <a href="#about-the-project">About The Project</a> <ul> <li><a href="#project-structure">Project Structure</a></li> </ul> </li> <li> <a href="#getting-started">Getting Started</a> <ul> <li><a href="#installation">Installation</a></li> <li><a href="#demo">Demo</a></li> </ul> </li> <li> <a href="#usage">Usage</a> <ul> <li><a href="#data-preparation">Data Preparation</a></li> <li><a href="#training">Training</a></li> </ul> </li> <li><a href="#training-tips">Training Tips</a></li> <li><a href="#results">Results</a></li> <li><a href="#license">License</a></li> <li><a href="#acknowledgements">Acknowledgements</a></li> <li><a href="#citation">Citation</a></li> </ol> </details>
by Dat Vu, Bao Ngo, Hung Phan<sup> :email:</sup>, FPT University
(<sup>:email:</sup>) corresponding author.
arXiv technical report (arXiv 2203.09035)
About The Project
<!-- #### <div align=center> **HybridNets** = **real-time** :stopwatch: * **state-of-the-art** :1st_place_medal: * (traffic object detection + drivable area segmentation + lane line detection) :motorway: </div> -->
HybridNets is an end-to-end perception network for multiple tasks. Our work focuses on traffic object detection, drivable area segmentation and lane detection. HybridNets runs in real time on embedded systems and achieves state-of-the-art object detection and lane detection results on the BDD100K dataset.

Project Structure
HybridNets
│   backbone.py                  # Model configuration
│   export.py                    # UPDATED 10/2022: ONNX weight with accompanying .npy anchors
│   hubconf.py                   # PyTorch Hub entrypoint
│   hybridnets_test.py           # Image inference
│   hybridnets_test_videos.py    # Video inference
│   train.py                     # Train script
│   train_ddp.py                 # DistributedDataParallel training (multiple GPUs)
│   val.py                       # Validate script
│   val_ddp.py                   # DistributedDataParallel validating (multiple GPUs)
│
├───encoders                     # https://github.com/qubvel/segmentation_models.pytorch/tree/master/segmentation_models_pytorch/encoders
│       ...
│
├───hybridnets
│       autoanchor.py            # Generate new anchors by k-means
│       dataset.py               # BDD100K dataset
│       loss.py                  # Focal, tversky (dice)
│       model.py                 # Model blocks
│
├───projects
│       bdd100k.yml              # Project configuration
│
├───ros                          # C++ ROS package for path planning
│       ...
│
└───utils
        constants.py
        plot.py                  # Draw bounding box
        smp_metrics.py           # https://github.com/qubvel/segmentation_models.pytorch/blob/master/segmentation_models_pytorch/metrics/functional.py
        utils.py                 # Various helper functions (preprocess, postprocess, eval...)
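As the tree above notes, hybridnets/loss.py implements focal and Tversky (dice) losses. For intuition only, here is a minimal NumPy sketch of the Tversky index — not the repo's exact implementation, and the `alpha`/`beta` values are illustrative:

```python
import numpy as np

def tversky_index(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Tversky index between binary masks.

    alpha penalises false positives, beta false negatives;
    alpha = beta = 0.5 reduces to the Dice coefficient.
    """
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    tp = np.sum(pred * target)            # true positives
    fp = np.sum(pred * (1.0 - target))    # false positives
    fn = np.sum((1.0 - pred) * target)    # false negatives
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)

# A perfect prediction scores exactly 1.0
mask = np.array([[1, 0], [0, 1]])
print(tversky_index(mask, mask))
```

A Tversky *loss* is then simply `1 - tversky_index(...)`, optionally combined with a focal term as the filename suggests.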
Getting Started 
Installation
The project was developed with Python>=3.7 and PyTorch>=1.10.
git clone https://github.com/datvuthanh/HybridNets
cd HybridNets
pip install -r requirements.txt
Demo
# Download end-to-end weights
curl --create-dirs -L -o weights/hybridnets.pth https://github.com/datvuthanh/HybridNets/releases/download/v1.0/hybridnets.pth
# Image inference
python hybridnets_test.py -w weights/hybridnets.pth --source demo/image --output demo_result --imshow False --imwrite True
# Video inference
python hybridnets_test_videos.py -w weights/hybridnets.pth --source demo/video --output demo_result
# Result is saved in a new folder called demo_result
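hybridnets_test.py handles loading, preprocessing and postprocessing internally (see utils.py in the tree above). For intuition, the usual letterbox preprocessing for this kind of detector can be sketched as follows — the 384x640 shape and function name are illustrative, and a real pipeline would use cv2.resize instead of this nearest-neighbour stand-in:

```python
import numpy as np

def letterbox(img, new_shape=(384, 640), pad_value=114):
    """Resize keeping aspect ratio, then pad to the target shape."""
    h, w = img.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)   # scale factor
    nh, nw = int(round(h * r)), int(round(w * r))
    # nearest-neighbour resize via index sampling (cv2.resize in practice)
    ys = (np.arange(nh) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / r).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    # pad the remainder symmetrically with a neutral grey
    out = np.full((new_shape[0], new_shape[1]) + img.shape[2:],
                  pad_value, dtype=img.dtype)
    top = (new_shape[0] - nh) // 2
    left = (new_shape[1] - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out, r, (left, top)

img = np.zeros((720, 1280, 3), dtype=np.uint8)
canvas, ratio, (dx, dy) = letterbox(img)
print(canvas.shape, ratio, (dx, dy))  # padded up to (384, 640, 3)
```

The returned ratio and padding offsets are what postprocessing needs to map predicted boxes back to the original image coordinates.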
Usage
Data Preparation
Recommended dataset structure:
HybridNets
└───datasets
├───imgs
│ ├───train
│ └───val
├───det_annot
│ ├───train
│ └───val
├───da_seg_annot
│ ├───train
│ └───val
└───ll_seg_annot
├───train
└───val
Update your dataset paths in projects/your_project_name.yml.
For BDD100K, download the images and annotations from the official BDD100K dataset and arrange them in the structure above.
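Before training, it can help to sanity-check the folder layout. A hypothetical helper (not part of the repo) that verifies the recommended structure:

```python
from pathlib import Path

# Expected top-level folders under datasets/, per the layout above.
EXPECTED = ["imgs", "det_annot", "da_seg_annot", "ll_seg_annot"]

def check_layout(root):
    """Return a list of missing train/val subfolders (empty = layout OK)."""
    root = Path(root)
    return [f"{d}/{s}" for d in EXPECTED for s in ("train", "val")
            if not (root / d / s).is_dir()]

# missing = check_layout("datasets")
# if missing: raise SystemExit(f"Missing folders: {missing}")
```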
Training
1) Edit or create a new project configuration, using bdd100k.yml as a template. Augmentation params are here.
# mean and std of dataset in RGB order
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
# bdd100k anchors
anchors_scales: '[2**0, 2**0.70, 2**1.32]'
anchors_ratios: '[(0.62, 1.58), (1.0, 1.0), (1.58, 0.62)]'
# BDD100K officially supports 10 classes
# obj_list: ['person', 'rider', 'car', 'truck', 'bus', 'train', 'motorcycle', 'bicycle', 'traffic light', 'traffic sign']
obj_list: ['car']
obj_combine: ['car', 'bus', 'truck', 'train'] # combine these classes into the single class in obj_list
# leave as an empty list ([]) to keep classes separate
seg_list: ['road',
           'lane']
seg_multilabel: false # set true to let a pixel belong to multiple labels (e.g. lane line + underlying road)
dataset:
  dataroot: path/to/imgs
  labelroot: path/to/det_annot
  segroot:
    # must be in the same order as seg_list
    - path/to/da_seg_annot
    - path/to/ll_seg_annot
  fliplr: 0.5
  flipud: 0.0
  hsv_h: 0.015
  hsv_s: 0.7
  hsv_v: 0.4
  ...
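autoanchor.py can regenerate anchors by k-means, but the `anchors_scales` and `anchors_ratios` above already expand into concrete anchor shapes in the usual EfficientDet convention. A sketch of that expansion, with an illustrative base size of 16 pixels (not a value from the repo):

```python
import itertools

def build_anchors(base_size, scales, ratios):
    """EfficientDet-style anchors: one (w, h) per scale x ratio pair."""
    return [(base_size * s * rw, base_size * s * rh)
            for s, (rw, rh) in itertools.product(scales, ratios)]

# Values from the bdd100k.yml snippet above.
scales = [2 ** 0, 2 ** 0.70, 2 ** 1.32]
ratios = [(0.62, 1.58), (1.0, 1.0), (1.58, 0.62)]
anchors = build_anchors(16, scales, ratios)
print(len(anchors))  # 3 scales x 3 ratios = 9 anchors per pyramid level
```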
2) Train
python train.py -p bdd100k        # your_project_name
                -c 3              # coefficient of the EfficientNet backbone; the paper's results use 3
                -bb repvgg_b0     # alternatively, swap in any timm backbone
                -n 4              # num_workers
                -b 8              # batch_size per GPU
                -w path/to/weight # use 'last' to resume training from the previous session
                --freeze_det      # freeze the detection head; also available: --freeze_backbone, --freeze_seg
                --lr 1e-5         # learning rate
                --optim adamw     # adamw | sgd
                --num_epochs 200
Please check python train.py --help for cheat codes.
~~IMPORTANT~~ (deprecated): If you want to train on multiple GPUs, use train_ddp.py. Tested on an NVIDIA DGX with 8xA100 40GB.
Why didn't we combine DDP into the already existing train.py script?
- Lots of if-else.
- Don't want to break functioning stuff.
- Lazy.
Update 24/06/2022: train_ddp.py broke because a lot of things have changed. We therefore decided to write a merged train.py with DDP support for easier maintenance. In the meantime, please clone this commit with a working train_ddp.py script if you really have to.
3) Evaluate
python val.py -w checkpoints/weight.pth
Again, check python val.py --help for god mode.
Validation process got killed! What do I do? => This is because we use a default confidence threshold of 0.001 to compare with other networks. When calculating metrics, the program has to handle a huge number of bounding boxes, which leads to out-of-memory and kills the process before the next epoch.
That said, there are several ways to work around this problem; choose whichever suits you best:
- Train on a high-RAM instance (RAM as in main memory, not GPU VRAM). For reference, we can only validate the combined car class with 64GB of RAM.
- Train with `python train.py --cal_map False` to skip metric calculation during validation. This option only prints validation losses. When the losses seem to flatten and the weather is nice, rent a high-RAM instance to validate the best weight with `python val.py -w checkpoints/xxx_best.pth`. We actually did this to save on cost.
- Increase the confidence threshold with `python train.py --conf_thres 0.5` or `python val.py --conf_thres 0.5`.
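To see why the 0.001 benchmark threshold strains memory, here is a synthetic simulation — the score distribution and counts are made up for illustration, not BDD100K statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated detector confidences for one image; raising a uniform sample
# to a power skews most scores toward low-confidence background noise.
scores = rng.random(100_000) ** 8

kept_strict = int((scores >= 0.5).sum())      # a practical threshold
kept_benchmark = int((scores >= 0.001).sum()) # the benchmark threshold
print(kept_strict, kept_benchmark)  # the 0.001 threshold keeps far more boxes
```

Every kept box must be stored and matched during metric calculation, so the near-zero threshold multiplies memory use accordingly.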
