HybridNets: End-to-End Perception Network
<div align="center">
HybridNets network architecture.
</div>
<!-- TABLE OF CONTENTS --> <details> <summary>Table of Contents</summary> <ol> <li> <a href="#about-the-project">About The Project</a> <ul> <li><a href="#project-structure">Project Structure</a></li> </ul> </li> <li> <a href="#getting-started">Getting Started</a> <ul> <li><a href="#installation">Installation</a></li> <li><a href="#demo">Demo</a></li> </ul> </li> <li> <a href="#usage">Usage</a> <ul> <li><a href="#data-preparation">Data Preparation</a></li> <li><a href="#training">Training</a></li> </ul> </li> <li><a href="#training-tips">Training Tips</a></li> <li><a href="#results">Results</a></li> <li><a href="#license">License</a></li> <li><a href="#acknowledgements">Acknowledgements</a></li> <li><a href="#citation">Citation</a></li> </ol> </details>
by Dat Vu, Bao Ngo, Hung Phan<sup> :email:</sup>, FPT University
(<sup>:email:</sup>) corresponding author.
arXiv technical report (arXiv 2203.09035)
About The Project
<!-- #### <div align=center> **HybridNets** = **real-time** :stopwatch: * **state-of-the-art** :1st_place_medal: * (traffic object detection + drivable area segmentation + lane line detection) :motorway: </div> -->
HybridNets is an end-to-end perception network for multiple tasks. Our work focuses on traffic object detection, drivable area segmentation and lane detection. HybridNets runs in real time on embedded systems and achieves state-of-the-art object detection and lane detection results on the BDD100K dataset.

Project Structure
HybridNets
│   backbone.py                  # Model configuration
│   export.py                    # UPDATED 10/2022: ONNX weight with accompanying .npy anchors
│   hubconf.py                   # PyTorch Hub entrypoint
│   hybridnets_test.py           # Image inference
│   hybridnets_test_videos.py    # Video inference
│   train.py                     # Train script
│   train_ddp.py                 # DistributedDataParallel training (multiple GPUs)
│   val.py                       # Validate script
│   val_ddp.py                   # DistributedDataParallel validating (multiple GPUs)
│
├───encoders                     # https://github.com/qubvel/segmentation_models.pytorch/tree/master/segmentation_models_pytorch/encoders
│       ...
│
├───hybridnets
│       autoanchor.py            # Generate new anchors by k-means
│       dataset.py               # BDD100K dataset
│       loss.py                  # Focal, tversky (dice)
│       model.py                 # Model blocks
│
├───projects
│       bdd100k.yml              # Project configuration
│
├───ros                          # C++ ROS package for path planning
│       ...
│
└───utils
        constants.py
        plot.py                  # Draw bounding box
        smp_metrics.py           # https://github.com/qubvel/segmentation_models.pytorch/blob/master/segmentation_models_pytorch/metrics/functional.py
        utils.py                 # Various helper functions (preprocess, postprocess, eval...)
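As the tree above notes, hybridnets/loss.py implements focal and Tversky (dice) losses. For intuition only, here is a minimal NumPy sketch of the Tversky index — not the repo's exact implementation, and the `alpha`/`beta` values are illustrative:

```python
import numpy as np

def tversky_index(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Tversky index between binary masks.

    alpha penalises false positives, beta false negatives;
    alpha = beta = 0.5 reduces to the Dice coefficient.
    """
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    tp = np.sum(pred * target)            # true positives
    fp = np.sum(pred * (1.0 - target))    # false positives
    fn = np.sum((1.0 - pred) * target)    # false negatives
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)

# A perfect prediction scores exactly 1.0
mask = np.array([[1, 0], [0, 1]])
print(tversky_index(mask, mask))
```

A Tversky *loss* is then simply `1 - tversky_index(...)`, optionally combined with a focal term as the filename suggests.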
Getting Started 
Installation
The project was developed with Python>=3.7 and PyTorch>=1.10.
git clone https://github.com/datvuthanh/HybridNets
cd HybridNets
pip install -r requirements.txt
Demo
# Download end-to-end weights
curl --create-dirs -L -o weights/hybridnets.pth https://github.com/datvuthanh/HybridNets/releases/download/v1.0/hybridnets.pth
# Image inference
python hybridnets_test.py -w weights/hybridnets.pth --source demo/image --output demo_result --imshow False --imwrite True
# Video inference
python hybridnets_test_videos.py -w weights/hybridnets.pth --source demo/video --output demo_result
# Result is saved in a new folder called demo_result
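hybridnets_test.py handles loading, preprocessing and postprocessing internally (see utils.py in the tree above). For intuition, the usual letterbox preprocessing for this kind of detector can be sketched as follows — the 384x640 shape and function name are illustrative, and a real pipeline would use cv2.resize instead of this nearest-neighbour stand-in:

```python
import numpy as np

def letterbox(img, new_shape=(384, 640), pad_value=114):
    """Resize keeping aspect ratio, then pad to the target shape."""
    h, w = img.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)   # scale factor
    nh, nw = int(round(h * r)), int(round(w * r))
    # nearest-neighbour resize via index sampling (cv2.resize in practice)
    ys = (np.arange(nh) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / r).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    # pad the remainder symmetrically with a neutral grey
    out = np.full((new_shape[0], new_shape[1]) + img.shape[2:],
                  pad_value, dtype=img.dtype)
    top = (new_shape[0] - nh) // 2
    left = (new_shape[1] - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out, r, (left, top)

img = np.zeros((720, 1280, 3), dtype=np.uint8)
canvas, ratio, (dx, dy) = letterbox(img)
print(canvas.shape, ratio, (dx, dy))  # padded up to (384, 640, 3)
```

The returned ratio and padding offsets are what postprocessing needs to map predicted boxes back to the original image coordinates.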
Usage
Data Preparation
Recommended dataset structure:
HybridNets
└───datasets
├───imgs
│ ├───train
│ └───val
├───det_annot
│ ├───train
│ └───val
├───da_seg_annot
│ ├───train
│ └───val
└───ll_seg_annot
├───train
└───val
Update your dataset paths in projects/your_project_name.yml.
For BDD100K, download the images and annotations from the official BDD100K dataset and arrange them in the structure above.
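Before training, it can help to sanity-check the folder layout. A hypothetical helper (not part of the repo) that verifies the recommended structure:

```python
from pathlib import Path

# Expected top-level folders under datasets/, per the layout above.
EXPECTED = ["imgs", "det_annot", "da_seg_annot", "ll_seg_annot"]

def check_layout(root):
    """Return a list of missing train/val subfolders (empty = layout OK)."""
    root = Path(root)
    return [f"{d}/{s}" for d in EXPECTED for s in ("train", "val")
            if not (root / d / s).is_dir()]

# missing = check_layout("datasets")
# if missing: raise SystemExit(f"Missing folders: {missing}")
```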
Training
1) Edit or create a new project configuration, using bdd100k.yml as a template. Augmentation params are here.
# mean and std of dataset in RGB order
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
# bdd100k anchors
anchors_scales: '[2**0, 2**0.70, 2**1.32]'
anchors_ratios: '[(0.62, 1.58), (1.0, 1.0), (1.58, 0.62)]'
# BDD100K officially supports 10 classes
# obj_list: ['person', 'rider', 'car', 'truck', 'bus', 'train', 'motorcycle', 'bicycle', 'traffic light', 'traffic sign']
obj_list: ['car']
obj_combine: ['car', 'bus', 'truck', 'train'] # combine these classes into the single class in obj_list
# leave as an empty list ([]) to keep classes separate
seg_list: ['road',
           'lane']
seg_multilabel: false # set true to let a pixel belong to multiple labels (e.g. lane line + underlying road)
dataset:
  dataroot: path/to/imgs
  labelroot: path/to/det_annot
  segroot:
    # must be in the same order as seg_list
    - path/to/da_seg_annot
    - path/to/ll_seg_annot
  fliplr: 0.5
  flipud: 0.0
  hsv_h: 0.015
  hsv_s: 0.7
  hsv_v: 0.4
  ...
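autoanchor.py can regenerate anchors by k-means, but the `anchors_scales` and `anchors_ratios` above already expand into concrete anchor shapes in the usual EfficientDet convention. A sketch of that expansion, with an illustrative base size of 16 pixels (not a value from the repo):

```python
import itertools

def build_anchors(base_size, scales, ratios):
    """EfficientDet-style anchors: one (w, h) per scale x ratio pair."""
    return [(base_size * s * rw, base_size * s * rh)
            for s, (rw, rh) in itertools.product(scales, ratios)]

# Values from the bdd100k.yml snippet above.
scales = [2 ** 0, 2 ** 0.70, 2 ** 1.32]
ratios = [(0.62, 1.58), (1.0, 1.0), (1.58, 0.62)]
anchors = build_anchors(16, scales, ratios)
print(len(anchors))  # 3 scales x 3 ratios = 9 anchors per pyramid level
```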
2) Train
python train.py -p bdd100k        # your_project_name
                -c 3              # coefficient of the EfficientNet backbone; the paper's results use 3
                -bb repvgg_b0     # alternatively, swap in any timm backbone
                -n 4              # num_workers
                -b 8              # batch_size per GPU
                -w path/to/weight # use 'last' to resume training from the previous session
                --freeze_det      # freeze the detection head; also available: --freeze_backbone, --freeze_seg
                --lr 1e-5         # learning rate
                --optim adamw     # adamw | sgd
                --num_epochs 200
Please check python train.py --help for cheat codes.
~~IMPORTANT~~ (deprecated): If you want to train on multiple GPUs, use train_ddp.py. Tested on an NVIDIA DGX with 8xA100 40GB.
Why didn't we combine DDP into the already existing train.py script?
- Lots of if-else.
- Don't want to break functioning stuff.
- Lazy.
Update 24/06/2022: train_ddp.py broke because a lot of things have changed. We therefore decided to write a merged train.py with DDP support for easier maintenance. In the meantime, please clone this commit with a working train_ddp.py script if you really have to.
3) Evaluate
python val.py -w checkpoints/weight.pth
Again, check python val.py --help for god mode.
Validation process got killed! What do I do? => This is because we use a default confidence threshold of 0.001 to compare with other networks. When calculating metrics, the program has to handle a huge number of bounding boxes, which leads to out-of-memory and kills the process before the next epoch.
That said, there are several ways to work around this problem; choose whichever suits you best:
- Train on a high-RAM instance (RAM as in main memory, not GPU VRAM). For reference, we can only validate the combined car class with 64GB of RAM.
- Train with `python train.py --cal_map False` to skip metric calculation during validation. This option only prints validation losses. When the losses seem to flatten and the weather is nice, rent a high-RAM instance to validate the best weight with `python val.py -w checkpoints/xxx_best.pth`. We actually did this to save on cost.
- Increase the confidence threshold with `python train.py --conf_thres 0.5` or `python val.py --conf_thres 0.5`.
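To see why the 0.001 benchmark threshold strains memory, here is a synthetic simulation — the score distribution and counts are made up for illustration, not BDD100K statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated detector confidences for one image; raising a uniform sample
# to a power skews most scores toward low-confidence background noise.
scores = rng.random(100_000) ** 8

kept_strict = int((scores >= 0.5).sum())      # a practical threshold
kept_benchmark = int((scores >= 0.001).sum()) # the benchmark threshold
print(kept_strict, kept_benchmark)  # the 0.001 threshold keeps far more boxes
```

Every kept box must be stored and matched during metric calculation, so the near-zero threshold multiplies memory use accordingly.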
