NWD

Official code for "A Normalized Gaussian Wasserstein Distance for Tiny Object Detection"


A Normalized Gaussian Wasserstein Distance for Tiny Object Detection

This is the official code for NWD. The extended version of the method was accepted by the ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J P & RS) in 2022.
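For orientation, here is a minimal sketch of the metric named in the title. It is not the implementation shipped in this repository. It assumes boxes are given as (cx, cy, w, h) and modeled as 2D Gaussians with mean (cx, cy) and covariance diag(w^2/4, h^2/4), for which the 2-Wasserstein distance has a closed form. The function name `nwd` and the default normalizing constant `c=12.8` are assumptions for illustration; the constant is dataset-dependent and should be treated as a placeholder.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    `c` is a dataset-dependent normalizing constant (assumed value here).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians that model
    # the boxes: centers plus half-widths and half-heights.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_squared) / c)


if __name__ == "__main__":
    # Two tiny, non-overlapping boxes still get a smooth, non-zero score,
    # whereas their IoU would be exactly 0.
    print(nwd((10, 10, 4, 4), (16, 10, 4, 4)))
```

Unlike IoU, this score keeps changing smoothly as two small boxes move apart, even once they no longer overlap.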

Installation

Requirements

Linux
Python 3.7 (Python 2 is not supported)
PyTorch 1.5 or higher
CUDA 10.1 or higher
NCCL 2
GCC(G++) 5.4 or higher
mmcv-nwd==1.3.5
cocoapi-aitod==12.0.3
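The "or higher" requirements above can be checked with a small version comparison. The following is a minimal sketch, not part of the repository; the helper names `parse_version` and `meets_minimum` are hypothetical, and the version strings are passed in by hand rather than read from the installed packages.

```python
def parse_version(text):
    """Turn '1.5.0+cu101' into (1, 5, 0), ignoring the local build tag."""
    core = text.split("+", 1)[0]
    parts = []
    for field in core.split("."):
        if not field.isdigit():
            break  # stop at the first non-numeric field, e.g. 'rc1'
        parts.append(int(field))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.5' and '1.5.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    minimums = {"PyTorch": "1.5", "CUDA": "10.1", "GCC": "5.4"}
    installed = {"PyTorch": "1.5.0+cu101", "CUDA": "10.1", "GCC": "5.5.0"}
    for name, minimum in minimums.items():
        print(name, meets_minimum(installed[name], minimum))
```

In practice the installed strings would come from the environment, for example `torch.__version__` for PyTorch.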

We have tested the following OS and software versions:

OS: Ubuntu 16.04
GPU: TITAN X
CUDA: 10.1
GCC(G++): 5.5.0
PyTorch: 1.5.0+cu101
TorchVision: 0.6.0+cu101
MMCV: 1.3.5
MMDetection: 2.13.0

Install

a. Create a conda virtual environment and activate it.

conda create -n nwd python=3.7 -y
conda activate nwd

b. Install PyTorch stable or nightly and torchvision following the official instructions, e.g.,

pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

c. Install MMCV-NWD

git clone https://github.com/jwwangchn/mmcv-nwd.git
cd mmcv-nwd
MMCV_WITH_OPS=1 pip install -e .  # package mmcv-full will be installed after this step
cd ../

d. Install COCOAPI-AITOD for evaluating on the AI-TOD dataset

pip install "git+https://github.com/jwwangchn/cocoapi-aitod.git#subdirectory=aitodpycocotools"

e. Install NWD

git clone https://github.com/jwwangchn/NWD.git
cd NWD
# optional
pip install -r requirements.txt

python setup.py develop
# or "pip install -v -e ."

Prepare datasets

Please refer to AI-TOD for the AI-TOD dataset.

It is recommended to symlink the dataset root to $NWD/data. If your folder structure is different, you may need to change the corresponding paths in the config files (configs/_base_/datasets/aitod_detection.py).

NWD
├── mmdet
├── tools
├── configs
├── data
│   ├── AI-TOD
│   │   ├── annotations
│   │   │    │─── aitod_training_v1.json
│   │   │    │─── aitod_validation_v1.json
│   │   ├── trainval
│   │   │    │─── ***.png
│   │   │    │─── ***.png
│   │   ├── test
│   │   │    │─── ***.png
│   │   │    │─── ***.png
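Before training, it can help to confirm that the layout above is in place. The sketch below is not part of the repository; the function name `missing_entries` is hypothetical, and the default root `data/AI-TOD` assumes the script is run from the NWD directory.

```python
from pathlib import Path

# Entries taken from the folder structure shown above, relative to data/AI-TOD.
EXPECTED = [
    "annotations/aitod_training_v1.json",
    "annotations/aitod_validation_v1.json",
    "trainval",
    "test",
]


def missing_entries(dataset_root="data/AI-TOD"):
    """Return the expected files or folders that do not exist under the root."""
    root = Path(dataset_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under data/AI-TOD:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Because `Path.exists` follows symlinks, the check also works when data/AI-TOD is a symlink to the real dataset location.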

Run

The NWD config files are in configs/nwd.

For beginners, please see the full MMDetection tutorials on working with existing datasets.

Training on a single GPU

The basic usage is as follows (e.g., training Faster R-CNN with NWD). Note that the config file sets lr=0.01; for training on a single GPU, change it to lr=0.01/4 (i.e., 0.0025).

python tools/train.py configs/nwd/faster_rcnn_r50_aitod_rpn_nwd.py
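The lr adjustment mentioned above matches the common linear scaling rule, where the learning rate is kept proportional to the total batch size. The sketch below assumes that lr=0.01 in the config was tuned for 4 GPUs with the same number of samples per GPU; that reference count is inferred from the 0.01/4 note, not confirmed by the config. The function name `scale_lr` is hypothetical.

```python
def scale_lr(base_lr, base_gpus, gpus):
    """Scale a learning rate linearly with the number of GPUs.

    Assumes the per-GPU batch size stays the same, so the total batch
    size changes by the factor gpus / base_gpus.
    """
    if base_gpus <= 0 or gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * gpus / base_gpus


if __name__ == "__main__":
    # Config value assumed to target 4 GPUs; single-GPU training uses a quarter.
    print(scale_lr(0.01, base_gpus=4, gpus=1))
    # Training on 2 GPUs would use half of the config value.
    print(scale_lr(0.01, base_gpus=4, gpus=2))
```

The resulting value can be written into the optimizer settings of the config file. Upstream MMDetection 2.x also accepts overrides such as `--cfg-options optimizer.lr=0.0025` on the command line; whether this repository's tools/train.py keeps that flag is an assumption.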

Training on multiple GPUs

The basic usage is as follows (e.g. train Faster R-CNN with NWD).

bash ./tools/dist_train.sh configs/nwd/faster_rcnn_r50_aitod_rpn_nwd.py 4

Inference

Benchmark

The benchmark and trained models will be publicly available soon.

Citation

@inproceedings{AI-TOD_2020_ICPR,
    title={Tiny Object Detection in Aerial Images},
    author={Wang, Jinwang and Yang, Wen and Guo, Haowen and Zhang, Ruixiang and Xia, Gui-Song},
    booktitle={ICPR},
    pages={3791--3798},
    year={2021},
}

@article{NWD_2021_arXiv,
    title={A Normalized Gaussian Wasserstein Distance for Tiny Object Detection},
    author={Wang, Jinwang and Xu, Chang and Yang, Wen and Yu, Lei},
    journal={arXiv preprint arXiv:2110.13389},
    year={2021}
}

@article{NWD_RKA_2022_ISPRSJ,
    title={Detecting Tiny Objects in Aerial Images: A Normalized Wasserstein Distance and A New Benchmark},
    author={Xu, Chang and Wang, Jinwang and Yang, Wen and Yu, Huai and Yu, Lei and Xia, Gui-Song},
    journal={ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J P \& RS)},
    year={2022}
}
