DCNet
Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection, CVPR 2021
Our code is based on https://github.com/facebookresearch/maskrcnn-benchmark and developed with Python 3.6.5 & PyTorch 1.1.0.
Abstract
Conventional deep learning based methods for object detection require a large number of bounding box annotations for training, and such high-quality annotated data is expensive to obtain. Few-shot object detection, which learns to adapt to novel classes with only a few annotated examples, is very challenging because the fine-grained features of novel objects can easily be overlooked when only a few examples are available. In this work, aiming to fully exploit the features of annotated novel objects and capture fine-grained features of the query object, we propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem. Built on a meta-learning based framework, the Dense Relation Distillation module aims to fully exploit support features: support features and query features are densely matched, covering all spatial locations in a feed-forward fashion. This abundant use of guidance information gives the model the capability to handle common challenges such as appearance changes and occlusions. Moreover, to better capture scale-aware features, the Context-aware Aggregation module adaptively harnesses features from different scales for a more comprehensive feature representation. Extensive experiments show that the proposed approach achieves state-of-the-art results on the PASCAL VOC and MS COCO datasets. For more details, please refer to our CVPR paper (arxiv).
Installation
Check INSTALL.md for installation instructions. Since maskrcnn-benchmark has been deprecated, please follow these instructions carefully, in particular the required versions of the Python packages.
Prepare datasets
Prepare original Pascal VOC & MS COCO datasets
First, download the VOC and COCO datasets.
For COCO, we use the minival and valminusminival sets from Detectron (filelink).
We recommend symlinking the dataset paths to datasets/ as follows:
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2014 datasets/coco/train2014
ln -s /path_to_coco_dataset/test2014 datasets/coco/test2014
ln -s /path_to_coco_dataset/val2014 datasets/coco/val2014
ln -s /path_to_VOCdevkit_dir datasets/voc
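Before generating the few-shot datasets, it can help to confirm that every link created above resolves. The following is a minimal sketch, not part of the DCNet repository: the function name `check_dataset_links` is hypothetical, and it only assumes the five paths used in the `ln -s` commands above.

```shell
# Hypothetical helper (not shipped with DCNet): report which of the dataset
# paths linked above are missing or dangling under a given root directory.
check_dataset_links() {
    root="${1:-datasets}"
    status=0
    for path in coco/annotations coco/train2014 coco/test2014 coco/val2014 voc; do
        # [ -e ] follows symlinks, so a link to a nonexistent target also fails.
        if [ ! -e "$root/$path" ]; then
            echo "missing: $root/$path"
            status=1
        fi
    done
    return "$status"
}
```

From the repository root, `check_dataset_links datasets` prints one line per unresolved path and returns a non-zero status if any is missing.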
Prepare base and few-shot datasets
For multiple runs, you need to specify the seed in the script.
bash tools/fewshot_exp/datasets/init_fs_dataset_standard.sh
This will also generate the datasets on base classes for base training.
Training and Evaluation
Scripts for training and evaluation on the PASCAL VOC dataset:
experiments/DRD/
├── prepare_dataset.sh
├── base_train.sh
├── fine_tune.sh
└── get_result.sh
Configurations of base & few-shot experiments are:
experiments/DRD/configs/
├── base
│   └── e2e_voc_split*_base.yaml
└── standard
    └── e2e_voc_split*_*shot_finetune.yaml
Modify them if needed. If you have any questions about these parameters (e.g. batch size), please refer to maskrcnn-benchmark for quick solutions.
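To see which configuration files the two wildcard patterns above actually match in a checkout, a small listing function can be used. This is a sketch under assumptions: `list_voc_configs` is a hypothetical name, not a script from the repository, and it relies only on the directory layout and file patterns shown above.

```shell
# Hypothetical helper (not shipped with DCNet): print every config file that
# matches the base and few-shot fine-tuning patterns shown in the tree above.
list_voc_configs() {
    dir="${1:-experiments/DRD/configs}"
    for f in "$dir"/base/e2e_voc_split*_base.yaml \
             "$dir"/standard/e2e_voc_split*_*shot_finetune.yaml; do
        # An unmatched glob stays a literal string, so skip anything that
        # does not exist on disk.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
}
```

Run `list_voc_configs` from the repository root, or pass another configs directory as the first argument, to review the files before editing them.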
Perform few-shot training on VOC dataset
- Run the following for base training on the 3 VOC splits
cd experiments/DRD
bash base_train.sh
This will generate base models (e.g. model_voc_split1_base.pth) and corresponding pre-trained models (e.g. voc0712_split1base_pretrained.pth).
- Run the following for few-shot fine-tuning
bash fine_tune.sh
This will run evaluation for the 1/2/3/5/10-shot settings on all 3 splits.
The result folder is fs_exp/voc_standard_results by default, and you can get a quick summary with:
bash get_result.sh
Citation
@inproceedings{hu2021dense,
title={Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection},
author={Hu, Hanzhe and Bai, Shuai and Li, Aoxue and Cui, Jinshi and Wang, Liwei},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={10185--10194},
year={2021}
}
TODO
- [ ] Context-aware Aggregation
- [ ] Training scripts on COCO dataset
