RankSortLoss
Official PyTorch Implementation of Rank & Sort Loss for Object Detection and Instance Segmentation [ICCV2021]
Rank & Sort Loss for Object Detection and Instance Segmentation
The official implementation of Rank & Sort Loss. Our implementation is based on mmdetection.
Rank & Sort Loss for Object Detection and Instance Segmentation,
Kemal Oksuz, Baris Can Cam, Emre Akbas, Sinan Kalkan, ICCV 2021 (Oral Presentation). (arXiv pre-print)
Summary
What is Rank & Sort (RS) Loss? Rank & Sort (RS) Loss supervises object detectors and instance segmentation methods to (i) rank the scores of the positive anchors above those of negative anchors, and at the same time (ii) sort the scores of the positive anchors with respect to their localisation qualities.
<p align="center"> <img src="assets/Teaser.png" width="600"> </p>

Benefits of RS Loss on Simplification of Training. With RS Loss, we significantly simplify training: (i) thanks to our sorting objective, the positives are prioritized by the classifier without an additional auxiliary head (e.g. for centerness, IoU, mask-IoU), (ii) due to its ranking-based nature, RS Loss is robust to class imbalance, and thus no sampling heuristic is required, and (iii) we address the multi-task nature of visual detectors using tuning-free task-balancing coefficients.
<p align="center"> <img src="assets/Architecture.png" width="600"> </p>

Benefits of RS Loss on Improving Performance. Using RS Loss, we train seven diverse visual detectors only by tuning the learning rate, and show that it consistently outperforms baselines: e.g. our RS Loss improves (i) Faster R-CNN by ~3 box AP and aLRP Loss (ranking-based baseline) by ~2 box AP on COCO dataset, (ii) Mask R-CNN with repeat factor sampling by 3.5 mask AP (~7 AP for rare classes) on LVIS dataset.
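To make the two objectives from the summary concrete, the following is a minimal, illustrative sketch in plain Python. It is not the implementation in this repository: it uses a hard score comparison for simplicity, does not compute gradients, and the function name `rank_and_sort_errors` and the toy inputs are hypothetical. It assumes a label of 0 marks a negative and a label in (0, 1] is the localisation quality (e.g. IoU) of a positive.

```python
def rank_and_sort_errors(scores, labels):
    """Per-positive ranking error and sorting-error gap (illustrative only).

    scores: predicted confidence per anchor.
    labels: 0 for negatives, localisation quality in (0, 1] for positives.
    Returns a list of (index, ranking_error, sorting_gap) tuples.
    """
    positives = [i for i, y in enumerate(labels) if y > 0]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    results = []
    for i in positives:
        # Anchors scored at least as high as positive i (i itself included).
        pos_above = [j for j in positives if scores[j] >= scores[i]]
        neg_above = [j for j in negatives if scores[j] >= scores[i]]

        # (i) Ranking: fraction of negatives among the anchors ranked
        # at or above positive i; zero when no negative outranks it.
        ranking_error = len(neg_above) / (len(pos_above) + len(neg_above))

        # (ii) Sorting: average (1 - quality) of the positives ranked at
        # or above i, compared with the same average restricted to those
        # whose quality is at least as high as that of i.
        current = sum(1 - labels[j] for j in pos_above) / len(pos_above)
        better = [j for j in pos_above if labels[j] >= labels[i]]
        target = sum(1 - labels[j] for j in better) / len(better)

        results.append((i, ranking_error, current - target))
    return results


if __name__ == "__main__":
    # Toy example: the best-localised positive (index 2) is scored below
    # a worse positive (index 0) and a negative (index 1).
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [0.5, 0.0, 0.9, 0.0]
    for idx, rank_err, sort_gap in rank_and_sort_errors(scores, labels):
        print(idx, round(rank_err, 3), round(sort_gap, 3))
```

In this sketch both quantities are zero for every positive when all positives are scored above all negatives and in decreasing order of localisation quality, which is the ordering the two objectives aim for.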
How to Cite
Please cite the paper if you benefit from our paper or the repository:
@inproceedings{RSLoss,
title = {Rank & Sort Loss for Object Detection and Instance Segmentation},
author = {Kemal Oksuz and Baris Can Cam and Emre Akbas and Sinan Kalkan},
booktitle = {International Conference on Computer Vision (ICCV)},
year = {2021}
}
Specification of Dependencies and Preparation
- Please see get_started.md for requirements and installation of mmdetection.
- Please refer to introduction.md for dataset preparation and basic usage of mmdetection.
Trained Models
Here, we report minival results in terms of AP and oLRP.
Multi-stage Object Detection
RS-R-CNN
| Backbone | Epoch | Carafe | MS train | box AP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :-----: | :------------: | :------------: | :------------: | :-------: | :-------: | :-------: |
| ResNet-50 | 12 | | | 39.6 | 67.9 | log | config | model |
| ResNet-50 | 12 | + | | 40.8 | 66.9 | log | config | model |
| ResNet-101-DCN | 36 | | [480,960] | 47.6 | 61.1 | log | config | model |
| ResNet-101-DCN | 36 | + | [480,960] | 47.7 | 60.9 | log | config | model |
RS-Cascade R-CNN
| Backbone | Epoch | box AP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :------------: | :------------: | :-------: | :-------: | :-------: |
| ResNet-50 | 12 | 41.3 | 66.6 | log | config | model |
One-stage Object Detection
| Method | Backbone | Epoch | box AP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :-----: | :------------: | :------------: | :-------: | :-------: | :-------: |
| RS-ATSS | ResNet-50 | 12 | 39.9 | 67.9 | log | config | model |
| RS-PAA | ResNet-50 | 12 | 41.0 | 67.3 | log | config | model |
Multi-stage Instance Segmentation
RS-Mask R-CNN on COCO Dataset
| Backbone | Epoch | Carafe | MS train | mask AP | box AP | mask oLRP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :-----: | :------------: | :------------: | :------------: | :------------: | :------------: | :-------: | :-------: | :-------: |
| ResNet-50 | 12 | | | 36.4 | 40.0 | 70.1 | 67.5 | log | config | model |
| ResNet-50 | 12 | + | | 37.3 | 41.1 | 69.4 | 66.6 | log | config | model |
| ResNet-101 | 36 | | [640,800] | 40.3 | 44.7 | 66.9 | 63.7 | log | config | model |
| ResNet-101 | 36 | + | [480,960] | 41.5 | 46.2 | 65.9 | 62.6 | log | config | model |
| ResNet-101-DCN | 36 | + | [480,960] | 43.6 | 48.8 | 64.0 | 60.2 | log | config | model |
| ResNeXt-101-DCN | 36 | + | [480,960] | 44.4 | 49.9 | 63.1 | 59.1 | Coming Soon | config | model |
RS-Mask R-CNN on LVIS Dataset
| Backbone | Epoch | MS train | mask AP | box AP | mask oLRP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :------------: | :------------: | :------------: | :------------: | :------------: | :-------: | :-------: | :-------: |
| ResNet-50 | 12 | [640,800] | 25.2 | 25.9 | Coming Soon | Coming Soon | log | config | model |
One-stage Instance Segmentation
RS-YOLACT
| Backbone | Epoch | mask AP | box AP | mask oLRP | box oLRP | Log | Config | Model |
| :-------------: | :-----: | :------------: | :------------: | :------------: | :------------: | :-------: | :-------: | :-------: |
| ResNet-50 | 55 | 29.9 | 33.8 | 74.7 | 71.8 | log | config | model |
RS-SOLOv2
The implementation of Rank & Sort Loss on SOLOv2 is released in a separate repository due to a difference in mmdetection versions. You can check out our RS-SOLOv2 implementation in this repository. Any pull request to incorporate RS-SOLOv2 into this repository is highly appreciated.
| Backbone | Epoch | mask AP | mask oLRP |
| :---------: | :-----: | :------------: | :------------: |
| ResNet-34 | 36 | 32.6 | 72.7 |
| ResNet-101 | 36 | 39.7 | 66.9 |
Running the Code
Training Code
The configuration files of all models listed above can be found in the configs/ranksort_loss folder. You can follow get_started.md for training code. As an example, to train Faster R-CNN with our RS Loss on 4 GPUs as we did, use the following
