GroupDETR

[ICCV 2023] Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Group DETR

This repository is an official implementation of the ICCV 2023 paper "Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment".

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Qiang Chen*, Xiaokang Chen*, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang

Baidu VIS, Peking University, Australian National University

TODO

[ ] Update the arXiv paper
[ ] Release code and models

Introduction

In this paper, we present Group DETR, a simple yet efficient DETR training approach that performs one-to-many assignment in a group-wise manner. It uses multiple groups of object queries, conducts one-to-one assignment within each group, and performs decoder self-attention separately for each group. This resembles data augmentation with automatically learned object query augmentation, and is equivalent to simultaneously training parameter-sharing networks of the same architecture, which introduces more supervision and thus improves DETR training. Inference is the same as for a normally trained DETR: it needs only one group of queries and no architecture modification. Group DETR is versatile and applies to various DETR variants. Experiments show that Group DETR significantly speeds up training convergence and improves the performance of various DETR-based applications.
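The group-wise scheme described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the repository's code: the function names (`group_attention_mask`, `one_to_one_match`, `group_wise_match`), the toy cost matrix, and the convention that `True` in the mask means "attention blocked" are all hypothetical, and a brute-force search stands in for the Hungarian matching that DETR uses.

```python
from itertools import permutations


def group_attention_mask(num_groups, queries_per_group):
    """Build a block-diagonal self-attention mask.

    Entry [i][j] is True when query i must NOT attend to query j,
    i.e. when the two queries belong to different groups.
    (The True-means-blocked convention is an assumption of this sketch.)
    """
    total = num_groups * queries_per_group
    return [
        [(i // queries_per_group) != (j // queries_per_group) for j in range(total)]
        for i in range(total)
    ]


def one_to_one_match(cost):
    """Minimum-cost one-to-one assignment by brute force.

    cost[q][t] is the cost of assigning query q to target t.
    Brute force is only practical for toy sizes; it stands in for
    Hungarian matching here.
    Returns a list of (query_index, target_index) pairs.
    """
    num_queries = len(cost)
    num_targets = len(cost[0]) if cost else 0
    if num_targets > num_queries:
        raise ValueError("need at least as many queries as targets")
    best, best_total = (), float("inf")
    # Try every way of picking one distinct query per target.
    for queries in permutations(range(num_queries), num_targets):
        total = sum(cost[q][t] for t, q in enumerate(queries))
        if total < best_total:
            best, best_total = queries, total
    return [(q, t) for t, q in enumerate(best)]


def group_wise_match(cost, num_groups, queries_per_group):
    """Run one-to-one matching independently inside each query group.

    Every target is matched once per group, so across all groups the
    assignment is one-to-many.
    """
    matches = []
    for g in range(num_groups):
        start = g * queries_per_group
        group_cost = cost[start:start + queries_per_group]
        for q, t in one_to_one_match(group_cost):
            matches.append((start + q, t))  # convert to global query index
    return matches


# Toy example: 2 groups of 2 queries each, 2 ground-truth targets.
cost = [
    [0.1, 0.9],  # group 0, query 0
    [0.8, 0.2],  # group 0, query 1
    [0.7, 0.3],  # group 1, query 2
    [0.4, 0.6],  # group 1, query 3
]
mask = group_attention_mask(num_groups=2, queries_per_group=2)
matches = group_wise_match(cost, num_groups=2, queries_per_group=2)
print(sorted(matches))
```

In this toy run each of the two targets is matched by one query in group 0 and one query in group 1, which is the extra supervision the paragraph describes. At inference only one group of queries would be kept, so neither the mask nor the additional groups are needed.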

Citation

If you use Group DETR in your research or wish to refer to the baseline results published here, please use the following BibTeX entry.

@inproceedings{chen2023group,
  title={Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment},
  author={Chen, Qiang and Chen, Xiaokang and Wang, Jian and Zhang, Shan and Yao, Kun and Feng, Haocheng and Han, Junyu and Ding, Errui and Zeng, Gang and Wang, Jingdong},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year={2023}
}
