<div style="text-align: center; margin: 10px"> <h1> ⭐ TGN: Text-Guided Diverse Image Synthesis for Long-Tailed Remote Sensing Object Classification </h1> </div> <p align="center"> <a href="https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=36"> <img alt="Static Badge" src="https://img.shields.io/badge/TGRS-blue?logo=ieee&labelColor=blue&color=blue"> </a> <a href="https://ieeexplore.ieee.org/document/10582893"> <img alt="Static Badge" src="https://img.shields.io/badge/Paper-openproject.svg?logo=openproject&color=%23B31B1B"> </a> <a href=""><img src="https://img.shields.io/badge/python-3.8+-aff.svg"></a> <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win-pink.svg"></a> <img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/XinR-Tang/TGN"> <a href="mailto: tanghaojun_cam@163.com"> <img alt="Static Badge" src="https://img.shields.io/badge/contact_me-email-yellow"> </a> </p>

🌋 Notes
This is the official implementation for our <span style='color: #EB5353;font-weight:bold'>TGRS 2024</span> paper "Text-Guided Diverse Image Synthesis for Long-Tailed Remote Sensing Object Classification". You can quickly implement our work through this project. If you have any questions, please contact us!

💡 Introduction
TGN comprises two main components: a knowledge mutual distillation network (KMDN) and a class-consistent diverse tail-class generation network (CDTG). KMDN resolves the isolation of head and tail knowledge by facilitating mutual learning of feature representations between the head and tail data, thereby improving the feature-extraction capability of the tail model. CDTG focuses on generating class-consistent, diverse tail-class images using the tail-class features extracted by KMDN. In particular, class consistency is guaranteed by CLIP's powerful text-image alignment capability. The generated images are then added back into the original dataset to alleviate the long-tailed distribution, thereby improving tail-class accuracy.
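The repository's training scripts define the exact objectives. Purely as a hypothetical illustration (not the paper's implementation), "mutual learning of feature representations" between a head and a tail model is commonly expressed as a symmetric KL divergence between the two models' predicted class distributions:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_distillation_loss(head_logits, tail_logits):
    """Hypothetical sketch of a mutual-distillation term: symmetric KL
    between the head model's and tail model's class distributions."""
    p = softmax(head_logits)
    q = softmax(tail_logits)
    return kl(p, q) + kl(q, p)
```

In such a scheme, minimizing this term pulls the two models' predictions toward each other, which is one standard way to let the tail model benefit from head-class knowledge; the actual KMDN losses are defined in `KMDN/Distill.py`.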

🚀 Quick start
📍 Install
pip install -r requirements.txt
🏕️ Preparing the dataset
We conduct experiments on three remote sensing datasets: DIOR, FGSC-23 and DOTA.
- DIOR: contains 192,465 images from 20 categories; 68,025 samples for training and 124,440 for testing.
- FGSC-23: contains 4,081 images from 23 categories; 3,256 samples for training and 825 for testing.
- DOTA: contains 127,759 images from 15 categories; 98,906 samples for training and 28,853 for testing.
You can download the preprocessed datasets from Datasets; Extract code: wo4j
- Make sure your project is structured as follows:
├── CDTG
│ ├── checkpoint
│ ├── lpips
│ | ...
│
├── Classification
│ ├── Datasets.py
│ ├── model_finetune.py
│ ├── test.py
│ ├── train.py
│ ├── Utils.py
│
├── KMDN
│ ├── dataset_split.py
│ ├── Datasets.py
│ ├── Distill.py
│ ├── ....
│
├── dior
├── DOTA
├── FGSC-23
├── README.md
├── requirements.txt
- If you want to use your own dataset, make sure the dataset has the same structure as follows:
├── dior
│ ├── anno
│ │ ├── DIOR_train.txt
│ │ ├── DIOR_test.txt
│ │
│ ├── train
│ │ ├── 0
│ │ │ ├── 00008_0.jpg
│ │ │ ├── ...
│ │ │
│ │ ├── 1
│ │ ├── ...
│ │
│ ├── test
│ │ ├── 0
│ │ │ ├── 11726_0.jpg
│ │ │ ├── ...
│ │ │
│ │ ├── 1
│ │ ├── ...
- Before starting, run the following command to split the dataset into head and tail subsets:
cd KMDN
python3 dataset_split.py
Note: dataset_split.py is included in both KMDN and CDTG, but they perform different functions!
🔥 Preparing the pre-trained weights
We provide pre-trained weights so you can quickly reproduce our work.
- result.pth: weights of our pre-trained classification network, trained on a dataset with 7,000 generated samples added for each tail class. Baidu Netdisk; Extract code: ii10
- 200000.pt: weights for generating tail-class images. Baidu Netdisk; Extract code: 9h1w
🏕️ Testing
You can quickly reproduce our results with the following command (result.pth should be placed at "./Classification/save_model/result.pth"):
cd Classification
python3 test.py
🦄 Train and Evaluation
🔥 KMDN
- First, train the head and tail models separately using the following command:
cd KMDN
python3 dataset_split.py
python3 UModel.py
python3 RModel.py
- Then run the following command to perform knowledge mutual distillation:
python3 Distill.py
🔥 CDTG
- The dataset structure required by CDTG is as follows:
├── dior
│ ├── train
│ │ ├── tail
│ │ │ ├── 0
│ │ │ │ ├── 00008_0.jpg
│ │ │ │ ├── ...
│ │ │
│ │ │ ├── 1
│ │ │ ├── ...
- You can run the following command to split the dataset into this structure automatically:
cd CDTG
python3 dataset_split.py
If you need to use your own dataset, make sure it has the same structure.
- Run the following command to train the CDTG:
python3 train.py --ckpt checkpoint/your_model_path
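CDTG's class-consistency guidance relies on CLIP's text-image alignment, which at its core is a cosine similarity between an image embedding and a text embedding. A minimal conceptual sketch (the real training uses CLIP's encoders; here the embeddings are assumed to be given as plain vectors):

```python
import math

def cosine_similarity(image_emb, text_emb):
    """CLIP-style alignment score between an image embedding and a
    text embedding, both plain numeric vectors (hypothetical sketch)."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm_i = math.sqrt(sum(a * a for a in image_emb))
    norm_t = math.sqrt(sum(b * b for b in text_emb))
    return dot / (norm_i * norm_t)
```

A higher score means the generated image better matches the class's text description, which is how a text-guided consistency constraint can steer generation toward the intended tail class.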
🔥 Generation
- Finally, you can run the following command to generate different classes of tail images:
cd CDTG
python3 generation.py --ckpt checkpoint/your_model_path --folder_number 0 --r 0.05
- `folder_number`: the class label of the images to generate.
- `r`: the proportion of noise.

`your_model_path` can be replaced with the provided `200000.pt`.
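The meaning of `--r` is defined in `generation.py`. Purely as a hypothetical illustration of a "noise proportion" parameter, one common pattern is to blend a class feature vector with Gaussian noise, trading class fidelity for diversity:

```python
import random

def perturb_latent(latent, r, rng=random):
    """Hypothetical sketch (not CDTG's actual code): mix a tail-class
    feature vector with Gaussian noise. r = 0 keeps the class feature
    unchanged; larger r yields more diverse but less faithful samples."""
    return [(1.0 - r) * z + r * rng.gauss(0.0, 1.0) for z in latent]
```

Under this reading, a small value such as `--r 0.05` would inject only light noise, preserving class identity while still varying the generated images.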
🔥 Evaluation
Use the following command to train a classifier on the expanded dataset:
cd Classification
python3 train.py
Citation
If you find this work useful in your research, please cite our paper:
@ARTICLE{tgn,
author={Tang, Haojun and Zhao, Wenda and Hu, Guang and Xiao, Yi and Li, Yunlong and Wang, Haipeng},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={Text-Guided Diverse Image Synthesis for Long-Tailed Remote Sensing Object Classification},
year={2024},
volume={},
number={},
pages={1-1},
keywords={Long-tailed remote sensing object classification;Knowledge mutual distillation;Class-consistent diverse image synthesis},
doi={10.1109/TGRS.2024.3422095}}
Acknowledgements
- This repository is built upon SatConcepts.
- We thank the authors of these open-source repositories for their efforts, and the ACs and reviewers for their work on our paper.