PFGF

[CVPR'25] Official implementation of the paper "Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection"

Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection (CVPR-25)

Environmental Requirements

We use YOLOX as implemented in MMDetection. Please refer to MMDetection for environment setup instructions.

Our development environment includes the following dependencies:

python==3.9
torch==1.12.1+cu116
torchvision==0.13.1+cu116
mmcv-full==1.7.2
mmdet==2.26.0
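
One way to confirm that an existing environment matches the pins above is to compare them against the installed package metadata. The following is a minimal sketch and not part of this repository: `PINNED` and `compare_versions` are hypothetical names, and it assumes the installed distribution names match the package names listed above. The Python version itself is not checked.

```python
from importlib import metadata

# Pins copied from the dependency list above (assumed to be pip distribution names).
PINNED = {
    "torch": "1.12.1+cu116",
    "torchvision": "0.13.1+cu116",
    "mmcv-full": "1.7.2",
    "mmdet": "2.26.0",
}


def compare_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in compare_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script prints one line per mismatch and prints nothing when every pinned package is installed at the listed version.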

Additionally, install the official Mamba library by following the instructions in the hustvl/Vim repository. After installation, replace the mamba_simple.py file in the installation directory with the version available in the mamba block directory of the Pan-Mamba repository.

About the Code

This repository contains only the modifications made to the MMDetection codebase. For example:

Add the code in mmdetection/mmdet/datasets/FLIR.py to your MMDetection.
Ensure all newly added classes are registered in __init__.py.
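
After copying files over, it can help to check that each new class is both imported and listed in `__all__` in the relevant `__init__.py`, since in MMDetection 2.x a class is only registered once its module is imported. The sketch below is an illustrative helper, not part of this repository or of MMDetection: `inspect_init` and `missing_from_init` are hypothetical names, and `FLIRDataset` is an assumed class name used only for the example.

```python
import ast


def inspect_init(init_source):
    """Return (imported, exported) name sets for the text of an __init__.py.

    `imported` collects names from `from ... import ...` statements;
    `exported` collects entries of a literal module-level `__all__`.
    """
    imported, exported = set(), set()
    for node in ast.parse(init_source).body:
        if isinstance(node, ast.ImportFrom):
            imported.update(alias.asname or alias.name for alias in node.names)
        elif isinstance(node, ast.Assign):
            is_all = any(
                isinstance(target, ast.Name) and target.id == "__all__"
                for target in node.targets
            )
            if is_all:
                # Raises ValueError if __all__ is not a plain literal list/tuple.
                exported.update(ast.literal_eval(node.value))
    return imported, exported


def missing_from_init(init_source, required):
    """Return the required names that are not both imported and in __all__."""
    imported, exported = inspect_init(init_source)
    return sorted(
        name for name in required if name not in imported or name not in exported
    )


if __name__ == "__main__":
    # Hypothetical __init__.py content: the class is imported but not exported.
    example = (
        "from .coco import CocoDataset\n"
        "from .FLIR import FLIRDataset\n"
        "__all__ = ['CocoDataset']\n"
    )
    print(missing_from_init(example, ["CocoDataset", "FLIRDataset"]))
```

To use it on a real checkout, read the file text (for example `mmdetection/mmdet/datasets/__init__.py`) and pass it as `init_source` along with the class names you added.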

Dataset and Models

Datasets and model checkpoints can be downloaded from this cloud link, with the extraction code: PFGF.
Download the Pearl-GAN pretrained weights from https://github.com/FuyaLuo/PearlGAN/. Place them into configs/graphmamba/pearlgan_ckpt/FLIR_NTIR2DC/.
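
Before running inference or training, a short check that the downloaded files are where the commands in this README expect them can save a failed launch. This is a minimal sketch under the assumption that all paths are relative to the MMDetection root directory; `EXPECTED` and `missing_paths` are hypothetical names, and the paths are the ones mentioned in this README.

```python
from pathlib import Path

# Paths referenced in this README, assumed relative to the MMDetection root.
EXPECTED = [
    "configs/graphmamba/pearlgan_ckpt/FLIR_NTIR2DC",
    "configs/graphmamba/yolox_l_tirgraphmamba_1x8_200e_FLIR_r.py",
    "work_dirs/flir.pth",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [path for path in expected if not (root / path).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print(f"missing: {path}")
```

The check only tests for existence; it does not verify that the Pearl-GAN directory contains the right weight files.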

Inference

To evaluate the FLIR dataset, run the following command:

python tools/test.py configs/graphmamba/yolox_l_tirgraphmamba_1x8_200e_FLIR_r.py work_dirs/flir.pth --eval mAP

Training

To train the model on the FLIR dataset, use the command:

python tools/train.py configs/graphmamba/yolox_l_tirgraphmamba_1x8_200e_FLIR_r.py

Acknowledgement

This project is based on mmdetection, DATFF, Pan-Mamba, and Cas-Gnn. We thank the authors for their excellent work.

Citation

If you find our PFGF framework useful, please consider citing our paper:

@inproceedings{li2025pseudo,
  title={Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection},
  author={Li, Ting and Ye, Mao and Wu, Tianwen and Li, Nianxin and Li, Shuaifeng and Tang, Song and Ji, Luping},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={6710--6719},
  year={2025}
}
