DeMo
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
DeMo is an advanced multi-modal object re-identification (ReID) framework designed to handle dynamic imaging-quality variations across modalities. By employing decoupled features and a novel Attention-Triggered Mixture of Experts (ATMoE), DeMo dynamically balances modality-specific and modality-shared information, enabling robust performance even when modalities are missing. The framework achieves state-of-the-art results on multi-modal and missing-modality object ReID benchmarks.
News
- We released the DeMo codebase and paper! 🚀
- Great news! Our paper has been accepted to AAAI 2025! 🎉
Introduction
Multi-modal object ReID combines the strengths of different modalities (e.g., RGB, NIR, TIR) to achieve robust identification across challenging scenarios. DeMo introduces a decoupled approach using Mixture of Experts (MoE) to preserve modality uniqueness and enhance diversity. This is achieved through:
- Patch-Integrated Feature Extractor (PIFE): Captures multi-granular representations.
- Hierarchical Decoupling Module (HDM): Separates modality-specific and shared features.
- Attention-Triggered Mixture of Experts (ATMoE): Dynamically adjusts feature importance with adaptive attention-guided weights.
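To make the attention-triggered mixing idea concrete, here is a minimal NumPy sketch of adaptive expert weighting. All names and dimensions here (the linear "experts", the gating matrix, the 3-modality setup) are hypothetical stand-ins for illustration, not DeMo's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical setup: 3 modalities (RGB, NIR, TIR), feature dim 8, 3 experts.
num_experts, dim = 3, 8
features = [rng.standard_normal(dim) for _ in range(3)]  # decoupled per-modality features

# Each "expert" is a toy linear map here; the real experts are learned modules.
experts = [rng.standard_normal((dim, dim)) for _ in range(num_experts)]

# Attention-derived gating: score each expert from the concatenated context,
# then softmax so the weights adapt to the input (the "attention-triggered" idea).
context = np.concatenate(features)                      # shape (24,)
gate_w = rng.standard_normal((num_experts, context.size))
weights = softmax(gate_w @ context)                     # adaptive expert weights, sum to 1

# Mixture output: weighted sum of expert transforms of a shared feature.
shared = np.mean(features, axis=0)                      # stand-in for the modality-shared feature
output = sum(w * (E @ shared) for w, E in zip(weights, experts))
```

Because the gate is conditioned on the input features, a degraded modality shifts the weights toward experts that rely less on it, which is the intuition behind ATMoE's robustness to quality changes.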
Contributions
- Introduced a decoupled feature-based MoE framework, DeMo, addressing dynamic quality changes in multi-modal imaging.
- Developed the Hierarchical Decoupling Module (HDM) for enhanced feature diversity and Attention-Triggered Mixture of Experts (ATMoE) for context-aware weighting.
- Achieved state-of-the-art performance on RGBNT201, RGBNT100, and MSVR310 benchmarks under both full and missing-modality settings.
Results
Multi-Modal Object ReID
Multi-Modal Person ReID [RGBNT201]
<p align="center"> <img src="results/RGBNT201.png" alt="RGBNT201 Results" style="width:100%;"> </p>

Multi-Modal Vehicle ReID [RGBNT100 & MSVR310]
<p align="center"> <img src="results/RGBNT100_MSVR310.png" alt="RGBNT100 & MSVR310 Results" style="width:100%;"> </p>

Missing-Modality Object ReID
Missing-Modality Performance [RGBNT201]
<p align="center"> <img src="results/RGBNT201_M.png" alt="RGBNT201 Missing-Modality" style="width:100%;"> </p>

Missing-Modality Performance [RGBNT100]
<p align="center"> <img src="results/RGBNT100_M.png" alt="RGBNT100 Missing-Modality" style="width:100%;"> </p>

Ablation Studies [RGBNT201]
<p align="center"> <img src="results/Ablation.png" alt="RGBNT201 Ablation" style="width:100%;"> </p>

Visualizations
Feature Distribution (t-SNE)
<p align="center"> <img src="results/tsne.png" alt="t-SNE" style="width:100%;"> </p>

Decoupled Features
<p align="center"> <img src="results/Decoupled.png" alt="Decoupled Features" style="width:100%;"> </p>

Rank-list Visualization
<p align="center"> <img src="results/rank-list.png" alt="Rank-list" style="width:100%;"> </p>

Reproduction
Datasets
- RGBNT201: Google Drive
- RGBNT100: Baidu Pan (Code: rjin)
- MSVR310: Google Drive
Pretrained Models
Configuration
- RGBNT201: `configs/RGBNT201/DeMo.yml`
- RGBNT100: `configs/RGBNT100/DeMo.yml`
- MSVR310: `configs/MSVR310/DeMo.yml`
Training
```shell
conda create -n DeMo python=3.8.12 -y
conda activate DeMo
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
cd (your_path)
pip install -r requirements.txt
python train_net.py --config_file configs/RGBNT201/DeMo.yml
```
Notes
- This repository is based on MambaPro. Prompt and adapter tuning on the CLIP backbone are kept in the code but disabled by default (the corresponding hyperparameters are set to `False`), so users can explore them independently.
- This code provides multi-modal Grad-CAM visualization, multi-modal ranking-list generation, and t-SNE visualization tools to facilitate further research.
- The default hyperparameter configuration is designed to fit devices with less than 24 GB of GPU memory.
- Thank you for your attention and interest!
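As a rough illustration of the kind of feature-distribution plot the t-SNE tool above produces (the data here is synthetic and the dimensions are assumptions, not the repository's actual interface), scikit-learn's `TSNE` can embed high-dimensional ReID features into 2-D:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for extracted ReID features: 40 samples from 4 identities,
# each a 64-dim vector (real features would come from the trained backbone).
labels = np.repeat(np.arange(4), 10)
features = rng.standard_normal((40, 64)) + labels[:, None]  # loosely separable clusters

# Embed to 2-D for plotting; perplexity must be smaller than the sample count.
emb = TSNE(n_components=2, perplexity=10, init="random",
           random_state=0).fit_transform(features)
```

The resulting `emb` array (one 2-D point per sample) can then be scatter-plotted with colors given by identity labels, as in the feature-distribution figure above.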
Star History
Citation
If you find DeMo helpful in your research, please consider citing:
```bibtex
@inproceedings{wang2025DeMo,
  title={DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification},
  author={Wang, Yuhao and Liu, Yang and Zheng, Aihua and Zhang, Pingping},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}
```
