# CIPA

The official code of "Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images", CVPR 2025.
Jie Mei<sup>1</sup>, Chenyu Lin<sup>2</sup>, Yu Qiu<sup>3</sup>, Yaonan Wang<sup>1</sup>, Hui Zhang<sup>1</sup>, Ziyang Wang<sup>4</sup>, Dong Dai<sup>4</sup>
<sup>1</sup> Hunan University, <sup>2</sup> Nankai University, <sup>3</sup> Hunan Normal University, <sup>4</sup> Tianjin Medical University Cancer Institute and Hospital
- This repository contains the official code for the paper *Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images*.
- This paper has been accepted to CVPR 2025.
- This code and the PCLT20K dataset are licensed for non-commercial research purposes only.
## Introduction
<p style="text-align:justify; text-justify:inter-ideograph;"> Lung cancer is a leading cause of cancer-related deaths globally. PET-CT is crucial for imaging lung tumors, providing essential metabolic and anatomical information, but it faces challenges such as poor image quality, motion artifacts, and complex tumor morphology. Deep learning-based models are expected to address these problems; however, existing small-scale and private datasets limit significant performance improvements for these methods. Hence, we introduce a large-scale PET-CT lung tumor segmentation dataset, termed PCLT20K, which comprises 21,930 pairs of PET-CT images from 605 patients. Furthermore, we propose a cross-modal interactive perception network with Mamba (CIPA) for lung tumor segmentation in PET-CT images. Specifically, we design a channel-wise rectification module (CRM) that applies a channel state space block across multi-modal features to learn correlated representations and filter out modality-specific noise. A dynamic cross-modality interaction module (DCIM) is designed to effectively integrate position and context information: it employs PET images to learn regional position information, which serves as a bridge for modeling the relationships between local features of CT images. Extensive experiments on a comprehensive benchmark demonstrate the effectiveness of CIPA compared to current state-of-the-art segmentation methods. We hope our research provides more exploration opportunities for medical image segmentation. </p>
## Environment
- Create the conda environment:

  ```shell
  conda create -n MIPA python=3.10
  conda activate MIPA
  ```

- Install dependencies. Install PyTorch with CUDA and cuDNN, then install the other dependencies:

  ```shell
  pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
  pip install -r requirements.txt
  ```

- Install `selective_scan_cuda_core`:

  ```shell
  cd models/encoders/selective_scan
  pip install .
  cd ../../..
  ```
## PCLT20K
Please contact Jie Mei (jiemei AT hnu.edu.cn) to obtain the dataset; we will get back to you shortly. Your email should contain the following information. Note: for better academic communication, a real-name request is encouraged, and your email suffix must match your affiliation (e.g., hello@hnu.edu.cn); if it does not, please explain why.
- Name: (Tell us who you are.)
- Affiliation: (The name/URL of your institution or university, etc.)
- Job Title: (e.g., Professor, Associate Professor, PhD student, etc.)
- Email: (The dataset will be sent to this email.)
- How to use: (Only for non-commercial use.)
## Data Preparation
- For our dataset PCLT20K, we organize the dataset folder in the following structure:

  ```
  <PCLT20K>
  |-- <0001>
      |-- <name1_CT.png>
      |-- <name1_PET.png>
      |-- <name1_mask.png>
      ...
  |-- <0002>
      |-- <name2_CT.png>
      |-- <name2_PET.png>
      |-- <name2_mask.png>
      ...
  ...
  |-- train.txt
  |-- test.txt
  ```

  `train.txt`/`test.txt` contain the names of the items in the training/testing set, e.g.:

  ```
  <name1>
  <name2>
  ...
  ```

- Please put our dataset in the `data` directory.
## Usage

### Training
- Please download the pretrained VMamba weights and put them under `pretrained/vmamba/`. We use VMamba_Tiny by default.

- Config setting: edit the config in `train.py`. Change `C.backbone` to `sigma_tiny`/`sigma_small`/`sigma_base` to use the three versions of VMamba.

- Run multi-GPU distributed training:

  ```shell
  torchrun --nproc_per_node 'GPU_Numbers' train.py
  ```

- You can also use single-GPU training:

  ```shell
  python train.py
  ```

- Results will be saved in the `save_model` folder.
### Testing
The pretrained model of CIPA (`CIPA.pth`) can be downloaded from:

- Baidu Yunpan (password: CIPA)
- Google Drive

Run the prediction script:

```shell
python pred.py
```
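For reference, lung tumor segmentation results are commonly evaluated with the Dice coefficient. The following is a pure-Python sketch on flat binary masks, for illustration only; the repo's own evaluation code may compute it differently (e.g. per-image vs. dataset-level averaging):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2*|P ∩ T| / (|P| + |T|) for flat 0/1 masks.

    eps keeps the score defined when both masks are empty.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * inter + eps) / (total + eps)
```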
## Citation
If you are using the code/model provided here in a publication, please consider citing:
```bibtex
@inproceedings{mei2025cross,
  title={Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images},
  author={Mei, Jie and Lin, Chenyu and Qiu, Yu and Wang, Yaonan and Zhang, Hui and Wang, Ziyang and Dai, Dong},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}
```
## Contact
For any questions, please contact me via e-mail: jiemei AT hnu.edu.cn.
## Acknowledgment
This project is based on VMamba and Sigma; thanks for their excellent work.
