CMSSM

CM-SSM: Cross-modal State Space Model for Real-time RGB-Thermal Wild Scene Semantic Segmentation

Cross-modal State Space Modeling and Terrain-specific Knowledge Distillation for RGB-Thermal Semantic Segmentation

Introduction

This repository contains the code for the paper "Cross-modal State Space Modeling for Real-time RGB-Thermal Wild Scene Semantic Segmentation," which has been accepted by IROS 2025.

2025-10-9 ✨: An extended version of our conference paper, "Cross-modal State Space Modeling and Terrain-specific Knowledge Distillation for RGB-Thermal Semantic Segmentation", has been submitted to TASE. To support the review process, additional details and code are provided.

Method

picture1: The CM-SSM consists of two image encoders that extract features from the RGB and thermal images, four CM-SSA modules that fuse the RGB-T features at four stages, and an MLP decoder that predicts the semantic segmentation map.
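The data flow described above can be traced with a small shape walk-through. This is only a sketch under stated assumptions, not the repository's implementation: the stage strides, channel widths, shape-preserving fusion, and 1/4-resolution output are hypothetical choices typical of hierarchical backbones, and `encoder_shapes`, `fuse`, `decoder_shape`, and `trace` are illustrative names.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)

# Hypothetical values for illustration; not taken from the repository.
STRIDES = (4, 8, 16, 32)
CHANNELS = (32, 64, 128, 256)


def encoder_shapes(height: int, width: int) -> List[Shape]:
    """Feature shapes produced by one image encoder, one per stage."""
    return [(c, height // s, width // s) for c, s in zip(CHANNELS, STRIDES)]


def fuse(rgb: Shape, thermal: Shape) -> Shape:
    """Stand-in for one fusion module (assumed to preserve the stage shape)."""
    if rgb != thermal:
        raise ValueError(f"stage shapes differ: {rgb} vs {thermal}")
    return rgb


def decoder_shape(fused: List[Shape], num_classes: int) -> Shape:
    """Stand-in for the MLP decoder (assumed to predict at the finest stage)."""
    _, height, width = fused[0]
    return (num_classes, height, width)


def trace(height: int, width: int, num_classes: int) -> Shape:
    """Follow one RGB-T pair through both encoders, four fusions, and the decoder."""
    rgb_feats = encoder_shapes(height, width)      # RGB encoder
    thermal_feats = encoder_shapes(height, width)  # thermal encoder
    fused = [fuse(r, t) for r, t in zip(rgb_feats, thermal_feats)]
    return decoder_shape(fused, num_classes)
```

Calling `trace(480, 640, 9)` walks a 480x640 input pair through the four stages; the class count of 9 is likewise just an example value.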

picture2: The CM-SS2D consists of three steps: 1) cross-modal selective scanning, 2) cross-modal state space modeling, and 3) scan merging.
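To make the three steps concrete, here is a toy, pure-Python sketch on a tiny grid. It is not the repository's CM-SS2D: the rule of alternating RGB and thermal tokens, the four scan directions, the fixed-coefficient scalar recurrence (a real selective state space model uses input-dependent parameters), and the summation merge are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]
Position = Tuple[int, int]


def scan_orders(height: int, width: int) -> List[List[Position]]:
    """Four traversal orders: row-major, column-major, and their reverses."""
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return [row_major, col_major, row_major[::-1], col_major[::-1]]


def cross_modal_scan(rgb: Grid, thermal: Grid, order: List[Position]) -> List[float]:
    """Step 1 (assumed rule): alternate RGB and thermal tokens along one order."""
    sequence: List[float] = []
    for r, c in order:
        sequence.extend((rgb[r][c], thermal[r][c]))
    return sequence


def linear_recurrence(sequence: List[float], decay: float = 0.5) -> List[float]:
    """Step 2 stand-in: h_t = decay * h_{t-1} + x_t, with output y_t = h_t."""
    state, outputs = 0.0, []
    for x in sequence:
        state = decay * state + x
        outputs.append(state)
    return outputs


def cm_ss2d_toy(rgb: Grid, thermal: Grid, decay: float = 0.5) -> Grid:
    """Run steps 1 and 2 per direction, then step 3: merge back onto the grid."""
    height, width = len(rgb), len(rgb[0])
    merged = [[0.0] * width for _ in range(height)]
    for order in scan_orders(height, width):
        outputs = linear_recurrence(cross_modal_scan(rgb, thermal, order), decay)
        # Each pixel contributed two tokens; route both outputs to its cell.
        for i, (r, c) in enumerate(order):
            merged[r][c] += outputs[2 * i] + outputs[2 * i + 1]
    return merged
```

Because the state carries over from one token to the next, every output in a sequence mixes information from both modalities, which is the intuition behind scanning a cross-modal sequence rather than two separate ones.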

Requirements

Python==3.9
PyTorch==2.0.1
CUDA==11.8
mamba-ssm==1.0.1
causal-conv1d==1.0.0
mmcv==2.2.0
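A quick way to compare an environment against these pins is a standard-library check like the sketch below. The distribution names (`torch`, `mamba-ssm`, `causal-conv1d`, `mmcv`) are assumed to be the names the packages are installed under, `check_environment` is an illustrative helper rather than part of the repository, and the CUDA version is not checked.

```python
import sys
from importlib import metadata

# Pins copied from the requirements list; distribution names are assumptions.
PINNED = {
    "torch": "2.0.1",
    "mamba-ssm": "1.0.1",
    "causal-conv1d": "1.0.0",
    "mmcv": "2.2.0",
}


def check_environment(pinned=PINNED, python=(3, 9)):
    """Return a list of human-readable mismatches (empty if all pins match)."""
    problems = []
    found_python = tuple(sys.version_info[:2])
    if found_python != tuple(python):
        problems.append(
            f"Python {python[0]}.{python[1]} expected, "
            f"found {found_python[0]}.{found_python[1]}"
        )
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {wanted})")
            continue
        # Ignore local build tags such as "+cu118" when comparing versions.
        if found.split("+")[0] != wanted:
            problems.append(f"{name}=={wanted} expected, found {found}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the pinned versions"]:
        print(line)
```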

| Models | Backbone | Dataset | mIoU | Weights |
|------|------|------------|------|--------------|
| CM-SSM | EfficientViT-B1 | CART | 75.1 | pth |
| CM-SSM | EfficientViT-B1 | PST900 | 85.9 | pth |
| CM-SSM | ConvNeXtV2-A | SUS | 82.5 | pth |
| CM-SSM | ConvNeXtV2-A | FMB | 60.7 | pth |
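For readers unfamiliar with the mIoU column, the metric is commonly computed from a confusion matrix as the mean of per-class intersection over union. The sketch below shows that standard definition only; the repository's evaluation protocol (for example, which classes are ignored) may differ, and `mean_iou` is an illustrative name.

```python
from typing import List


def mean_iou(confusion: List[List[int]]) -> float:
    """Mean IoU from a square confusion matrix (rows: truth, columns: prediction)."""
    ious = []
    for k in range(len(confusion)):
        true_pos = confusion[k][k]
        false_pos = sum(row[k] for row in confusion) - true_pos
        false_neg = sum(confusion[k]) - true_pos
        union = true_pos + false_pos + false_neg
        if union:  # skip classes absent from both truth and prediction
            ious.append(true_pos / union)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious)


# Two-class toy example: IoU is 8/11 for class 0 and 9/12 for class 1.
print(round(100 * mean_iou([[8, 2], [1, 9]]), 1))
```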