# SoMA

[CVPR 2025 Highlight] SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning
Seokju Yun, Seunghye Chae, Dongheon Lee, Youngmin Ro
Project Page | arXiv

**Important Notice:** The method's abbreviation has been changed from SoRA to SoMA. Please note that this change has not yet been reflected in the code, which still uses the SoRA naming.
```bibtex
@InProceedings{Yun_2025_CVPR,
    author    = {Yun, Seokju and Chae, Seunghye and Lee, Dongheon and Ro, Youngmin},
    title     = {SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {25602-25612}
}
```
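As a toy illustration of the idea in the method's name (freezing the major singular-value components of a pre-trained weight and adapting only the minor ones), the NumPy sketch below splits a weight matrix via SVD and merges an update to the minor part back into a dense matrix. The sizes, the split point, and the update are all hypothetical; this is not the repository's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-trained weight matrix (hypothetical shape).
W = rng.standard_normal((64, 32))

# Thin SVD: W = U @ diag(S) @ Vt, singular values sorted descending.
U, S, Vt = np.linalg.svd(W, full_matrices=False)

k = 8  # hypothetical number of major components to freeze
W_major = (U[:, :k] * S[:k]) @ Vt[:k]   # frozen principal part
W_minor = (U[:, k:] * S[k:]) @ Vt[k:]   # trainable minor part

# Sanity check: the two parts reconstruct W exactly.
assert np.allclose(W_major + W_minor, W)

# Fine-tuning would update only the minor factors; here we simulate
# a small update and merge everything back into one dense weight,
# so inference incurs no extra cost.
delta = 0.01 * rng.standard_normal(W_minor.shape)
W_merged = W_major + (W_minor + delta)
```

Because the adapted components are merged back into a single dense matrix, the fine-tuned model keeps the original layer shapes and inference speed.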
## Domain Generalized Semantic Segmentation Performance (DINOv2)
| Setting | Crop Size | mIoU (Avg) | Config | Adapter & Head Checkpoint |
|:---:|:---:|:---:|:---:|:---:|
| GTAV $\rightarrow$ Cityscapes, BDD, Mapillary | 512 | 68.27 | config | checkpoint |
| GTAV + Synthia $\rightarrow$ Cityscapes, BDD, Mapillary | 512 | 69.26 | config | checkpoint |
| GTAV + Synthia + UrbanSyn $\rightarrow$ Cityscapes, BDD, Mapillary | 512 | 71.68 | config | checkpoint |
| GTAV + Synthia + UrbanSyn $\rightarrow$ Cityscapes, BDD, Mapillary | 1024 | 73.12 | config | checkpoint |
| GTAV + Synthia + UrbanSyn $\rightarrow$ 1/16 of Cityscapes $\rightarrow$ Cityscapes, BDD, Mapillary | 1024 | 75.50 | config | checkpoint |
| Cityscapes $\rightarrow$ BDD, Mapillary | 512 | 71.74 | config | checkpoint |
| Cityscapes $\rightarrow$ ACDC (test set) | 1024 | 78.75 | config | checkpoint |
## Domain Generalized Object Detection Performance (DINOv2)
| Setting | Input Size | DS | NC | DR | NR | DF | Config | Adapter & Head Checkpoint |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Clear to Adverse Weather | 1024 | 69.4 | 59.3 | 59.3 | 47.6 | 51.0 | config | checkpoint |

DS, NC, DR, NR, and DF denote the Daytime Sunny, Night Clear, Dusk Rainy, Night Rainy, and Daytime Foggy test domains, respectively.
## Environment Setup
To set up your environment, execute the following commands:
```shell
conda create -n soma -y
conda activate soma
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia -y
pip install -U openmim
mim install mmengine
mim install "mmcv==2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0"
pip install xformers=='0.0.20'  # optional, for DINOv2
pip install -r requirements.txt
pip install future tensorboard
pip install peft=='0.11.1'
pip install transformers=='4.42.4'
```
## Dataset Preparation
The dataset preparation process follows the procedures of Rein for DGSS and Diverse Weather for DGOD. Please refer to the respective repositories for details.
## Transform Pre-trained Weights

- **Download:** Download the pre-trained weights from facebookresearch for testing, and place them in the project directory without changing the file names.
- **Convert:** Convert the pre-trained weights for training or evaluation:

```shell
python tools/convert_models/convert_dinov2_sora.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/dinov2_sora_converted.pth

# optional, for 1024x1024 resolution
python tools/convert_models/convert_dinov2_sora.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/dinov2_sora_converted_1024x1024.pth --height 1024 --width 1024
```
## Using Our Trained Checkpoints

To use or evaluate our trained checkpoints, you must first merge the provided SoMA adapter & decode head weights with one of the following backbone checkpoints:

- `checkpoints/dinov2_sora_converted.pth`
- `checkpoints/dinov2_sora_converted_1024x1024.pth`
Please use the script `tools/merge_soma_weights.py` to perform this merging process:

```shell
python tools/merge_soma_weights.py --backbone_ckpt checkpoints/dinov2_sora_converted.pth --soma_ckpt checkpoints/soma_checkpoints/soma_dinov2-L_g2cbm_best.pth --merged_ckpt checkpoints/merged_checkpoints/soma_dinov2-L_g2cbm_best.pth
```

For a 1024x1024 crop size:

```shell
python tools/merge_soma_weights.py --backbone_ckpt checkpoints/dinov2_sora_converted_1024x1024.pth --soma_ckpt checkpoints/soma_checkpoints/soma_dinov2-L_gsu2cbm_1024x1024_best.pth --merged_ckpt checkpoints/merged_checkpoints/soma_dinov2-L_gsu2cbm_1024x1024_best.pth
```
To extract the adapter and decode head weights from a trained model, please refer to the script `tools/extract_soma_weights.py`.
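Conceptually, the merge step overlays the adapter and decode-head entries onto the converted backbone's state dict: matching backbone keys are replaced by their adapted versions, and head keys are simply added. The sketch below uses hypothetical key names and placeholder values instead of real tensors; it is not the logic of `tools/merge_soma_weights.py`, only an illustration of the dict-merge idea.

```python
# Hypothetical state dicts; strings stand in for weight tensors.
backbone = {
    "backbone.blocks.0.attn.qkv.weight": "pretrained tensor",
    "backbone.blocks.0.mlp.fc1.weight": "pretrained tensor",
}
adapter_and_head = {
    "backbone.blocks.0.attn.qkv.weight": "adapted tensor",  # overrides backbone
    "decode_head.conv_seg.weight": "head tensor",           # new key, added
}

# Later entries win on key collisions, so adapted weights replace
# the pre-trained ones and untouched backbone weights pass through.
merged = {**backbone, **adapter_and_head}
```

The merged dict can then be saved as a single checkpoint and loaded like any ordinary model state dict.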
## Evaluation

Run the evaluation:

```shell
python tools/test.py /path/to/cfg /path/to/checkpoint
```
## Training

Start training on a single GPU:

```shell
python tools/train.py /path/to/cfg
```

Start training on multiple GPUs:

```shell
PORT=12345 CUDA_VISIBLE_DEVICES=1,2,3,4 bash tools/dist_train.sh /path/to/cfg NUM_GPUS
```
## Acknowledgements
We sincerely appreciate mmsegmentation, mmdetection, Rein, Single-DGOD, and peft for their wonderful implementations.