WeakMCN
No description available
Install / Use
/learn @MRUIL/WeakMCNREADME
CVPR 2025 | WeakMCN
This repo is the official implementation of the paper "WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation"

Project structure
The directory structure of the project looks like this:
├── README.md <- The top-level README for developers using this project.
│
├── config <- configuration
│
├── data
│ ├── anns
│ ├── images
│ ├── masks
│
├── datasets <- dataloader file
├── EfficientSAM <- EfficientSAM directory
│
├── models <- Source code for use in this project.
│ ├── __init__.py
│ ├── language_encoder.py <- encoder for images' text descriptions
│ ├── network_blocks.py <- files included essential model blocks
│ ├── visual_encoder.py <- visual backbone
│ ├── weakmcn <- most important files for WeakMCN model implementations
│ │ ├── __init__.py
│ │ ├── head.py <- for anchor-prompt contrastive loss
│ │ ├── seg_head.py <- for segmentation head
│ │ ├── head.py <- for anchor-prompt contrastive loss
| | ├── net.py <- main code for WeakMCN model
│ │
│ │
├── utils <- hepler functions
├── requirements.txt <- The requirements file for reproducing the analysis environment
│── train.py <- script for training the model
│── test.py <- script for testing from a model
└── LICENSE <- Open-source license if one is chosen
Installation
Instructions on how to clone and set up your repository:
Clone this repo :
- Clone the repository and navigate to the project directory:
git clone https://github.com/MRUIL/WeakMCN.git
cd weakmcn
Create a conda virtual environment and activate it:
conda create -n weakmcn python=3.9 -y
conda activate weakmcn
Install the required dependencies:
- Install Pytorch following the offical installation instructions
(We run all our experiments on pytorch 1.11.0 with CUDA 11.3)
- Install apex following the official installation guide for more details.
(or use the following commands we copied from their offical repo)
git clone https://github.com/NVIDIA/apex
cd apex
git checkout origin/22.02-parallel-state
python setup.py install --cuda_ext --cpp_ext
pip3 install -v --no-cache-dir ./
- Clone the EfficientSAM repository
cd EfficientSAM
mkdir weights
cd weights
wget https://github.com/yformer/EfficientSAM/raw/refs/heads/main/weights/efficient_sam_vitt.pt
wget https://github.com/yformer/EfficientSAM/raw/refs/heads/main/weights/efficient_sam_vits.pt.zip
unzip efficient_sam_vits.pt.zip
cd ../..
Compile the DCN layer:
cd utils/DCN
./make.sh
Install remaining dependencies
pip install -r requirements.txt
pip install transformers==4.41.1
wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.1.0/en_vectors_web_lg-2.1.0.tar.gz -O en_vectors_web_lg-2.1.0.tar.gz
pip install en_vectors_web_lg-2.1.0.tar.gz
Data Preparation
- Download images and Generate annotations according to SimREC
(We also prepared the annotations inside the data/anns folder for saving your time)
- Download the pretrained weights of YoloV3 from Google Drive
(We recommend to put it in the main path of WeakMCN otherwise, please modify the path in config files)
- The data directory should look like this:
├── data
│ ├── anns
│ ├── refcoco.json
│ ├── refcoco+.json
│ ├── refcocog.json
│ ├── images
│ ├── train2014
│ ├── COCO_train2014_000000515716.jpg
│ ├── ...
│ ├── masks
... the remaining directories
- NOTE: our YoloV3 is trained on COCO’s training images, excluding those in RefCOCO, RefCOCO+, and RefCOCOg’s validation+testing
Training
- If you want to train WeakMCN with SAM ViT-tiny backbone, you can run the following command:
python train.py --config ./config/refcoco_tuning.yaml
python train.py --config ./config/refcoco+_tuning.yaml
python train.py --config ./config/refcocog_tuning.yaml
- If you want to train WeakMCN with SAM ViT-base backbone, you can run the following command:
python train.py --config ./config/refcoco_tuning_v2.yaml
python train.py --config ./config/refcoco+_tuning_v2.yaml
python train.py --config ./config/refcocog_tuning_v2.yaml
Evaluation
python test.py --config ./config/[DATASET_NAME].yaml --eval-weights [PATH_TO_CHECKPOINT_FILE]
Model Zoo
Models trained on RefCOCO dataset
<table> <thead> <tr> <th rowspan="2">Method</th> <th colspan="3">REC</th> <th colspan="3">RES</th> <th rowspan="2">checkpoint</th> </tr> <tr> <th>val</th> <th>testA</th> <th>testB</th> <th>val</th> <th>testA</th> <th>testB</th> </tr> </thead> <tbody> <tr> <td>WeakMCN (SAM Vit-tiny)</td> <td>68.63</td> <td>70.18</td> <td>62.36</td> <td>58.41</td> <td>60.06</td> <td>56.08</td> <td><a href="https://drive.google.com/file/d/1z72KaIxCV_TQTo8sCh_Cr4p8qZ9y6OJA/view?usp=sharing">link</a></td> </tr> <tr> <td>WeakMCN (SAM Vit-base)</td> <td>69.22</td> <td>70.76</td> <td>63.43</td> <td>59.49</td> <td>61.01</td> <td>56.40</td> <td><a href="https://drive.google.com/file/d/1voXCHwbPCF7qlrP-CE360fd2G0qiklnu/view?usp=sharing">link</a></td> </tr> </tbody> </table>Models trained on RefCOCO+ dataset
<table> <thead> <tr> <th rowspan="2">Method</th> <th colspan="3">REC</th> <th colspan="3">RES</th> <th rowspan="2">checkpoint</th> </tr> <tr> <th>val</th> <th>testA</th> <th>testB</th> <th>val</th> <th>testA</th> <th>testB</th> </tr> </thead> <tbody> <tr> <td>WeakMCN (SAM Vit-tiny)</td> <td>51.14</td> <td>56.92</td> <td>42.22</td> <td>42.51</td> <td>48.91</td> <td>35.10</td> <td><a href="https://drive.google.com/file/d/12eCh4OdunSRRUYQR6307ZA4bG0P1f1II/view?usp=sharing">link</a></td> </tr> <tr> <td>WeakMCN (SAM Vit-base)</td> <td>51.93</td> <td>57.40</td> <td>43.28</td> <td>44.36</td> <td>50.40</td> <td>37.12</td> <td><a href="https://drive.google.com/file/d/1ht868vDoJjwQUg-qpbx1r6zcEHrEEJvb/view?usp=sharing">link</a></td> </tr> </tbody> </table>Models trained on RefCOCOg dataset
<table> <thead> <tr> <th rowspan="2">Method</th> <th colspan="1">REC</th> <th colspan="1">RES</th> <th rowspan="2">checkpoint</th> </tr> <tr> <th>val</th> <th>val</th> </tr> </thead> <tbody> <tr> <td>WeakMCN (SAM Vit-tiny)</td> <td>53.82</td> <td>45.73</td> <td><a href="https://drive.google.com/file/d/1-60icGj89kIwdIFBxDdA9uB4OpcYVCp-/view?usp=sharing">link</a></td> </tr> <tr> <td>WeakMCN (SAM Vit-base)</td> <td>55.00</td> <td>46.81</td> <td><a href="https://drive.google.com/file/d/1TgVbTlUBNssV4nFScIg2VZGk8wjyrPq_/view?usp=sharing">link</a></td> </tr> </tbody> </table>Acknowledgement
This repository is built upon RefCLIP, LaConvNet, and SimREC. Thanks for those well-organized codebases.
Citation
@inproceedings{cheng2025weakmcn,
title={WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation},
author={Cheng, Silin and Liu, Yang and He, Xinwei and Ourselin, Sebastien and Tan, Lei and Luo, Gen},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={9175--9185},
year={2025}
}
Related Skills
node-connect
344.4kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
99.2kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
344.4kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
344.4kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
