IMAX

Official PyTorch code for Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination (TPAMI 2024)

Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination

News

[2025.4] We uploaded the test code together with our trained weights, so that the model can be evaluated directly from the released checkpoints.

[2025.1] To improve readability, we restructured the IMAX code based on Troika and introduced an Adapter module that helps adapt the CLIP visual encoder to the target dataset. We found that this further improves IMAX's performance across the three benchmark datasets.

[2025.1] We have open-sourced the CLIP-based IMAX code; implementations for the remaining encoders will be uploaded shortly.

Setup

conda create --name imax python=3.8
conda activate imax
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip3 install git+https://github.com/openai/CLIP.git

The remaining dependencies can be found in the ./requirements.txt file and installed using pip install -r requirements.txt.
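As a quick optional sanity check (assuming the steps above completed successfully), you can confirm that PyTorch and CLIP import correctly and that CUDA is visible:

python -c "import torch, clip; print(torch.__version__, torch.cuda.is_available())"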

The CLIP weights can be downloaded from the official CLIP repository and should be placed in the ./clip_modules/ directory.
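If you prefer to automate the download, one option (a sketch, assuming the pip-installed openai/CLIP package from the setup step) is to let clip.load fetch the ViT-L/14 checkpoint into that directory; the file is saved as ./clip_modules/ViT-L-14.pt, matching the --clip_arch path used in the commands below:

python -c "import clip; clip.load('ViT-L/14', device='cpu', download_root='./clip_modules')"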

Datasets

The dataset splits and attributes can be found in ./utils/download_data.sh; the complete installation process is described in CGE & CompCos. You can download the datasets using:

bash ./utils/download_data.sh

Train

If you wish to train our model from scratch, for example on UT-Zappos:

python -u train.py \
--clip_arch ./clip_modules/ViT-L-14.pt \
--dataset_path <path_to_UT-Zap50k> \
--save_path <path_to_logs> \
--yml_path ./config/ut-zappos.yml \
--num_workers 4 \
--seed 0 \
--adapter

MIT-States:

python -u train.py \
--clip_arch ./clip_modules/ViT-L-14.pt \
--dataset_path <path_to_MIT-States> \
--save_path <path_to_logs> \
--yml_path ./config/mit-states.yml \
--num_workers 2 \
--seed 0 \
--adapter

C-GQA:

python -u train.py \
--clip_arch ./clip_modules/ViT-L-14.pt \
--dataset_path <path_to_C-GQA> \
--save_path <path_to_logs> \
--yml_path ./config/cgqa.yml \
--num_workers 2 \
--seed 0 \
--adapter

Alternatively, you can run the shell scripts in ./run/, for example:

sh ./run/utzappos.sh

Test

Our trained weights can be found here. If you want to test the trained model using these existing weights, run the following command:

python -u test.py \
--clip_arch ./clip_modules/ViT-L-14.pt \
--dataset_path <path_to_C-GQA> \
--yml_path <path_to_yml> \
--num_workers 4 \
--seed 0 \
--adapter \
--load_model <path_to_weights>

Acknowledgement

The code we publish is based on several outstanding repositories (including Troika and CGE & CompCos, referenced above), which have helped us a lot.
