# ModeX
[SCIS 2024] The official implementation of the paper "Modality-experts coordinated adaptation for large multimodal models" by Yan Zhang, Zhong Ji, Yanwei Pang, Jungong Han, and Xuelong Li. It is built on top of LAVIS in PyTorch. The paper is available via the DOI in the citation below.
<img src="docs/_static/trade-off.png" width="700">

## Getting Started
Follow the installation instructions to create the environment.
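For a LAVIS-style setup, environment creation typically looks like the following (the environment name and Python version here are assumptions, not taken from the repo):

```shell
# Hypothetical setup; the env name and Python version are assumptions.
conda create -n modex python=3.8 -y
conda activate modex
# Install the LAVIS-based codebase in editable mode from the repo root.
pip install -e .
```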
## Dataset
The common vision-language datasets can be downloaded with the automatic download tools, which also organize them into the expected layout.
Then, modify the corresponding dataset paths in the configs and in default.yaml.
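For illustration, a dataset entry in a LAVIS-style config looks roughly like the sketch below; the dataset name, keys, and paths are assumptions, so check the actual files under the repo's configs for the exact structure:

```yaml
# Hypothetical excerpt of a LAVIS-style dataset config;
# the actual keys and paths in this repo may differ.
datasets:
  coco_caption:
    build_info:
      images:
        storage: /path/to/coco/images   # point this at your local image root
      annotations:
        train:
          storage: /path/to/coco/annotations/train.json
```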
## Training
Run the scripts in run_scripts for training and evaluation.
For more details and advanced usage, please refer to the documentation.
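As a sketch of what such a launcher usually contains (the script and config paths below are hypothetical; check run_scripts/ for the real names), a LAVIS-style training launch looks like:

```shell
# Hypothetical paths -- the actual scripts live under run_scripts/.
# LAVIS-style launchers wrap torch.distributed.run around train.py:
python -m torch.distributed.run --nproc_per_node=4 train.py \
    --cfg-path lavis/projects/modex/train/caption_coco.yaml
```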
## Citation

Please use the following BibTeX entry to cite the paper if you use any resources from this repo.
@article{zhang2024modex,
  author  = {Yan Zhang and Zhong Ji and Yanwei Pang and Jungong Han and Xuelong Li},
  title   = {Modality-experts coordinated adaptation for large multimodal models},
  journal = {SCIENCE CHINA Information Sciences},
  year    = {2024},
  volume  = {67},
  number  = {12},
  pages   = {220107},
  url     = {http://www.sciengine.com/publisher/Science China Press/journal/SCIENCE CHINA Information Sciences/67/12/10.1007/s11432-024-4234-4},
  doi     = {10.1007/s11432-024-4234-4}
}
## Acknowledgement
Our codebase is built on top of the popular LAVIS repository, which is released under the BSD 3-Clause License.
