# ModeX
[SCIS 2024] The official implementation of the paper "Modality-experts coordinated adaptation for large multimodal models" by Yan Zhang, Zhong Ji, Yanwei Pang, Jungong Han, and Xuelong Li. It is built on top of LAVIS in PyTorch. The paper is available via the DOI in the citation below.
<img src="docs/_static/trade-off.png" width="700">

## Getting Started
Follow the installation instructions to create the environment.
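For a LAVIS-style setup, environment creation typically looks like the following (the environment name and Python version here are assumptions, not taken from the repo):

```shell
# Hypothetical setup; the env name and Python version are assumptions.
conda create -n modex python=3.8 -y
conda activate modex
# Install the LAVIS-based codebase in editable mode from the repo root.
pip install -e .
```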
## Dataset
The common vision-language datasets can be downloaded with the automatic download tools, which also organize them into the expected layout.
Then, modify the corresponding dataset paths in the configs and in default.yaml.
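For illustration, a dataset entry in a LAVIS-style config looks roughly like the sketch below; the dataset name, keys, and paths are assumptions, so check the actual files under the repo's configs for the exact structure:

```yaml
# Hypothetical excerpt of a LAVIS-style dataset config;
# the actual keys and paths in this repo may differ.
datasets:
  coco_caption:
    build_info:
      images:
        storage: /path/to/coco/images   # point this at your local image root
      annotations:
        train:
          storage: /path/to/coco/annotations/train.json
```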
## Training
Run the scripts in run_scripts for training and evaluation.
For more details and advanced usage, please refer to the documentation.
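As a sketch of what such a launcher usually contains (the script and config paths below are hypothetical; check run_scripts/ for the real names), a LAVIS-style training launch looks like:

```shell
# Hypothetical paths -- the actual scripts live under run_scripts/.
# LAVIS-style launchers wrap torch.distributed.run around train.py:
python -m torch.distributed.run --nproc_per_node=4 train.py \
    --cfg-path lavis/projects/modex/train/caption_coco.yaml
```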
## Citation

Please use the following BibTeX entry to cite the paper if you use any resources from this repo.
@article{zhang2024modex,
  author  = {Yan Zhang and Zhong Ji and Yanwei Pang and Jungong Han and Xuelong Li},
  title   = {Modality-experts coordinated adaptation for large multimodal models},
  journal = {SCIENCE CHINA Information Sciences},
  year    = {2024},
  volume  = {67},
  number  = {12},
  pages   = {220107},
  url     = {http://www.sciengine.com/publisher/Science China Press/journal/SCIENCE CHINA Information Sciences/67/12/10.1007/s11432-024-4234-4},
  doi     = {10.1007/s11432-024-4234-4}
}
## Acknowledgement
Our codebase is built on top of the popular LAVIS repository, which is released under the BSD 3-Clause License.
