MemVP

[ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

Official code for "Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning".

<p align="left"> <a href="https://arxiv.org/abs/2405.05615" alt="arXiv"> <img src="https://img.shields.io/badge/arXiv-2405.05615-b31b1b.svg?style=flat" /></a> </p> <p align="center"> <img src="./figs/fig1.png" width="700"> </p>

Environment

conda create -n memvp python==3.10
conda activate memvp
pip install -r requirements.txt
pip install -e .
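
After installation, a quick sanity check can confirm that the active interpreter matches the Python 3.10 environment created above and that the editable install is visible. The sketch below is only an illustration: the helper name `environment_report` is hypothetical, and the importable package name `memvp` is an assumption based on the `memvp` folder in the repository layout.

```python
import importlib.util
import sys


def environment_report(package="memvp", expected=(3, 10)):
    """Report whether the interpreter version and the package install look right.

    `package` defaults to "memvp", which is an assumed import name.
    """
    return {
        # Compare only (major, minor), e.g. (3, 10) for any 3.10.x release.
        "python_ok": sys.version_info[:2] == expected,
        # find_spec returns None when a top-level package cannot be located.
        "package_installed": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(environment_report())
```

If either value is `False`, re-activating the `memvp` conda environment and re-running `pip install -e .` is the first thing to try.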

TODO

  • [x] Code for experiments on LLaMA.
  • [ ] Code for experiments on BART and T5.

Preparation

<your path>/
  |-- memvp
  |-- scripts
  |-- train.py
  |-- eval.py
  ......
  |-- data/
      |-- problem.json
      |-- pid_splits.json
      |-- captions.json
      |-- images
          |-- train          # ScienceQA train images
          |-- val            # ScienceQA val images
          |-- test           # ScienceQA test images
      |-- weights
          |-- tokenizer.model
          |-- 7B
              |-- params.json
              |-- consolidated.00.pth
          |-- 13B
              |-- params.json
              |-- consolidated.00.pth
              |-- consolidated.01.pth
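
Before launching a run, it can help to verify that the data and weights are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper `missing_paths` and the constants `COMMON` and `SHARDS` are hypothetical names, and it assumes that `tokenizer.model` sits directly under `data/weights` next to the `7B` and `13B` folders.

```python
from pathlib import Path

# Expected entries, transcribed from the layout above (relative to the repo root).
# Assumption: tokenizer.model is a sibling of the 7B and 13B folders.
COMMON = [
    "data/problem.json",
    "data/pid_splits.json",
    "data/captions.json",
    "data/images/train",
    "data/images/val",
    "data/images/test",
    "data/weights/tokenizer.model",
]
SHARDS = {
    "7B": ["consolidated.00.pth"],
    "13B": ["consolidated.00.pth", "consolidated.01.pth"],
}


def missing_paths(root, model_size="7B"):
    """Return the expected paths that do not exist under `root`."""
    weights = f"data/weights/{model_size}"
    expected = COMMON + [f"{weights}/params.json"]
    expected += [f"{weights}/{shard}" for shard in SHARDS[model_size]]
    return [p for p in expected if not (Path(root) / p).exists()]


if __name__ == "__main__":
    # Run from the repository root to check both model sizes.
    for size in SHARDS:
        gaps = missing_paths(".", size)
        print(f"LLaMA-{size}:", "ok" if not gaps else f"missing {gaps}")
```

The check only tests for existence; it does not validate file contents or checkpoint integrity.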

Fine-Tuning & Inference

# LLaMA-7B
bash scripts/finetuning_sqa_7b.sh
bash scripts/eval_sqa_7b.sh

# LLaMA-13B
bash scripts/finetuning_sqa_13b.sh
bash scripts/eval_sqa_13b.sh

Fine-tuning takes around 40 minutes for LLaMA-7B and around 1 hour for LLaMA-13B on 8x A800 (80 GB) GPUs.
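
To run fine-tuning and evaluation back to back, the two scripts for a given model size can be chained so that evaluation starts only if fine-tuning exits successfully. This is a hedged sketch: `SCRIPTS`, `pipeline_commands`, and `run_pipeline` are hypothetical names, and only the script paths are taken from the commands above.

```python
import subprocess

# Script paths as listed in this README, keyed by model size.
SCRIPTS = {
    "7b": ("scripts/finetuning_sqa_7b.sh", "scripts/eval_sqa_7b.sh"),
    "13b": ("scripts/finetuning_sqa_13b.sh", "scripts/eval_sqa_13b.sh"),
}


def pipeline_commands(size):
    """Return the fine-tuning and evaluation commands, in run order."""
    return [["bash", script] for script in SCRIPTS[size.lower()]]


def run_pipeline(size, runner=subprocess.run):
    """Run fine-tuning, then evaluation, from the repository root.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so evaluation is skipped when fine-tuning fails.
    """
    for command in pipeline_commands(size):
        runner(command, check=True)
```

Calling `run_pipeline("7b")` from the repository root is equivalent to running the two LLaMA-7B commands in sequence; the `runner` parameter exists only so the order of calls can be inspected without launching a job.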

Checkpoints

Acknowledgements

Citation

@article{jie2024memvp,
  title={Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning},
  author={Jie, Shibo and Tang, Yehui and Ding, Ning and Deng, Zhi-Hong and Han, Kai and Wang, Yunhe},
  journal={arXiv preprint arXiv:2405.05615},
  year={2024}
}