VProChart
VProChart: Answering Chart Question through Visual Perception Alignment Agent and Programmatic Solution Reasoning
🎉 News
- 2024.12.09: Our paper has been accepted by AAAI-25.
🚀 Quick Start
📥 Model Download
The pretrained VPAgent model is available on ModelScope (see the ModelScope Hub entry under Resources).
🛠️ Dependencies
Install the pinned versions of transformers and pytorch-lightning for full compatibility (datasets and sentencepiece are left unpinned):
pip install transformers==4.28.1 \
pytorch-lightning==1.8.5 \
datasets \
sentencepiece
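As a quick sanity check after installation, the sketch below compares the installed package versions against the pins in the pip command above. It uses only the Python standard library; the helper name `check_environment` is hypothetical and not part of this repository.

```python
from importlib import metadata

# Pins copied from the pip command above; the other two packages are unpinned there.
PINNED = {"transformers": "4.28.1", "pytorch-lightning": "1.8.5"}
UNPINNED = ["datasets", "sentencepiece"]


def check_environment(pinned=PINNED, unpinned=UNPINNED, get_version=metadata.version):
    """Return a list of readable problems; an empty list means the environment matches."""
    problems = []
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (expected {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name} is {found}, expected {wanted}")
    for name in unpinned:
        # Unpinned packages only need to be present; any version is accepted.
        try:
            get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("\n".join(issues) if issues else "Environment matches the pinned versions.")
```

Running the file directly prints one line per mismatch, or a single confirmation line when everything matches.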
🎯 Usage (Inference)
For a quick example of loading and querying VProChart, see the test script in the repository.
Programmatic Solution Reasoning is still being organized and will be released soon.
🔧 Finetuning
To adapt VProChart to your own data, consult the finetuning script, finetune_chartqa.py.
Example CLI
python finetune_chartqa.py \
--data-path "your_hf_dataset_name_or_local_path" \
--train-images "/path/to/train/images/" \
--valid-images "/path/to/val/images/" \
--output-dir "./finetuned_model_output/" \
--max-steps 40000 \
--batch-size 8 \
--valid-batch-size 1 \
--max-length 512 \
--num-workers 12 \
--lr 5e-5 \
--check-val-every-n-epoch 1 \
--log-every-n-steps 50 \
--warmup-steps 100 \
--checkpoint-steps 7000 \
--gradient-clip-val 1.0 \
--accumulate-grad-batches 1 \
--gpus-num 1 \
--nodes-num 1 \
--checkpoint-path "/path/to/vprochart_pretrained/"
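When launching several runs, it can help to keep the flags in one place and generate the command from them. The sketch below rebuilds the example command from a dictionary and estimates the effective batch size. The flag names and values are copied from the example CLI; the helper names are hypothetical. The batch-size arithmetic assumes `--batch-size` is per device and that the script uses standard data-parallel training, which the README does not state.

```python
import shlex

# Flag values copied from the example CLI above; the paths are still placeholders.
ARGS = {
    "data-path": "your_hf_dataset_name_or_local_path",
    "train-images": "/path/to/train/images/",
    "valid-images": "/path/to/val/images/",
    "output-dir": "./finetuned_model_output/",
    "max-steps": 40000,
    "batch-size": 8,
    "valid-batch-size": 1,
    "max-length": 512,
    "num-workers": 12,
    "lr": 5e-5,
    "check-val-every-n-epoch": 1,
    "log-every-n-steps": 50,
    "warmup-steps": 100,
    "checkpoint-steps": 7000,
    "gradient-clip-val": 1.0,
    "accumulate-grad-batches": 1,
    "gpus-num": 1,
    "nodes-num": 1,
    "checkpoint-path": "/path/to/vprochart_pretrained/",
}


def build_command(args, script="finetune_chartqa.py"):
    """Turn the flag dictionary into an argument list suitable for subprocess.run."""
    cmd = ["python", script]
    for flag, value in args.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


def effective_batch_size(args):
    """Samples per optimizer step, ASSUMING a per-device batch size (not confirmed by the README)."""
    return (
        args["batch-size"]
        * args["accumulate-grad-batches"]
        * args["gpus-num"]
        * args["nodes-num"]
    )


if __name__ == "__main__":
    print(shlex.join(build_command(ARGS)))
    ebs = effective_batch_size(ARGS)
    # Under the same assumption, total samples seen is steps times effective batch size.
    print(f"effective batch size: {ebs}, samples over the run: {ebs * ARGS['max-steps']}")
```

With the example values the effective batch size is 8, so 40000 steps would cover about 320000 training samples under that assumption. Raising `--gpus-num` or `--accumulate-grad-batches` scales both numbers proportionally.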
📖 Citation
If you find VProChart useful in your research, please cite:
@misc{huang2024vprochartansweringchartquestion,
title = {VProChart: Answering Chart Question through Visual Perception Alignment Agent and Programmatic Solution Reasoning},
author = {Muye Huang and Lingling Zhang and Lai Han and Wenjun Wu and Xinyu Zhang and Jun Liu},
year = {2024},
eprint = {2409.01667},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2409.01667},
}
🔗 Resources
- Paper: VProChart: Answering Chart Question through Visual Perception Alignment Agent and Programmatic Solution Reasoning (https://arxiv.org/abs/2409.01667)
- ModelScope Hub: VProChart-VPAgent
🙏 Acknowledgements
This work is partially based on the UniChart project: vis-nlp/UniChart
