Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models
This repository is the official repository of the paper "Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models".
<img style="width:100%" src="https://asclepius-med.github.io/figs/black_bg_title_xs.png" />
Leaderboard
Please visit our leaderboard website.
Development set
To preserve the integrity and blindness of the evaluation benchmark for Med-MLLMs, we have partitioned Asclepius into development and test subsets. The development subset is entirely public, including the ground-truth answers for all of its questions. The test subset is only partially disclosed: its data samples are publicly available, but the ground-truth answers are withheld. To obtain performance on the test subset, participants must submit their predictions to the Asclepius server, which ensures an unbiased assessment of Med-MLLMs.
The development set can be found at https://drive.google.com/file/d/1bzCZ35s3F8BEVspklML1L71xaoq4TDKA/view?usp=sharing
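The exact file schema of the development set is not documented here. As a minimal sketch, assuming the archive unpacks to a JSON file containing a list of records with `id`, `question`, and `answer` fields (an assumed schema, not the official one), loading it and preparing a prediction file for submission might look like:

```python
import json


def load_dev_set(path):
    """Load the development set from a JSON file.

    The schema (a list of records with "id", "question", and "answer"
    keys) is an assumption for illustration, not the documented format.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def build_submission(records, predict):
    """Map each question id to a model prediction.

    `predict` is any callable taking a question string and returning an
    answer string; the submission layout (id -> answer) is assumed.
    """
    return {record["id"]: predict(record["question"]) for record in records}
```

For the test split, the same `build_submission` output could be serialized with `json.dump` and uploaded to the Asclepius server in whatever format the submission instructions specify.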
Citation
Please cite our paper!
```bibtex
@article{wang2024asclepius,
  title={Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models},
  author={Wang, Wenxuan and Su, Yihang and Huan, Jingyuan and Liu, Jie and Chen, Wenting and Zhang, Yudi and Li, Cheng-Yi and Chang, Kao-Jung and Xin, Xiaohan and Shen, Linlin and others},
  journal={arXiv preprint arXiv:2402.11217},
  year={2024}
}
```
Contact
If you have any questions, please feel free to contact us.
