CMB
CMB, A Comprehensive Medical Benchmark in Chinese
CMB Chinese-Medical-Benchmark
<center>

🌈 Update
- [2024.03.14] CMB was accepted to the NAACL 2024 Main Conference. We thank the academic community for its recognition.
- [2024.02.21] The answers to the CMB-Exam test have been updated, and some errors caused by omissions in version management have been fixed.
- [2024.01.08] To facilitate testing, we have released the answers to the CMB-Exam test.
- [2023.09.22] CMB is included in OpenCompass.
- [2023.08.21] Paper released.
- [2023.08.01] 🎉🎉🎉 CMB is published!🎉🎉🎉
🌐 Download Data
- (Recommended) Download the zip file and unzip:

      git clone "https://github.com/FreedomIntelligence/CMB.git" && cd CMB && unzip "./data/CMB.zip" -d "./data/" && rm "./data/CMB.zip"

- Or load our data from HuggingFace datasets as follows:

      from datasets import load_dataset

      # CMB-Exam datasets (multiple-choice and multiple-answer questions)
      exam_datasets = load_dataset('FreedomIntelligence/CMB','exam')
      # CMB-Clin datasets
      clin_datasets = load_dataset('FreedomIntelligence/CMB','clin')

- Or check out Baidu Cloud
🥇 Leaderboard
Please check the Leaderboard.
🥸 Dataset intro
Components
- CMB-Exam: Comprehensive multi-level assessment of medical knowledge
  - Structure: 6 major categories and 28 subcategories, View Catalog
  - CMB-test: 400 questions per subcategory, 11,200 questions in total
  - CMB-val: 280 questions with solutions and explanations; used as the source for CoT and few-shot examples
  - CMB-train: 269,359 questions for medical knowledge injection
- CMB-Clin: 74 cases of complex medical inquiries
CMB-Exam Item
{
"exam_type": "医师考试",
"exam_class": "执业医师",
"exam_subject": "口腔执业医师",
"question": "患者,男性,11岁。近2个月来时有低热(37~38℃),全身无明显症状。查体无明显阳性体征。X线检查发现右肺中部有一直径约0.8cm类圆形病灶,边缘稍模糊,肺门淋巴结肿大。此男孩可能患",
"answer": "D",
"question_type": "单项选择题",
"option": {
"A": "小叶型肺炎",
"B": "浸润性肺结核",
"C": "继发性肺结核",
"D": "原发性肺结核",
"E": "粟粒型肺结核"
}
},
- exam_type: major category
- exam_class: sub-category
- exam_subject: Specific departments or subdivisions of disciplines
- question_type: single-answer multiple-choice (单项选择题) or multiple-answer (多项选择题)
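As a rough illustration of how these fields fit together, the sketch below renders one item as a plain-text question and compares a predicted answer with the `answer` field. The helper names `format_exam_prompt` and `is_correct` are hypothetical, and the set-based comparison assumes that multiple-answer items store their answer as a string of option letters; this is not the official scorer.

```python
def format_exam_prompt(item):
    """Render a CMB-Exam item as a question followed by its lettered options."""
    lines = [f"[{item['question_type']}] {item['question']}"]
    for letter in sorted(item["option"]):
        lines.append(f"{letter}. {item['option'][letter]}")
    return "\n".join(lines)


def is_correct(item, predicted):
    """Exact match on the set of chosen letters (assumed answer format, e.g. "D" or "ABD")."""
    return set(predicted.strip().upper()) == set(item["answer"])


# The example item shown above, copied field by field.
item = {
    "exam_type": "医师考试",
    "exam_class": "执业医师",
    "exam_subject": "口腔执业医师",
    "question": "患者,男性,11岁。近2个月来时有低热(37~38℃),全身无明显症状。查体无明显阳性体征。X线检查发现右肺中部有一直径约0.8cm类圆形病灶,边缘稍模糊,肺门淋巴结肿大。此男孩可能患",
    "answer": "D",
    "question_type": "单项选择题",
    "option": {
        "A": "小叶型肺炎",
        "B": "浸润性肺结核",
        "C": "继发性肺结核",
        "D": "原发性肺结核",
        "E": "粟粒型肺结核",
    },
}

print(format_exam_prompt(item))
print(is_correct(item, "D"))
```

Comparing sets rather than raw strings means the order in which a model lists its chosen letters does not affect the result.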
CMB-Clin Item
{
"id": 0,
"title": "案例分析-腹外疝",
"description": "现病史\n(1)病史摘要\n 病人,男,49岁,3小时前解大便后出现右下腹疼痛,右下腹可触及一包块,既往体健。\n(2)主诉\n 右下腹痛并自扪及包块3小时。\n\n体格检查\n体温: T 37.8℃,P 101次/分,呼吸22次/分,BP 100/60mmHg,腹软,未见胃肠型蠕动波,肝脾肋下未及,于右侧腹股沟区可扪及一圆形肿块,约4cm×4cm大小,有压痛、界欠清,且肿块位于腹股沟韧带上内方。\n\n辅助检查\n(1)实验室检查\n 血常规:WBC 5.0×109/L,N 78%。\n 尿常规正常。\n(2)多普勒超声检查\n 沿腹股沟纵切可见一多层分布的混合回声区,宽窄不等,远端膨大,边界整齐,长约4~5cm。\n(3)腹部X线检查\n 可见阶梯状液气平。",
"QA_pairs": [
{
"question": "简述该病人的诊断及诊断依据。",
"solution": "诊断:嵌顿性腹股沟斜疝合并肠梗阻。\n诊断依据:\n①右下腹痛并自扪及包块3小时;\n②有腹胀、呕吐,类似肠梗阻表现;腹部平片可见阶梯状液平,考虑肠梗阻可能;腹部B超考虑,\n腹部包块内可能为肠管可能;\n③有轻度毒性反应或是中毒反应,如 T 37.8℃,P 101次/分,白细胞中性分类78%;\n④腹股沟区包块位于腹股沟韧带上内方。"
},
{
"question": "简述该病人的鉴别诊断。",
"solution": "(1)睾丸鞘膜积液:鞘膜积液所呈现的肿块完全局限在阴囊内,其上界可以清楚地摸到;用透光试验检查肿块,鞘膜积液多为透光(阳性),而疝块则不能透光。\n(2)交通性鞘膜积液:肿块的外形与睾丸鞘膜积液相似。于每日起床后或站立活动时肿块缓慢地出现并增大。平卧或睡觉后肿块逐渐缩小,挤压肿块,其体积也可逐渐缩小。透光试验为阳性。\n(3)精索鞘膜积液:肿块较小,在腹股沟管内,牵拉同侧睾丸可见肿块移动。\n(4)隐睾:腹股沟管内下降不全的睾丸可被误诊为斜疝或精索鞘膜积液。隐睾肿块较小,挤压时可出现特有的胀痛感觉。如患侧阴囊内睾丸缺如,则诊断更为明确。\n(5)急性肠梗阻:肠管被嵌顿的疝可伴发急性肠梗阻,但不应仅满足于肠梗阻的诊断而忽略疝的存在;尤其是病人比较肥胖或疝块较小时,更易发生这类问题而导致治疗上的错误。\n(6)此外,腹股沟区肿块还应与以下疾病鉴别:肿大的淋巴结、动(静)脉瘤、软组织肿瘤、脓肿、\n圆韧带囊肿、子宫内膜异位症等。"
},
{
"question": "简述该病人的治疗原则。",
"solution": "嵌顿性疝原则上需要紧急手术治疗,以防止疝内容物坏死并解除伴发的肠梗阻。术前应做好必要的准备,如有脱水和电解质紊乱,应迅速补液加以纠正。手术的关键在于正确判断疝内容物的活力,然后根据病情确定处理方法。在扩张或切开疝环、解除疝环压迫的前提下,凡肠管呈紫黑色,失去光泽和弹性,刺激后无蠕动和相应肠系膜内无动脉搏动者,即可判定为肠坏死。如肠管尚未坏死,则可将其送回腹腔,按一般易复性疝处理,即行疝囊高位结扎+疝修补术。如肠管确已坏死或一时不能肯定肠管是否已失去活力时,则应在病人全身情况允许的前提下,切除该段肠管并进行一期吻合。凡施行肠切除吻合术的病人,因手术区污染,在高位结扎疝囊后,一般不宜作疝修补术,以免因感染而致修补失败。"
}
]
},
- title: name of the disease
- description: patient information
- QA_pairs: a series of questions, each with its solution, based on the description
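One way to turn a CMB-Clin case into model inputs is to prepend the shared `description` to each question in `QA_pairs`. The sketch below does that; `build_clin_prompts` and the blank-line separator are assumptions made for illustration, not the layout used by the evaluation scripts, and the description and solutions are abbreviated with `...`.

```python
def build_clin_prompts(case):
    """Return one prompt per QA pair, each prefixed with the case description."""
    return [
        f"{case['description']}\n\n{pair['question']}"
        for pair in case["QA_pairs"]
    ]


# Abbreviated version of the case shown above.
case = {
    "id": 0,
    "title": "案例分析-腹外疝",
    "description": "现病史\n(1)病史摘要\n ...",
    "QA_pairs": [
        {"question": "简述该病人的诊断及诊断依据。", "solution": "..."},
        {"question": "简述该病人的鉴别诊断。", "solution": "..."},
    ],
}

for prompt in build_clin_prompts(case):
    print(prompt)
    print("-" * 20)
```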
ℹ️ How to evaluate and submit
Modify model configuration file
<details><summary>Click to expand</summary>configs/model_config.yaml:

    my_model:
        model_id: 'my_model'
        load:
            # # HuggingFace model weights
            config_dir: "path/to/full/model"
            # # load with Peft
            # llama_dir: "path/to/base"
            # lora_dir: "path/to/lora"
            device: 'cuda'      # only support cuda
            precision: 'fp16'
        # supports all parameters in transformers.GenerationConfig
        generation_config:
            max_new_tokens: 512
            min_new_tokens: 1
            do_sample: False

</details>
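Before launching a run, it can help to sanity-check a model entry once the YAML has been parsed into a dictionary. The sketch below is a hypothetical helper, not part of the repository; it assumes that `load` needs either `config_dir` or both `llama_dir` and `lora_dir`, and that `device` must be `'cuda'`, as the comments in the config suggest.

```python
def check_model_config(entry):
    """Return a list of human-readable problems found in one model entry."""
    problems = []
    if "model_id" not in entry:
        problems.append("model_id is missing")
    load = entry.get("load", {})
    has_full_weights = "config_dir" in load
    has_peft_weights = "llama_dir" in load and "lora_dir" in load
    if not (has_full_weights or has_peft_weights):
        problems.append("load needs config_dir, or both llama_dir and lora_dir")
    if load.get("device", "cuda") != "cuda":
        problems.append("device must be 'cuda'")
    return problems


# The entry from configs/model_config.yaml, written as the dict a YAML parser would produce.
my_model = {
    "model_id": "my_model",
    "load": {"config_dir": "path/to/full/model", "device": "cuda", "precision": "fp16"},
    "generation_config": {"max_new_tokens": 512, "min_new_tokens": 1, "do_sample": False},
}

print(check_model_config(my_model))
```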
Modify model worker
<details><summary>Click to expand</summary>In workers/mymodel.py:

- load model and tokenizer to cpu

      # imports needed at the top of workers/mymodel.py
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      def load_model_and_tokenizer(self, load_config):
          '''
          Params:
              load_config: the `load` key in `configs/model_config.yaml`
          Returns:
              model, tokenizer: both on cpu
          '''
          hf_model_config = {"pretrained_model_name_or_path": load_config['config_dir'], 'trust_remote_code': True, 'low_cpu_mem_usage': True}
          hf_tokenizer_config = {"pretrained_model_name_or_path": load_config['config_dir'], 'padding_side': 'left', 'trust_remote_code': True}
          precision = load_config.get('precision', 'fp16')
          device = load_config.get('device', 'cuda')

          if precision == 'fp16':
              hf_model_config.update({"torch_dtype": torch.float16})

          model = AutoModelForCausalLM.from_pretrained(**hf_model_config)
          tokenizer = AutoTokenizer.from_pretrained(**hf_tokenizer_config)

          model.eval()
          return model, tokenizer  # cpu

- system prompt

      @property
      def system_prompt(self):
          '''
          The prompt that is prepended to every input.
          '''
          return "你是一个人工智能助手。"

- instruction template

      @property
      def instruction_template(self):
          '''
          The template for instruction input. An '{instruction}' placeholder must be contained.
          '''
          return self.system_prompt + '问:{instruction}\n答:'

- instruction template with fewshot examples

      @property
      def instruction_template_with_fewshot(self):
          '''
          The template for instruction input with fewshot examples.
          There must be an '{instruction}' and a '{fewshot_examples}' placeholder in this template.
          '''
          return self.system_prompt + '{fewshot_examples}问:{instruction}\n答:'  # must contain the {instruction} and {fewshot_examples} placeholders

- template for each fewshot example

      @property
      def fewshot_template(self):
          '''
          The template for each fewshot example. Each fewshot example is concatenated and put in the `{fewshot_examples}` placeholder above.
          There must be a `{user}` and a `{gpt}` placeholder in this template.
          '''
          return "问:{user}\n答:{gpt}\n"  # must contain the {user} and {gpt} placeholders

</details>
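To see how these templates combine, the sketch below fills the fewshot template once per example, joins the results, and substitutes them into the instruction template. The constants mirror the strings returned by the worker properties; `build_fewshot_prompt` is a hypothetical helper, and the way the repository's own scripts assemble the final prompt may differ.

```python
SYSTEM_PROMPT = "你是一个人工智能助手。"
FEWSHOT_TEMPLATE = "问:{user}\n答:{gpt}\n"
INSTRUCTION_TEMPLATE_WITH_FEWSHOT = SYSTEM_PROMPT + "{fewshot_examples}问:{instruction}\n答:"


def build_fewshot_prompt(examples, instruction):
    """Fill FEWSHOT_TEMPLATE for each (user, gpt) pair, then fill the instruction template."""
    fewshot_examples = "".join(
        FEWSHOT_TEMPLATE.format(user=user, gpt=gpt) for user, gpt in examples
    )
    return INSTRUCTION_TEMPLATE_WITH_FEWSHOT.format(
        fewshot_examples=fewshot_examples, instruction=instruction
    )


# Placeholder question/answer strings, used only to show the layout.
examples = [("example question 1", "A"), ("example question 2", "C")]
print(build_fewshot_prompt(examples, "test question"))
```

Because each fewshot example ends with a newline, the test question starts on its own line directly after the last example.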
Modify /src/constants.py
<details><summary>Click to expand</summary>

    from workers.mymodel import MyModelWorker  # modify here

    id2worker_class = {
        "my_model": MyModelWorker,  # modify here
    }

</details>
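The mapping acts as a registry: the `model_id` passed on the command line selects which worker class to use. A minimal sketch of that lookup is shown below; the stand-in `MyModelWorker`, its constructor signature, and the `make_worker` helper are assumptions for illustration and are not taken from the repository.

```python
class MyModelWorker:
    """Stand-in for the worker class defined in workers/mymodel.py."""

    def __init__(self, model_id):
        self.model_id = model_id


id2worker_class = {
    "my_model": MyModelWorker,
}


def make_worker(model_id):
    """Look up the worker class registered for model_id and instantiate it."""
    try:
        worker_class = id2worker_class[model_id]
    except KeyError:
        known = ", ".join(sorted(id2worker_class))
        raise ValueError(
            f"unknown model_id {model_id!r}; registered ids: {known}"
        ) from None
    return worker_class(model_id)


worker = make_worker("my_model")
print(type(worker).__name__)
```

The key in `id2worker_class` has to match the `model_id` used in `configs/model_config.yaml`, otherwise the lookup fails.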
Generate fewshot examples (required if using fewshot)
<details><summary>Click to expand</summary>Modify generate_fewshot.sh:

    model_id="baichuan-13b-chat"
    n_shot=3
    test_path=data/CMB-Exam/CMB-test/CMB-test-choice-question-merge.json
    val_path=data/CMB-Exam/CMB-val/CMB-val-merge.json
    output_dir=data/fewshot

    # --use_cot: whether to use the CoT template
    python ./src/generate_fewshot.py \
        --use_cot \
        --n_shot=$n_shot \
        --model_id=$model_id \
        --output_dir=$output_dir \
        --val_path=$val_path \
        --test_path=$test_path

and run:

    bash generate_fewshot.sh

</details>
Modify the main script
<details><summary>Click to expand</summary>generate_answers.sh:
# # input file path
# data_path='data/CMB-Exam/CMB-test/CMB-test-choice-question-merge.json'
# data_path='data/CMB-Clin/CMB-Clin-qa.json'
task_name='Zero-test-cot'
port_id=27272
model_id="my_model" # the same as in `configs/model_config.yaml`
# --config_file: /path/to/accelerate_config
# ./src/generate_answers.py: main program
accelerate launch \
--gpu_ids='all' \
--main_process_port 12345 \
--config_file ./configs/accelerate_config.yaml \
./src/generate_answers.py \
--model_id=$model_id \
