CMB Chinese-Medical-Benchmark

</center>


<p align="center"> 📃 <a href="https://arxiv.org/abs/2308.08833" target="_blank">Paper</a> • 🌐 <a href="https://cmedbenchmark.llmzoo.com/#home" target="_blank">Website</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/CMB" target="_blank">HuggingFace</a> <br> <a href="https://github.com/FreedomIntelligence/CMB/blob/main/README_zh.md"> 中文</a> | <a href="https://github.com/FreedomIntelligence/CMB/blob/main/README.md"> English</a> </p>

🌈 Update

- [2024.03.14] CMB was accepted to the NAACL 2024 main conference. We thank the academic community for its recognition.
- [2024.02.21] The answers to the CMB-Exam test set have been updated, and some errors caused by version-management omissions have been fixed.
- [2024.01.08] To facilitate testing, we have released the answers to the CMB-Exam test set.
- [2023.09.22] CMB is included in OpenCompass.
- [2023.08.21] Paper released.
- [2023.08.01] 🎉🎉🎉 CMB is released! 🎉🎉🎉

🌐 Download Data

(Recommended) Clone the repository and unzip the data archive:

git clone "https://github.com/FreedomIntelligence/CMB.git" && cd CMB && unzip "./data/CMB.zip" -d "./data/" && rm "./data/CMB.zip"

Or load our data from HuggingFace datasets as follows:

from datasets import load_dataset
# CMB-Exam datasets (multiple-choice and multiple-answer questions)
exam_datasets = load_dataset('FreedomIntelligence/CMB', 'exam')
# CMB-Clin datasets
clin_datasets = load_dataset('FreedomIntelligence/CMB', 'clin')

Or download the data from Baidu Cloud.

🥇 Leaderboard

Please check the Leaderboard.

🥸 Dataset intro


Components

- CMB-Exam: comprehensive multi-level assessment of medical knowledge
    - Structure: 6 major categories and 28 subcategories (View Catalog)
    - CMB-test: 400 questions per subcategory, 11,200 questions in total
    - CMB-val: 280 questions with solutions and explanations; used as the source for CoT and few-shot examples
    - CMB-train: 269,359 questions for medical knowledge injection
- CMB-Clin: 74 cases of complex medical inquiries

CMB-Exam Item

{
    "exam_type": "医师考试",
    "exam_class": "执业医师",
    "exam_subject": "口腔执业医师",
    "question": "患者，男性，11岁。近2个月来时有低热（37～38℃），全身无明显症状。查体无明显阳性体征。X线检查发现右肺中部有一直径约0.8cm类圆形病灶，边缘稍模糊，肺门淋巴结肿大。此男孩可能患",
    "answer": "D",
    "question_type": "单项选择题",
    "option": {
        "A": "小叶型肺炎",
        "B": "浸润性肺结核",
        "C": "继发性肺结核",
        "D": "原发性肺结核",
        "E": "粟粒型肺结核"
    }
},

- exam_type: major category
- exam_class: subcategory
- exam_subject: specific department or discipline subdivision
- question_type: single-answer multiple-choice (单项选择题) or multiple-answer (多项选择题)
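As a rough illustration of how these fields might be consumed, the sketch below renders an item as a text prompt and scores a model response by exact match on the option letters. The helper names (`build_prompt`, `extract_choices`, `is_correct`) and the prompt layout are hypothetical and not part of the CMB codebase. It also assumes that multiple-answer items store their answer as a string of letters such as "ABD", and that any letter A-E appearing in the response counts as a selected option, which is too naive for long CoT outputs.

```python
import re

# Sample item based on the CMB-Exam example above (question shortened for brevity).
item = {
    "exam_type": "医师考试",
    "exam_class": "执业医师",
    "exam_subject": "口腔执业医师",
    "question": "此男孩可能患",
    "answer": "D",
    "question_type": "单项选择题",
    "option": {
        "A": "小叶型肺炎",
        "B": "浸润性肺结核",
        "C": "继发性肺结核",
        "D": "原发性肺结核",
        "E": "粟粒型肺结核",
    },
}


def build_prompt(item):
    """Render the question type, question, and options as plain text (hypothetical layout)."""
    lines = [f"({item['question_type']}) {item['question']}"]
    lines += [f"{key}. {text}" for key, text in sorted(item["option"].items())]
    return "\n".join(lines)


def extract_choices(response):
    """Collect the distinct option letters A-E found in a response, sorted."""
    return "".join(sorted(set(re.findall(r"[A-E]", response))))


def is_correct(item, response):
    """Order-insensitive exact match between predicted and reference letters."""
    return extract_choices(response) == "".join(sorted(item["answer"]))


print(build_prompt(item))
print(is_correct(item, "答案是D"))
```

Exact match is a strict choice for multiple-answer questions: a response that selects only part of the reference set is scored as wrong.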

CMB-Clin Item

{
    "id": 0,
    "title": "案例分析-腹外疝",
    "description": "现病史\n（1）病史摘要\n     病人，男，49岁，3小时前解大便后出现右下腹疼痛，右下腹可触及一包块，既往体健。\n（2）主诉\n     右下腹痛并自扪及包块3小时。\n\n体格检查\n体温： T 37.8℃，P 101次／分，呼吸22次/分，BP 100/60mmHg，腹软，未见胃肠型蠕动波，肝脾肋下未及，于右侧腹股沟区可扪及一圆形肿块，约4cm×4cm大小，有压痛、界欠清，且肿块位于腹股沟韧带上内方。\n\n辅助检查\n（1）实验室检查\n     血常规：WBC 5.0×109／L，N 78％。\n     尿常规正常。\n（2）多普勒超声检查\n     沿腹股沟纵切可见一多层分布的混合回声区，宽窄不等，远端膨大，边界整齐，长约4～5cm。\n（3）腹部X线检查\n     可见阶梯状液气平。",
    "QA_pairs": [
        {
            "question": "简述该病人的诊断及诊断依据。",
            "solution": "诊断：嵌顿性腹股沟斜疝合并肠梗阻。\n诊断依据：\n①右下腹痛并自扪及包块3小时；\n②有腹胀、呕吐，类似肠梗阻表现；腹部平片可见阶梯状液平，考虑肠梗阻可能；腹部B超考虑，\n腹部包块内可能为肠管可能；\n③有轻度毒性反应或是中毒反应，如 T 37.8℃，P 101次／分，白细胞中性分类78％；\n④腹股沟区包块位于腹股沟韧带上内方。"
        },
        {
            "question": "简述该病人的鉴别诊断。",
            "solution": "（1）睾丸鞘膜积液：鞘膜积液所呈现的肿块完全局限在阴囊内，其上界可以清楚地摸到；用透光试验检查肿块，鞘膜积液多为透光（阳性），而疝块则不能透光。\n（2）交通性鞘膜积液：肿块的外形与睾丸鞘膜积液相似。于每日起床后或站立活动时肿块缓慢地出现并增大。平卧或睡觉后肿块逐渐缩小，挤压肿块，其体积也可逐渐缩小。透光试验为阳性。\n（3）精索鞘膜积液：肿块较小，在腹股沟管内，牵拉同侧睾丸可见肿块移动。\n（4）隐睾：腹股沟管内下降不全的睾丸可被误诊为斜疝或精索鞘膜积液。隐睾肿块较小，挤压时可出现特有的胀痛感觉。如患侧阴囊内睾丸缺如，则诊断更为明确。\n（5）急性肠梗阻：肠管被嵌顿的疝可伴发急性肠梗阻，但不应仅满足于肠梗阻的诊断而忽略疝的存在；尤其是病人比较肥胖或疝块较小时，更易发生这类问题而导致治疗上的错误。\n（6）此外，腹股沟区肿块还应与以下疾病鉴别:肿大的淋巴结、动（静）脉瘤、软组织肿瘤、脓肿、\n圆韧带囊肿、子宫内膜异位症等。"
        },
        {
            "question": "简述该病人的治疗原则。",
            "solution": "嵌顿性疝原则上需要紧急手术治疗，以防止疝内容物坏死并解除伴发的肠梗阻。术前应做好必要的准备，如有脱水和电解质紊乱，应迅速补液加以纠正。手术的关键在于正确判断疝内容物的活力，然后根据病情确定处理方法。在扩张或切开疝环、解除疝环压迫的前提下，凡肠管呈紫黑色，失去光泽和弹性，刺激后无蠕动和相应肠系膜内无动脉搏动者，即可判定为肠坏死。如肠管尚未坏死，则可将其送回腹腔，按一般易复性疝处理，即行疝囊高位结扎+疝修补术。如肠管确已坏死或一时不能肯定肠管是否已失去活力时，则应在病人全身情况允许的前提下，切除该段肠管并进行一期吻合。凡施行肠切除吻合术的病人，因手术区污染，在高位结扎疝囊后，一般不宜作疝修补术，以免因感染而致修补失败。"
        }
    ]
},

- title: name of the disease
- description: patient information
- QA_pairs: a series of questions and their solutions based on the description
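One way to turn a CMB-Clin case into evaluation samples is sketched below. It assumes each question is asked independently with the case description prepended; the benchmark's own scripts may instead present the questions as a multi-turn conversation. The function name `iter_clin_samples` and the `"<case id>-<question index>"` identifier format are hypothetical.

```python
# Abbreviated case with the same keys as the CMB-Clin example above.
case = {
    "id": 0,
    "title": "案例分析-腹外疝",
    "description": "病人，男，49岁，3小时前解大便后出现右下腹疼痛，右下腹可触及一包块，既往体健。",
    "QA_pairs": [
        {"question": "简述该病人的诊断及诊断依据。", "solution": "诊断：嵌顿性腹股沟斜疝合并肠梗阻。"},
        {"question": "简述该病人的治疗原则。", "solution": "嵌顿性疝原则上需要紧急手术治疗。"},
    ],
}


def iter_clin_samples(case):
    """Yield one (sample_id, prompt, reference_solution) triple per QA pair."""
    for idx, qa in enumerate(case["QA_pairs"]):
        # Prepend the shared patient description to every question.
        prompt = f"{case['description']}\n\n{qa['question']}"
        yield f"{case['id']}-{idx}", prompt, qa["solution"]


samples = list(iter_clin_samples(case))
for sample_id, prompt, reference in samples:
    print(sample_id, len(prompt), reference[:10])
```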

ℹ️ How to evaluate and submit

Modify model configuration file

<details><summary>Click to expand</summary>

configs/model_config.yaml:

my_model:
    model_id: 'my_model'
    load:
        # HuggingFace model weights
        config_dir: "path/to/full/model"

        # load with Peft
        # llama_dir: "path/to/base"
        # lora_dir: "path/to/lora"

        device: 'cuda'          # only cuda is supported
        precision: 'fp16'

    # supports all parameters in transformers.GenerationConfig
    generation_config:
        max_new_tokens: 512
        min_new_tokens: 1
        do_sample: False

</details>
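The config comments suggest two loading modes: full HuggingFace weights via `config_dir`, or a base model plus LoRA weights via `llama_dir` and `lora_dir`. A small sanity check along those lines could look like the sketch below. The function `check_load_config` is hypothetical, and the rule that `config_dir` takes precedence when both modes are present is an assumption rather than documented behavior.

```python
def check_load_config(load_config):
    """Return the loading mode a `load` section selects: 'full' or 'peft'."""
    if "config_dir" in load_config:
        return "full"  # full HuggingFace model weights
    if "llama_dir" in load_config and "lora_dir" in load_config:
        return "peft"  # base model plus LoRA adapter
    raise ValueError("`load` needs config_dir, or both llama_dir and lora_dir")


# Plain-dict mirror of the YAML entry shown above.
model_config = {
    "my_model": {
        "model_id": "my_model",
        "load": {"config_dir": "path/to/full/model", "device": "cuda", "precision": "fp16"},
        "generation_config": {"max_new_tokens": 512, "min_new_tokens": 1, "do_sample": False},
    }
}

mode = check_load_config(model_config["my_model"]["load"])
print(mode)
```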

Modify model worker

<details><summary>Click to expand</summary>

In workers/mymodel.py:

load model and tokenizer to cpu

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model_and_tokenizer(self, load_config):
    '''
    Params:
        load_config: the `load` key in `configs/model_config.yaml`
    Returns:
        model, tokenizer: both on cpu
    '''
    hf_model_config = {"pretrained_model_name_or_path": load_config['config_dir'], 'trust_remote_code': True, 'low_cpu_mem_usage': True}
    hf_tokenizer_config = {"pretrained_model_name_or_path": load_config['config_dir'], 'padding_side': 'left', 'trust_remote_code': True}
    precision = load_config.get('precision', 'fp16')
    device = load_config.get('device', 'cuda')  # not used in this snippet; the model is returned on cpu

    # load half-precision weights when requested
    if precision == 'fp16':
        hf_model_config.update({"torch_dtype": torch.float16})

    model = AutoModelForCausalLM.from_pretrained(**hf_model_config)
    tokenizer = AutoTokenizer.from_pretrained(**hf_tokenizer_config)

    model.eval()  # inference mode
    return model, tokenizer # cpu

system prompt

@property
def system_prompt(self):
    '''
    The prompt that is prepended to every input.
    '''
    return "你是一个人工智能助手。"

instruction template

@property
def instruction_template(self):
    '''
    The template for instruction input. An '{instruction}' placeholder must be contained.
    '''
    return self.system_prompt + '问：{instruction}\n答：'

instruction template with fewshot examples

@property
def instruction_template_with_fewshot(self):
    '''
    The template for instruction input with fewshot examples. There must be an '{instruction}' and a '{fewshot_examples}' placeholder in this template.
    '''
    return self.system_prompt + '{fewshot_examples}问：{instruction}\n答：'  # must contain the {instruction} and {fewshot_examples} placeholders

template for each fewshot example

@property
def fewshot_template(self):
    '''
    The template for each fewshot example. Each fewshot example is concatenated and put in the `{fewshot_examples}` placeholder above.
    There must be a `{user}` and `{gpt}` placeholder in this template.
    '''
    return "问：{user}\n答：{gpt}\n"  # must contain the {user} and {gpt} placeholders

</details>
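To see how the templates above fit together, the following sketch composes a few-shot prompt with plain `str.format`. The template strings are copied from the worker example; the function `build_fewshot_prompt` and the placeholder question/answer strings are made up for illustration, and the actual composition inside the evaluation scripts may differ in detail.

```python
SYSTEM_PROMPT = "你是一个人工智能助手。"
INSTRUCTION_TEMPLATE_WITH_FEWSHOT = SYSTEM_PROMPT + "{fewshot_examples}问：{instruction}\n答："
FEWSHOT_TEMPLATE = "问：{user}\n答：{gpt}\n"


def build_fewshot_prompt(examples, instruction):
    """Format each (user, gpt) example, concatenate them, and fill both placeholders."""
    fewshot_examples = "".join(
        FEWSHOT_TEMPLATE.format(user=user, gpt=gpt) for user, gpt in examples
    )
    return INSTRUCTION_TEMPLATE_WITH_FEWSHOT.format(
        fewshot_examples=fewshot_examples, instruction=instruction
    )


prompt = build_fewshot_prompt([("Q1", "A1"), ("Q2", "A2")], "Q3")
print(prompt)
```

Because `str.format` only interprets braces in the template itself, curly braces inside the example or instruction text are passed through unchanged.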

Modify /src/constants.py

<details><summary>Click to expand</summary>

from workers.mymodel import MyModelWorker  # modify here
id2worker_class = {
    "my_model": MyModelWorker,  # modify here
}

</details>

Generate fewshot examples (required if using fewshot)

<details><summary>Click to expand</summary>

Modify generate_fewshot.sh:

model_id="baichuan-13b-chat"
n_shot=3

test_path=data/CMB-Exam/CMB-test/CMB-test-choice-question-merge.json 
val_path=data/CMB-Exam/CMB-val/CMB-val-merge.json
output_dir=data/fewshot
# --use_cot: whether to use the CoT template
python ./src/generate_fewshot.py \
--use_cot \
--n_shot=$n_shot \
--model_id=$model_id \
--output_dir=$output_dir \
--val_path=$val_path \
--test_path=$test_path

and run:

bash generate_fewshot.sh

</details>

Modify the main script

<details><summary>Click to expand</summary>

generate_answers.sh:

# input file path
# data_path='data/CMB-Exam/CMB-test/CMB-test-choice-question-merge.json'
# data_path='data/CMB-Clin/CMB-Clin-qa.json'

task_name='Zero-test-cot'
port_id=27272

model_id="my_model"    # the same as in `configs/model_config.yaml`

# --config_file: /path/to/accelerate_config
# ./src/generate_answers.py: main program
accelerate launch \
    --gpu_ids='all' \
    --main_process_port 12345 \
    --config_file ./configs/accelerate_config.yaml \
    ./src/generate_answers.py \
    --model_id=$model_id \
