Prompts4Keras
Prompt-learning methods (PET, EFL and NSP-BERT) implemented with BERT4Keras, for both Chinese and English.
Run the experiments in our paper "NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction".
Overview
We developed this repository to compare NSP-BERT with two baseline prompt-learning methods (based on MLM and NLI) in both Chinese and English, and to make the experiments easy to run on the BERT4Keras framework, in particular by transferring the original English RoBERTa checkpoints to BERT4Keras.
Target
Mainly for text classification tasks in zero-shot and few-shot learning scenarios.
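As a rough illustration of the zero-shot setting, NSP-style prompt classification pairs the input text with a short prompt for each candidate label, scores how well each prompt "follows" the text, and picks the best-scoring label. The sketch below is purely illustrative: `score_nsp` is a toy stand-in for a real BERT NSP head, and the label prompts and keywords are made up so the example stays self-contained.

```python
# Zero-shot text classification via next-sentence scoring (toy sketch).
# score_nsp is a stand-in for a real BERT NSP head; here it just counts
# keyword overlap so the example is deterministic and self-contained.

LABEL_PROMPTS = {          # hypothetical verbalizers for each label
    "sports": "This text is about sports.",
    "tech": "This text is about technology.",
}

KEYWORDS = {               # hypothetical keyword sets standing in for a model
    "sports": {"game", "team", "score"},
    "tech": {"software", "chip", "code"},
}

def score_nsp(text: str, label: str) -> float:
    """Toy 'is-next-sentence' score: keyword overlap with the text."""
    words = set(text.lower().split())
    return len(words & KEYWORDS[label])

def classify(text: str) -> str:
    """Pick the label whose prompt best 'follows' the text."""
    return max(LABEL_PROMPTS, key=lambda label: score_nsp(text, label))

print(classify("the team won the game with a late score"))  # sports
```

In the real method the scorer is the pre-trained NSP head, so no task-specific training is needed in the zero-shot case.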
Supported Methods
Supported Models
- BERT for both English and Chinese, and BERT-like Chinese RoBERTa models, such as vanilla BERT, HFL Chinese-BERT-wwm, and UER Chinese-BERT.
- English RoBERTa released by Fairseq.
NOTE: We need to use the scripts in ./tools/... to convert the PyTorch models to the TensorFlow format we use.
Environments
Unlike the baselines, this repository is built entirely on the BERT4Keras framework, which is based on TensorFlow.
Since it needs to run on graphics cards with the Ampere architecture (such as the A100 or RTX 3090), we need to install the NVIDIA build of TensorFlow.
```
bert4keras==0.10.8
fairseq==0.10.2
keras==2.6.0
nvidia_tensorflow==1.15.4+nv20.11
scikit_learn==1.0.2
scipy==1.3.1
torch==1.7.0
transformers==4.12.3
```
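With these pins, the environment can be set up with pip. This is a sketch under the assumption that NVIDIA's PyPI index is used for the nvidia-tensorflow build; the exact steps may differ depending on your CUDA and driver setup, so follow NVIDIA's published instructions.

```shell
# Hypothetical setup sketch; the nvidia-pyindex step follows NVIDIA's
# published instructions and may vary with your CUDA/driver versions.
pip install nvidia-pyindex                    # enables the NVIDIA package index
pip install nvidia-tensorflow==1.15.4+nv20.11
pip install bert4keras==0.10.8 keras==2.6.0 fairseq==0.10.2 \
            scikit-learn==1.0.2 scipy==1.3.1 torch==1.7.0 transformers==4.12.3
```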
Datasets
The FewCLUE datasets can be downloaded here.
The English datasets should be downloaded here and processed with the script generate_k_shot_data.py.
Yahoo! and AGNews should be processed with the script, too.
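A typical invocation of the k-shot sampler might look like the following. The flag names here are an assumption, not the script's documented interface; check its --help output for the actual arguments.

```shell
# Assumed flags; verify with `python generate_k_shot_data.py --help`.
# --k: examples per class; --data_dir: downloaded full datasets;
# --output_dir: where the few-shot splits are written.
python generate_k_shot_data.py --k 16 --data_dir data/original --output_dir data/k-shot
```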
Reproduce experiments
- Download the models
  - For all English tasks we use vanilla BERT-Large, cased.
  - For all Chinese tasks we use UER-BERT-Base (MixedCorpus+BertEncoder(base)+BertTarget).
  - For PET, we can choose English RoBERTa-Large released by Fairseq, or English RoBERTa-Large wiki+books.
  - For EFL, we need a model trained on an NLI dataset, such as English RoBERTa-Large-MNLI, or Chinese-BERT-base-OCNLI (which we trained ourselves on OCNLI with ./tools/cls_nli_bert.py).
- Convert the PyTorch models to TensorFlow using convert_fairseq_roberta_to_tf.py and convert_bert_from_uer_to_original_tf.py.
- Use run_experiment.sh or run_nsp_bert.sh and the other scripts to reproduce our experiments. For each few-shot learning task, we split the training set and dev set according to 5 random seeds, and run the experiments separately.
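The model-conversion step above can be sketched as follows. The argument names are assumptions for illustration, not the scripts' actual interfaces; check each script's source before running.

```shell
# Hypothetical arguments; check each script's source for the real interface.
# Convert Fairseq's English RoBERTa checkpoint to a TF checkpoint:
python ./tools/convert_fairseq_roberta_to_tf.py \
    --roberta_path ./models/roberta.large \
    --output_path ./models/roberta_large_tf
# Convert a UER PyTorch BERT checkpoint to the original TF format:
python ./tools/convert_bert_from_uer_to_original_tf.py \
    --input_path ./models/uer_bert_base.bin \
    --output_path ./models/uer_bert_base_tf
```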
- English tasks
dataset_name: SST-2, MR, CR, Subj, MPQA, Yahoo!, AGNews.
Models Pre-trained by Ourselves
BERT-Large-Mix5-5M
Link: https://share.weiyun.com/MXroU3g1 (code: 2ebdf4)
https://huggingface.co/sunyilgdx/mixr/tree/main
Scripts
```shell
for i in 1 2 3 4 5
do
  python ./nsp_bert/nsp_classification.py \
    --method few-shot \
    --n_th_set $i \
    --device 0 \
    --dataset_name SST-2 \
    --batch_size 8 \
    --learning_rate 2e-5 \
    --loss_function BCE \
    --model_name bert_large
done
```
- Chinese tasks
dataset_name: EPRSTMT, TNEWS, CSLDCP, IFLYTEK.
```shell
for i in 1 2 3 4 5
do
  python ./nsp_bert/nsp_classification.py \
    --method few-shot \
    --n_th_set $i \
    --device 0 \
    --dataset_name EPRSTMT \
    --batch_size 8 \
    --learning_rate 1e-5 \
    --loss_function BCE \
    --model_name chinese_bert_base
done
```
Citation
```
@inproceedings{sun-etal-2022-nsp,
    title = "{NSP}-{BERT}: A Prompt-based Few-Shot Learner through an Original Pre-training Task {---}{---} Next Sentence Prediction",
    author = "Sun, Yi and
      Zheng, Yu and
      Hao, Chao and
      Qiu, Hangping",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics",
    url = "https://aclanthology.org/2022.coling-1.286",
    pages = "3233--3250"
}
```
