# SumQE
This is the source code for SUM-QE, a BERT-based Summary Quality Estimation Model. If you use the code for your research, please cite the following paper.
Stratos Xenouleas, Prodromos Malakasiotis, Marianna Apidianaki and Ion Androutsopoulos (2019), SUM-QE: a BERT-based Summary Quality Estimation Model. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), November 3-7, Hong Kong.
```bibtex
@inproceedings{Xenouleas:EMNLP-IJCNLP19,
  author = {Xenouleas, Stratos and Malakasiotis, Prodromos and Apidianaki, Marianna and Androutsopoulos, Ion},
  title = {{SUM-QE: a BERT-based Summary Quality Estimation Model}},
  fullbooktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019)},
  booktitle = {EMNLP-IJCNLP 2019},
  year = {2019},
  address = {Hong Kong}
}
```
A preprint of the paper is available on arXiv.
## Demo

A simple example (similar to `src/examples.py`) of how to load and run the trained models can be found in this Google Colab notebook.
## Environment

In our experiments we used Anaconda and python=3.6. You can set up the environment with the following steps:
- Create a conda environment and activate it:

  ```
  conda create -p SUM_QE_env python=3.6 --no-default-packages
  conda activate SUM_QE_env/
  ```

- Install tensorflow=1.12:

  ```
  conda install tensorflow-gpu==1.12
  ```

- Install the remaining requirements:

  ```
  pip install -r requirements.txt
  ```

- Add `SumQE/` to the path of the environment:

  ```
  conda develop SumQE/
  ```

- Download the nltk punkt sentence tokenizer:

  ```
  python -m nltk.downloader punkt
  ```

- Download the binary file containing the pretrained GloVe embeddings from here and move it to the `SumQE/input` directory.

- In order to calculate the ROUGE and BLEU scores, clone this repository and add it to the environment:

  ```
  cd py-rouge
  pip install .
  ```

- Copy by hand the file `py-rouge/rouge/smart_common_words.txt` into the directory `SUM_QE_env/lib/python3.6/site-packages/rouge`.
## Datasets

We used the datasets from the DUC-05, DUC-06 and DUC-07 shared tasks. To ease processing, we constructed a json file for each year (`duc_year.json`) with all the necessary information organized. In particular, each file contains the peer (system) and model (human) summaries, followed by the human scores (the scores given by the annotators) and the automatically calculated ROUGE and BLEU scores.
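The exact schema of the `duc_year.json` files is not shown here, so the layout below is only a guess based on the description above; all field names are hypothetical. The sketch illustrates how such a record might be read once loaded with `json.load`:

```python
import json

# Hypothetical layout of one duc_year.json record -- the real field
# names may differ; this only mirrors the organization described above.
record = {
    "peer_summaries": [
        {
            "summary": "System-produced summary text ...",
            "human_scores": {"Q1": 4, "Q2": 3, "Q3": 4, "Q4": 5, "Q5": 3},
            "rouge_scores": {"rouge-2": 0.081},
            "bleu_scores": {"bleu-4": 0.052},
        }
    ],
    "model_summaries": [
        {"summary": "Human-written reference summary ..."}
    ],
}

def mean_quality(entry, question):
    """Average one linguistic-quality score (Q1-Q5) over the peer summaries."""
    scores = [p["human_scores"][question] for p in entry["peer_summaries"]]
    return sum(scores) / len(scores)

print(mean_quality(record, "Q1"))  # → 4.0
```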
In order to construct your own datasets, you need to follow the structure shown below for each year. Make sure that your files are named accordingly.
```
project
|   ...
└── input
    └── DUC_2005
    |   |   linguistic_quality.table
    |   |   pyramid.txt
    |   |   Responsiveness.table
    |   |
    |   └── summaries
    |       |   D301.M.250.I.1
    |       |   ...
    └── DUC_2006
        ...
    └── DUC_2007
        ...
```
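Before running the dataset script, the layout above can be validated programmatically. A minimal sketch; `check_year_dir` is a hypothetical helper, not part of the repository:

```python
import os
import tempfile

# Required per-year inputs, mirroring the tree above (DUC_2005 shown;
# DUC_2006 / DUC_2007 follow the same pattern).
REQUIRED_FILES = ["linguistic_quality.table", "pyramid.txt", "Responsiveness.table"]

def check_year_dir(year_dir):
    """Return a list of required files/dirs missing from a DUC_<year> folder."""
    missing = [f for f in REQUIRED_FILES
               if not os.path.isfile(os.path.join(year_dir, f))]
    if not os.path.isdir(os.path.join(year_dir, "summaries")):
        missing.append("summaries/")
    return missing

# Build the structure in a scratch directory to demonstrate the check.
root = tempfile.mkdtemp()
year = os.path.join(root, "input", "DUC_2005")
os.makedirs(os.path.join(year, "summaries"))
for name in REQUIRED_FILES:
    open(os.path.join(year, name), "w").close()

print(check_year_dir(year))  # → []
```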
If you have created the above structure correctly, the next step is to construct the datasets with the following command:

```
python src/make_datasets/make_dataset.py
```
## Pre-trained Models

All the models used in the paper can be found here. For each dataset (DUC-05, DUC-06, DUC-07) there are 15 models available. An extra 15 models, trained on all three DUC datasets, were also added.
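The checkpoint filenames in the table below follow a regular pattern, so the archive.org download URLs can be assembled programmatically. A small sketch; the `model_url` helper is hypothetical, and only the combinations that actually appear in the table exist on archive.org:

```python
from urllib.parse import quote

BASE = "https://archive.org/download/sum-qe"

def model_url(year_tag, question, variant):
    """Build the archive.org URL for a pretrained checkpoint.

    year_tag: the DUC year tag in the filename ('2005', '2006', '2007'
    or 'all'); question: 'Q1'..'Q5'; variant: 'Single Task',
    'Multi Task-1' or 'Multi Task-5'.
    """
    name = f"BERT_DUC_{year_tag}_{question}_{variant}.h5"
    # quote() percent-encodes the space in the variant name as %20.
    return f"{BASE}/{quote(name)}"

print(model_url("2005", "Q1", "Single Task"))
# → https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Single%20Task.h5
```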
<table> <tr><th colspan="4" align="center"> Models </ th></tr> <tr><th colspan="4" align="center"> Train years </ th></tr> <tr><th align="center"> 2006 + 2007 </ th><th align="center"> 2005 + 2007 </ th><th align="center"> 2005 + 2006 </ th><th align="center"> 2005 + 2006 + 2007 </ th></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Single%20Task.h5"> BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Single%20Task.h5"> BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Single%20Task.h5">BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Single%20Task.h5">BERT_Q1 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Single%20Task.h5"> BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Single%20Task.h5"> BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q2_Single%20Task.h5">BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q2_Single%20Task.h5">BERT_Q2 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q3_Single%20Task.h5"> BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q3_Single%20Task.h5"> BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q3_Single%20Task.h5">BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q3_Single%20Task.h5">BERT_Q3 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q4_Single%20Task.h5"> BERT_Q4 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q4_Single%20Task.h5"> BERT_Q4 (Single Task)</a></td><td><a 
href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q4_Single%20Task.h5">BERT_Q4 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q4_Single%20Task.h5">BERT_Q4 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q5_Single%20Task.h5"> BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q5_Single%20Task.h5"> BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q5_Single%20Task.h5">BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q5_Single%20Task.h5">BERT_Q5 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Multi%20Task-1.h5"> BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Multi%20Task-1.h5"> BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Multi%20Task-1.h5">BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Multi%20Task-1.h5">BERT_Q1 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Multi%20Task-1.h5"> BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Multi%20Task-1.h5"> BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q2_Multi%20Task-1.h5">BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q2_Multi%20Task-1.h5">BERT_Q2 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q3_Multi%20Task-1.h5"> BERT_Q3 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q3_Multi%20Task-1.h5"> BERT_Q3 (Multi Task-1)</a></td><td><a 
href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q3_Multi%20Task-1.h5">BERT_Q3 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q3_Multi%20Task-1.h5">BERT_Q3 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q4_Multi%20Task-1.h5"> BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q4_Multi%20Task-1.h5"> BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q4_Multi%20Task-1.h5">BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q4_Multi%20Task-1.h5">BERT_Q4 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q5_Multi%20Task-1.h5"> BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q5_Multi%20Task-1.h5"> BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q5_Multi%20Task-1.h5">BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q5_Multi%20Task-1.h5">BERT_Q5 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Multi%20Task-5.h5"> BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Multi%20Task-5.h5"> BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Multi%20Task-5.h5">BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Multi%20Task-5.h5">BERT_Q1 (Multi Task-5) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Multi%20Task-5.h5"> BERT_Q2 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Multi%20Task-5.h5"> BERT_Q2 (Multi Task-5)</a></td><td><a href="https://archive.org/do