
SumQE

This is the source code for SUM-QE, a BERT-based Summary Quality Estimation Model. If you use the code in your research, please cite the following paper:

Stratos Xenouleas, Prodromos Malakasiotis, Marianna Apidianaki and Ion Androutsopoulos (2019), SUM-QE: a BERT-based Summary Quality Estimation Model. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), November 3-7, Hong Kong.


@inproceedings{Xenouleas:EMNLP-IJCNLP19,
               author        = {Xenouleas, Stratos and Malakasiotis, Prodromos and Apidianaki, Marianna and Androutsopoulos, Ion},
               title         = {{SUM-QE: a BERT-based Summary Quality Estimation Model}},
               fullbooktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019)},
               booktitle     = {EMNLP-IJCNLP 2019},
               year          = {2019},
               address       = {Hong Kong}
               }

A preprint of the paper is available on arXiv.

Demo

A simple example (similar to src/examples.py) of how to load and run the trained models can be found in this Google Colab notebook.

Environment

In our experiments we used Anaconda and python=3.6. You can set up the environment with the following steps:

  1. Create a conda environment and activate it:

     conda create -p SUM_QE_env python=3.6 --no-default-packages 
     
     conda activate SUM_QE_env/      
    
  2. Install tensorflow=1.12:

     conda install tensorflow-gpu==1.12
    
  3. Install the remaining requirements:

     pip install -r requirements.txt
     
    
  4. Add SumQE/ to the path of the environment:

     conda develop SumQE/
     
    
  5. Download nltk punkt sentence tokenizer:

     python -m nltk.downloader punkt
     
    
  6. Download the binary file containing the pretrained GloVe embeddings from here and move it to SumQE/input directory.

  7. In order to calculate the ROUGE and BLEU scores, clone this repository and add it to the environment:

     cd py-rouge
     pip install .
     
    
  8. Manually copy the file py-rouge/rouge/smart_common_words.txt into the directory SUM_QE_env/lib/python3.6/site-packages/rouge.
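The environment steps above can be sanity-checked with a small script (not part of the repository; the module names are assumptions based on the packages the steps install):

```python
import importlib.util

def missing_modules(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Modules the steps above should provide: tensorflow (step 2),
# nltk (steps 3 and 5), rouge from py-rouge (steps 7-8).
required = ["tensorflow", "nltk", "rouge"]

if __name__ == "__main__":
    missing = missing_modules(required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```

Running this inside the activated conda environment reports any package that failed to install.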

Datasets

We used the datasets from the DUC-05, DUC-06 and DUC-07 shared tasks. To ease processing, we constructed a JSON file for each year (duc_year.json) containing all the necessary information. In particular, each file contains the peer (system) and model (human) summaries, together with the human scores (the scores given by the annotators) and the automatically calculated ROUGE and BLEU scores.
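For illustration, a record in such a per-year JSON file might be read like this (a sketch only; the exact field names below are assumptions, not taken from the repository):

```python
import json

# Toy record mimicking the structure described above: peer/model summaries,
# human (annotator) scores for Q1-Q5, and automatic ROUGE/BLEU scores.
# All field names here are illustrative assumptions.
record = {
    "peer_summary": "system summary text ...",
    "model_summary": "human summary text ...",
    "human_scores": {"Q1": 4, "Q2": 3, "Q3": 5, "Q4": 4, "Q5": 3},
    "rouge": {"rouge-2": 0.12},
    "bleu": 0.08,
}

# Round-trip through JSON, the way the per-year files live on disk.
blob = json.dumps({"duc_2005": [record]})
data = json.loads(blob)
print(len(data["duc_2005"]), "entries")
```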

To construct your own datasets, follow the structure shown below for each year, and make sure that your files are named accordingly.

project
│   ...
└── input
    ├── DUC_2005
    │   ├── linguistic_quality.table
    │   ├── pyramid.txt
    │   ├── Responsiveness.table
    │   └── summaries
    │       ├── D301.M.250.I.1
    │       └── ...
    ├── DUC_2006
    │   └── ...
    └── DUC_2007
        └── ...

If you have created the above structure correctly, the next step is to construct the datasets with the following command:

python src/make_datasets/make_dataset.py
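Before running the script, it can help to verify the layout programmatically. A minimal check (not part of the repository) against the structure shown above:

```python
from pathlib import Path

# Files the structure above requires in each input/DUC_<year>/ directory.
EXPECTED = ["linguistic_quality.table", "pyramid.txt", "Responsiveness.table"]

def validate_duc_layout(root, years=("DUC_2005", "DUC_2006", "DUC_2007")):
    """Return a list of missing files/directories under root/input/<year>/."""
    missing = []
    for year in years:
        base = Path(root) / "input" / year
        for name in EXPECTED:
            if not (base / name).is_file():
                missing.append(str(base / name))
        if not (base / "summaries").is_dir():
            missing.append(str(base / "summaries"))
    return missing
```

An empty return value means the directory tree matches the expected structure for the given years.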

Pre-trained Models

All the models used in the paper can be found here. For each dataset (DUC-05, DUC-06, DUC-07) there are 15 models available. An additional 15 models, trained on all three DUC datasets, are also provided.

<table> <tr><th colspan="4" align="center"> Models </ th></tr> <tr><th colspan="4" align="center"> Train years </ th></tr> <tr><th align="center"> 2006 + 2007 </ th><th align="center"> 2005 + 2007 </ th><th align="center"> 2005 + 2006 </ th><th align="center"> 2005 + 2006 + 2007 </ th></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Single%20Task.h5"> BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Single%20Task.h5"> BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Single%20Task.h5">BERT_Q1 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Single%20Task.h5">BERT_Q1 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Single%20Task.h5"> BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Single%20Task.h5"> BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q2_Single%20Task.h5">BERT_Q2 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q2_Single%20Task.h5">BERT_Q2 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q3_Single%20Task.h5"> BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q3_Single%20Task.h5"> BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q3_Single%20Task.h5">BERT_Q3 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q3_Single%20Task.h5">BERT_Q3 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q4_Single%20Task.h5"> BERT_Q4 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q4_Single%20Task.h5"> BERT_Q4 (Single Task)</a></td><td><a 
href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q4_Single%20Task.h5">BERT_Q4 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q4_Single%20Task.h5">BERT_Q4 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q5_Single%20Task.h5"> BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q5_Single%20Task.h5"> BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q5_Single%20Task.h5">BERT_Q5 (Single Task)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q5_Single%20Task.h5">BERT_Q5 (Single Task) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Multi%20Task-1.h5"> BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Multi%20Task-1.h5"> BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Multi%20Task-1.h5">BERT_Q1 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Multi%20Task-1.h5">BERT_Q1 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Multi%20Task-1.h5"> BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Multi%20Task-1.h5"> BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q2_Multi%20Task-1.h5">BERT_Q2 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q2_Multi%20Task-1.h5">BERT_Q2 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q3_Multi%20Task-1.h5"> BERT_Q3 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q3_Multi%20Task-1.h5"> BERT_Q3 (Multi Task-1)</a></td><td><a 
href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q3_Multi%20Task-1.h5">BERT_Q3 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q3_Multi%20Task-1.h5">BERT_Q3 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q4_Multi%20Task-1.h5"> BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q4_Multi%20Task-1.h5"> BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q4_Multi%20Task-1.h5">BERT_Q4 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q4_Multi%20Task-1.h5">BERT_Q4 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q5_Multi%20Task-1.h5"> BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q5_Multi%20Task-1.h5"> BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q5_Multi%20Task-1.h5">BERT_Q5 (Multi Task-1)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q5_Multi%20Task-1.h5">BERT_Q5 (Multi Task-1) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q1_Multi%20Task-5.h5"> BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q1_Multi%20Task-5.h5"> BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2007_Q1_Multi%20Task-5.h5">BERT_Q1 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_all_Q1_Multi%20Task-5.h5">BERT_Q1 (Multi Task-5) </a></td></tr> <tr align="center"><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2005_Q2_Multi%20Task-5.h5"> BERT_Q2 (Multi Task-5)</a></td><td><a href="https://archive.org/download/sum-qe/BERT_DUC_2006_Q2_Multi%20Task-5.h5"> BERT_Q2 (Multi Task-5)</a></td><td><a href="https://archive.org/do