AMRBART

Code for our paper "Graph Pre-training for AMR Parsing and Generation" in ACL2022

Generate Convert Improve

Install / Use

/learn @goodbai-nlp/AMRBART

About this skill

Quality Score

0/100

README

AMRBART

The refactored implementation for ACL2022 paper "Graph Pre-training for AMR Parsing and Generation". You may find our paper here (Arxiv). The original implementation is avaliable here

News🎈

(2022/12/10) fix max_length bugs in AMR parsing and update results.
(2022/10/16) release the AMRBART-v2 model which is simpler, faster, and stronger.

Requirements

python 3.8
pytorch 1.8
transformers 4.21.3
datasets 2.4.0
Tesla V100 or A100

We recommend to use conda to manage virtual environments:

conda env update --name <env> --file requirements.yml

Data Processing

You may download the AMR corpora at LDC.

Please follow this respository to preprocess AMR graphs:

bash run-process-acl2022.sh

Usage

Our model is avaliable at huggingface. Here is how to initialize a AMR parsing model in PyTorch:

from transformers import BartForConditionalGeneration
from model_interface.tokenization_bart import AMRBartTokenizer      # We use our own tokenizer to process AMRs

model = BartForConditionalGeneration.from_pretrained("xfbai/AMRBART-large-finetuned-AMR3.0-AMRParsing-v2")
tokenizer = AMRBartTokenizer.from_pretrained("xfbai/AMRBART-large-finetuned-AMR3.0-AMRParsing-v2")

Pre-training

bash run-posttrain-bart-textinf-joint-denoising-6task-large-unified-V100.sh "facebook/bart-large"

Fine-tuning

For AMR Parsing, run

bash train-AMRBART-large-AMRParsing.sh "xfbai/AMRBART-large-v2"

For AMR-to-text Generation, run

bash train-AMRBART-large-AMR2Text.sh "xfbai/AMRBART-large-v2"

Evaluation

cd evaluation

For AMR Parsing, run

bash eval_smatch.sh /path/to/gold-amr /path/to/predicted-amr

For better results, you can postprocess the predicted AMRs using the BLINK tool following SPRING.

For AMR-to-text Generation, run

bash eval_gen.sh /path/to/gold-text /path/to/predicted-text

Inference on your own data

If you want to run our code on your own data, try to transform your data into the format here, then run

For AMR Parsing, run

bash inference_amr.sh "xfbai/AMRBART-large-finetuned-AMR3.0-AMRParsing-v2"

For AMR-to-text Generation, run

bash inference_text.sh "xfbai/AMRBART-large-finetuned-AMR3.0-AMR2Text-v2"

Pre-trained Models

Pre-trained AMRBART

|Setting| Params | checkpoint | | :----: | :----: |:---:| | AMRBART-large | 409M | model |

Fine-tuned models on AMR-to-Text Generation

|Setting| BLEU(JAMR_tok) | Sacre-BLEU | checkpoint | output | | :----: | :----: |:---:| :----: | :----: | | AMRBART-large (AMR2.0) | 50.76 | 50.44 | model | output | | AMRBART-large (AMR3.0) | 50.29 | 50.38 | model | output |

To get the tokenized bleu score, you need to use the scorer we provide here. We use this script in order to ensure comparability with previous approaches.

Fine-tuned models on AMR Parsing

|Setting| Smatch(amrlib) | Smatch(amr-evaluation) | Smatch++(smatchpp) | checkpoint | output | | :----: | :----: |:---: |:---:| :----: | :----: | | AMRBART-large (AMR2.0) | 85.5 | 85.3 | 85.4 | model | output | | AMRBART-large (AMR3.0) | 84.4 | 84.2 | 84.3 | model | output |

Acknowledgements

We thank authors of SPRING, amrlib, and BLINK that share open-source scripts for this project.

References

@inproceedings{bai-etal-2022-graph,
    title = "Graph Pre-training for {AMR} Parsing and Generation",
    author = "Bai, Xuefeng  and
      Chen, Yulong  and
      Zhang, Yue",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.415",
    pages = "6001--6015"
}

Related Skills

node-connect

353.3k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

111.7k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

353.3k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

353.3k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。