SkillAgentSearch skills...

Procnet

Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization

Install / Use

/learn @xnyuwg/Procnet
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization

Code for the paper: "Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization"

Introduction

Document-level multi-event extraction aims to extract the structural information from a given document automatically. Most recent approaches usually involve two steps: (1) modeling entity interactions; (2) decoding entity interactions into events. However, such approaches ignore a global view of inter-dependency of multiple events. Moreover, an event is decoded by iteratively merging its related entities as arguments, which might suffer from error propagation and is computationally inefficient. In this paper, we propose an alternative approach for document-level multi-event extraction with event proxy nodes and Hausdorff distance minimization. The event proxy nodes, representing pseudo-events, are able to build connections with other event proxy nodes, essentially capturing global information. The Hausdorff distance makes it possible to compare the similarity between the set of predicted events and the set of ground-truth events. By directly minimizing Hausdorff distance, the model is trained towards the global optimum directly, which improves performance and reduces training time. Experimental results show that our model outperforms previous state-of-the-art method in F1-score on two datasets with only a fraction of training time.

architecture

Dependency

torch
torch-geometric
transformers
numpy
tqdm

The metric code procnet/dee/dee_metric.py is from dee_metric.py of Doc2EDAG.

Dataset

Please unzip data.zip at Data/ as:

>> cd Data
>> unzip data.zip

After unzipping it, you should see:

├─ Data/
    ├── dev.json
    ├── test.json
    ├── train.json

data.zip is from Data.zip of Doc2EDAG.

Usage

To train the model:

>> bash run.sh

During training, brief results are printed and detailed results are saved at Result/ as:

├─ Result/
    ├── {name of this run}/
        ├── {name of this run}_001.json
        ├── {name of this run}_002.json
        ...
        ├── {name of this run}_{epoch}.json
        ...

Citation

If our work help you, please cite the paper as follows:

@inproceedings{wang-etal-2023-document,
    title = "Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization",
    author = "Wang, Xinyu  and
      Gui, Lin  and
      He, Yulan",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.563",
    pages = "10118--10133",
}
View on GitHub
GitHub Stars16
CategoryDevelopment
Updated1mo ago
Forks1

Languages

Python

Security Score

75/100

Audited on Feb 15, 2026

No findings