MHGRN

Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering (EMNLP 2020)


Multi-Hop Graph Relation Networks (EMNLP 2020)

This is the repo of our EMNLP'20 paper:

Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
Yanlin Feng*, Xinyue Chen*, Bill Yuchen Lin, Peifeng Wang, Jun Yan and Xiang Ren.
EMNLP 2020.
* = equal contribution

This repository also implements other graph encoding models for question answering (including vanilla LM finetuning).

RelationNet
R-GCN
KagNet
GConAttn
KVMem
MHGRN (or. MultiGRN)

Each model supports the following text encoders:

LSTM
GPT
BERT
XLNet
RoBERTa

Resources

We provide preprocessed ConceptNet and pretrained entity embeddings for your own use. These resources are independent of the source code.

Note that the following resources can be downloaded here.

ConceptNet (5.6.0)

| Description | Downloads | Notes |
| ---------------------------- | ------------------------- | ------------------------------------------------------------ |
| Entity Vocab | entity-vocab | one entity per line, space replaced by '_' |
| Relation Vocab | relation-vocab | one relation per line, merged |
| ConceptNet (CSV format) | conceptnet-5.6.0-csv | English tuples extracted from the full ConceptNet with merged relations |
| ConceptNet (NetworkX format) | conceptnet-5.6.0-networkx | NetworkX pickled format, pruned by filtering out stop words |
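
As a minimal sketch of how the vocabulary file might be consumed, assuming it is a plain-text file with one entity per line as the table describes. The file name `entity_vocab.txt` and both helper names are hypothetical, not part of this repository.

```python
from pathlib import Path


def load_entity_vocab(path):
    """Read one entity per line and return (id2entity, entity2id)."""
    with open(path, "r", encoding="utf-8") as f:
        id2entity = [line.strip() for line in f if line.strip()]
    entity2id = {ent: idx for idx, ent in enumerate(id2entity)}
    return id2entity, entity2id


def to_vocab_form(phrase):
    """Spaces are replaced by '_' in the vocab, e.g. 'ice cream' -> 'ice_cream'."""
    return phrase.strip().replace(" ", "_")


if __name__ == "__main__":
    # Hypothetical file name; point this at the downloaded entity vocab.
    vocab_path = Path("entity_vocab.txt")
    if vocab_path.exists():
        id2entity, entity2id = load_entity_vocab(vocab_path)
        print(len(id2entity), entity2id.get(to_vocab_form("ice cream")))
```

The line index doubles as the entity id here, which is an assumption about how the downstream files are indexed.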

Entity Embeddings (Node Features)

Entity embeddings are packed into a matrix of shape (#ent, dim) and stored in numpy format. Use np.load to read the file. You may need to download the vocabulary files first.

| Embedding Model | Dimensionality | Description | Downloads |
| --------------- | -------------- | --------------------------------------------------------- | ------------------- |
| TransE | 100 | Obtained using OpenKE with optim=sgd, lr=1e-3, epoch=1000 | entities, relations |
| NumberBatch | 300 | https://github.com/commonsense/conceptnet-numberbatch | entities |
| BERT-based | 1024 | Provided by Zhengwei | entities |
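
Reading an embedding file together with its vocabulary might look like the sketch below. It assumes that row i of the matrix corresponds to line i of the entity vocab, which the description above suggests but does not state outright. The helper names and file paths are hypothetical.

```python
import numpy as np


def load_entity_embeddings(emb_path, vocab_path):
    """Load a (#ent, dim) matrix and check that it lines up with the vocab."""
    emb = np.load(emb_path)  # expected shape: (#ent, dim)
    with open(vocab_path, "r", encoding="utf-8") as f:
        entities = [line.strip() for line in f if line.strip()]
    # Assumption: one embedding row per vocab line, in the same order.
    if emb.shape[0] != len(entities):
        raise ValueError(
            f"{emb.shape[0]} embedding rows vs {len(entities)} vocab entries"
        )
    return emb, {ent: i for i, ent in enumerate(entities)}


def embedding_of(entity, emb, entity2id):
    """Return the row for one entity (spaces already replaced by '_')."""
    return emb[entity2id[entity]]
```

The row-count check is there to catch a mismatched vocab/embedding pair early, before a wrong vector is silently looked up.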

Dependencies

Python >= 3.6
PyTorch == 1.1.0
transformers == 2.0.0
tqdm
dgl == 0.3.1 (GPU version)
networkx == 2.3

Run the following commands to create a conda environment (assuming CUDA 10):

conda create -n krqa python=3.6 numpy matplotlib ipython
source activate krqa
conda install pytorch=1.1.0 torchvision cudatoolkit=10.0 -c pytorch
pip install dgl-cu100==0.3.1
pip install transformers==2.0.0 tqdm networkx==2.3 nltk spacy==2.1.6
python -m spacy download en

Usage

1. Download Data

First, you need to download all the necessary data in order to train the model:

git clone https://github.com/INK-USC/MHGRN.git
cd MHGRN
bash scripts/download.sh

The script will:

Download the CommonsenseQA dataset
Download ConceptNet
Download pretrained TransE embeddings

2. Preprocess

To preprocess the data, run:

python preprocess.py

By default, all available CPU cores are used for multiprocessing to speed up preprocessing. Alternatively, use "-p" to specify the number of processes:

python preprocess.py -p 20
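
For readers unfamiliar with this kind of option, the sketch below shows one common way a worker-count flag that defaults to all CPU cores can be wired to a process pool. It is not taken from `preprocess.py`; the long option name `--nprocs` and the `square` task are hypothetical stand-ins.

```python
import argparse
from multiprocessing import Pool, cpu_count


def square(x):
    # Stand-in for a per-example preprocessing task.
    return x * x


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Defaults to every available core unless -p is given.
    parser.add_argument("-p", "--nprocs", type=int, default=cpu_count(),
                        help="number of worker processes")
    return parser.parse_args(argv)


def run(argv=None):
    args = parse_args(argv)
    # The pool distributes the items across args.nprocs worker processes.
    with Pool(args.nprocs) as pool:
        return pool.map(square, range(8))
```

Calling `run(["-p", "20"])` from a script's `if __name__ == "__main__":` block would use 20 workers, mirroring the command above.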

The script will:

Convert the original datasets into .jsonl files (stored in data/csqa/statement/)
Extract English relations from ConceptNet and merge the original 42 relation types into 17 types
Identify all mentioned concepts in the questions and answers
Extract subgraphs for each q-a pair
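
To make the concept-identification step concrete, here is a deliberately naive sketch: it matches word n-grams of a sentence against an entity vocabulary, joining words with '_' as in the vocab files. It is an illustration only; the repository's grounding code may use lemmatization, stop-word filtering, or other heuristics. The function name, the `max_len` parameter, and the toy vocabulary are hypothetical.

```python
def ground_concepts(sentence, vocab, max_len=3):
    """Return vocab entries that appear as word n-grams in the sentence."""
    words = sentence.lower().split()
    found = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            # Multi-word concepts are stored with '_' instead of spaces.
            candidate = "_".join(words[i:i + n])
            if candidate in vocab:
                found.add(candidate)
    return found


vocab = {"ice_cream", "freezer", "store"}
print(sorted(ground_concepts("Where would you store ice cream", vocab)))
# ['ice_cream', 'store']
```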

The preprocessing procedure takes approximately 3 hours on a 40-core CPU server. Most intermediate files are in .jsonl or .pk format and stored in various folders. The resulting file structure will look like:

.
├── README.md
└── data/
    ├── cpnet/                 (preprocessed ConceptNet)
    ├── glove/                 (pretrained GloVe embeddings)
    ├── transe/                (pretrained TransE embeddings)
    └── csqa/
        ├── train_rand_split.jsonl
        ├── dev_rand_split.jsonl
        ├── test_rand_split_no_answers.jsonl
        ├── statement/             (converted statements)
        ├── grounded/              (grounded entities)
        ├── paths/                 (unpruned/pruned paths)
        ├── graphs/                (extracted subgraphs)
        ├── ...
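
Since most intermediate files are .jsonl (one JSON object per line) or pickled .pk files, they can be inspected with the standard library. A minimal sketch follows; the helper names are hypothetical, and no assumption is made about which fields each record contains.

```python
import json
import pickle


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def read_pickle(path):
    """Load a pickled object; only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

For example, `next(read_jsonl("data/csqa/train_rand_split.jsonl")).keys()` would list the fields of the first training example once the data has been downloaded.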

3. Hyperparameter Search (optional)

To search the parameters for RoBERTa-Large on CommonsenseQA:

bash scripts/param_search_lm.sh csqa roberta-large

To search the parameters for BERT+RelationNet on CommonsenseQA:

bash scripts/param_search_rn.sh csqa bert-large-uncased

4. Training

Each graph encoding model is implemented in a single script:

| Graph Encoder | Script | Description |
| ---------------- | ----------- | ------------------------------------------------------------ |
| None | lm.py | w/o knowledge graph |
| Relation Network | rn.py | |
| R-GCN | rgcn.py | Use --gnn_layer_num and --num_basis to specify #layer and #basis |
| KagNet | kagnet.py | Adapted from https://github.com/INK-USC/KagNet, still tuning |
| Gcon-Attn | gconattn.py | |
| KV-Memory | kvmem.py | |
| MHGRN | grn.py | |

Some important command line arguments are listed as follows (run python {lm,rn,rgcn,...}.py -h for a complete list):

| Arg | Values | Description | Notes |
| ------------------- | ---------------------------------------------------------- | ----------------------- | ------------------------------------------------------------ |
| --mode | {train, eval, ...} | Training or Evaluation | default=train |
| -enc, --encoder | {lstm, openai-gpt, bert-large-uncased, roberta-large, ....} | Text Encoder | Model names (except for lstm) are the ones used by huggingface-transformers, default=bert-large-uncased |
| --optim | {adam, adamw, radam} | Optimizer | default=radam |
| -ds, --dataset | {csqa, obqa} | Dataset | default=csqa |
| -ih, --inhouse | {0, 1} | Run In-house Split | default=1, only applicable to CSQA |
| --ent_emb | {transe, numberbatch, tzw} | Entity Embeddings | default=tzw (BERT-based node features) |
| -sl, --max_seq_len | {32, 64, 128, 256} | Maximum Sequence Length | Use 128 or 256 for datasets that contain long sentences! default=64 |
| -elr, --encoder_lr | {1e-5, 2e-5, 3e-5, 6e-5, 1e-4} | Text Encoder LR | dataset specific and text encoder specific, default values in utils/parser_utils.py |
| -dlr, --decoder_lr | {1e-4, 3e-4, 1e-3,
