EAkit
Entity Alignment toolkit (EAkit), a lightweight, easy-to-use and highly extensible PyTorch implementation of many entity alignment algorithms. The algorithm list is from Entity_Alignment_Papers.
Design
We survey the existing entity alignment algorithms and modularize their components, then define an abstract structure of 1 Encoder - N Decoder(s). Each module is treated as a specific implementation of an encoder or a decoder, so that the structure of each algorithm can be reproduced.
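The 1 Encoder - N Decoder(s) structure can be illustrated with a minimal, dependency-free sketch. All class names below (`IdentityEncoder`, `ScaleEncoder`, `L1AlignDecoder`, `NormDecoder`, `Model`) are hypothetical and are not taken from models.py; combining the decoder losses by a plain sum is also an assumption made only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


class IdentityEncoder:
    """Stands in for the 'None' encoder: embeddings pass through unchanged."""

    def encode(self, embeddings: Sequence[Vector]) -> List[Vector]:
        return [list(v) for v in embeddings]


class ScaleEncoder:
    """Toy encoder (hypothetical): rescales every embedding by a constant."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def encode(self, embeddings: Sequence[Vector]) -> List[Vector]:
        return [[self.factor * x for x in v] for v in embeddings]


class L1AlignDecoder:
    """Toy decoder: L1 distance between the two entities of an aligned pair."""

    def loss(self, encoded: Sequence[Vector], pair: Tuple[int, int]) -> float:
        a, b = pair
        return sum(abs(x - y) for x, y in zip(encoded[a], encoded[b]))


class NormDecoder:
    """Toy decoder: squared norm of the first entity of the pair."""

    def loss(self, encoded: Sequence[Vector], pair: Tuple[int, int]) -> float:
        a, _ = pair
        return sum(x * x for x in encoded[a])


class Model:
    """One encoder whose output is shared by any number of decoders."""

    def __init__(self, encoder, decoders) -> None:
        self.encoder = encoder
        self.decoders = list(decoders)

    def total_loss(self, embeddings: Sequence[Vector], pair: Tuple[int, int]) -> float:
        encoded = self.encoder.encode(embeddings)  # encode once
        # Assumption: decoder losses are simply added together.
        return sum(d.loss(encoded, pair) for d in self.decoders)
```

Swapping `IdentityEncoder()` for `ScaleEncoder(2.0)`, or adding a second decoder to the list, changes the model without touching the other parts, which is the point of the decomposition.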

Organization
./EAkit
├── README.md # Doc of EAkit
├── _runs # Tensorboard log dir
├── data # Datasets. (unzip data.zip)
│ └── DBP15K
├── examples # Shell scripts of implemented algorithms
│ ├── Tensorboard.sh # Start Tensorboard visualization
│ ├── run_BootEA.sh
│ ├── run_ComplEx.sh
│ ├── run_ConvE.sh
│ ├── run_DistMult.sh
│ ├── run_GCN-Align.sh
│ ├── run_HAKE.sh
│ ├── run_KECG.sh
│ ├── run_MMEA.sh
│ ├── run_MTransE.sh
│ ├── run_NAEA.sh
│ ├── run_RotatE.sh
│ ├── run_TransE.sh
│ ├── run_TransEdge.sh
│ ├── run_TransH.sh
│ └── run_TransR.sh
├── load_data.py # Load datasets. (data adapter)
├── models.py # Encoders & Decoders
├── run.py # Main
├── semi_utils.py # Bootstrap strategy
└── utils.py # Sampling methods, ...
Usage
Run an implemented model
- Start TensorBoard for metrics visualization (run under examples/):
./Tensorboard.sh
- Modify and run a script as follows (examples are under examples/):
CUDA_VISIBLE_DEVICES=0 python3 run.py --log gcnalign \
--data_dir "data/DBP15K/zh_en" \
--rate 0.3 \
--epoch 1000 \
--check 10 \
--update 10 \
--train_batch_size -1 \
--encoder "GCN-Align" \
--hiddens "100,100,100" \
--decoder "Align" \
--sampling "N" \
--k "25" \
--margin "1" \
--alpha "1" \
--feat_drop 0.0 \
--lr 0.005 \
--train_dist "euclidean" \
--test_dist "euclidean"
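To make the shape of these flags concrete, here is a hedged sketch of how a subset of them might be parsed with `argparse`. The flag names come from the command above, but the types, defaults, and the helper names `int_list`, `str_list`, and `build_parser` are assumptions for illustration; this is not the actual parser in run.py.

```python
import argparse
from typing import List


def int_list(text: str) -> List[int]:
    """Parse a comma-separated string such as "100,100,100" into a list of ints."""
    return [int(part) for part in text.split(",") if part.strip()]


def str_list(text: str) -> List[str]:
    """Parse a comma-separated string such as "TransE,Align" into a list of names."""
    return [part.strip() for part in text.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical subset of the flags; types and defaults are assumptions.
    parser = argparse.ArgumentParser(description="Sketch of EAkit-style flags")
    parser.add_argument("--log", type=str, default="run")
    parser.add_argument("--data_dir", type=str, required=True)
    parser.add_argument("--rate", type=float, default=0.3)
    parser.add_argument("--epoch", type=int, default=1000)
    parser.add_argument("--train_batch_size", type=int, default=-1)
    parser.add_argument("--encoder", type=str, default="GCN-Align")
    parser.add_argument("--hiddens", type=int_list, default=[100, 100, 100])
    parser.add_argument("--decoder", type=str_list, default=["Align"])
    parser.add_argument("--lr", type=float, default=0.005)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```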
In detail, the following methods are currently implemented:
Semantic Matching Models
| Method | | Encoder | Decoder |
| ---- | ---- | ---- | ---- |
| MTransE from Chen et al. (IJCAI 2017) | [sh], [origin] | None | TransE, MTransE_Align |
| BootEA from Sun et al. (IJCAI 2018) | [sh], [origin] | None | AlignEA |
| TransEdge from Sun et al. (ISWC 2019) | [sh], [origin] | None | TransEdge |
| MMEA from Shi et al. (EMNLP 2019) | [sh], [origin] | None | MMEA |
GNN-based Models
| Method | | Encoder | Decoder |
| ---- | ---- | ---- | ---- |
| GCN-Align from Wang et al. (EMNLP 2018) | [sh], [origin] | GCN-Align | Align |
| NAEA from Zhu et al. (IJCAI 2019) | [sh], [origin] | NAEA | [N_TransE], N_TransE, N_R_Align |
| KECG from Li et al. (EMNLP 2019) | [sh], [origin] | KECG | TransE, Align |
KE-based Models
| Method | | Encoder | Decoder |
| ---- | ---- | ---- | ---- |
| TransE from Bordes et al. (NIPS 2013) | [sh] | None | TransE |
| TransH from Wang et al. (AAAI 2014) | [sh] | None | TransH |
| TransR from Lin et al. (AAAI 2015) | [sh] | None | TransR |
| RotatE from Sun et al. (ICLR 2019) | [sh] | None | RotatE |
| HAKE from Zhang et al. (AAAI 2020) | [sh] | None | HAKE |
| DistMult from Yang et al. (ICLR 2015) | [sh] | None | DistMult |
| ComplEx from Trouillon et al. (ICML 2016) | [sh] | None | ComplEx |
| ConvE from Dettmers et al. (AAAI 2018) | [sh] | None | ConvE |
Results
Results on DBP15K (zh_en, ja_en, fr_en).
| | Hits@1 (zh_en) | Hits@10 (zh_en) | MRR (zh_en) | Hits@1 (ja_en) | Hits@10 (ja_en) | MRR (ja_en) | Hits@1 (fr_en) | Hits@10 (fr_en) | MRR (fr_en) |
|-|-|-|-|-|-|-|-|-|-|
| MTransE | 0.419 | 0.753 | 0.535 | 0.433 | 0.773 | 0.549 | 0.407 | 0.751 | 0.526 |
| BootEA | 0.490 | 0.793 | 0.593 | 0.499 | 0.813 | 0.605 | 0.515 | 0.838 | 0.623 |
| TransEdge | 0.519 | 0.813 | 0.621 | 0.526 | 0.825 | 0.632 | 0.397 | 0.824 | 0.543 |
| MMEA | 0.405 | 0.672 | 0.499 | 0.397 | 0.680 | 0.496 | 0.442 | 0.749 | 0.550 |
| GCN-Align | 0.410 | 0.756 | 0.527 | 0.442 | 0.810 | 0.566 | 0.430 | 0.813 | 0.557 |
| NAEA | 0.323 | 0.481 | 0.381 | 0.311 | 0.457 | 0.363 | 0.307 | 0.460 | 0.362 |
| KECG | 0.467 | 0.815 | 0.586 | 0.485 | 0.843 | 0.605 | 0.479 | 0.844 | 0.602 |
| TransE | 0.343 | 0.634 | 0.441 | 0.365 | 0.710 | 0.480 | 0.374 | 0.735 | 0.493 |
| TransH | 0.436 | 0.735 | 0.540 | 0.450 | 0.778 | 0.561 | 0.485 | 0.821 | 0.599 |
| TransR | 0.371 | 0.697 | 0.481 | 0.368 | 0.709 | 0.484 | 0.378 | 0.741 | 0.497 |
| RotatE | 0.423 | 0.754 | 0.534 | 0.448 | 0.785 | 0.561 | 0.439 | 0.800 | 0.560 |
| HAKE | 0.288 | 0.588 | 0.391 | 0.319 | 0.607 | 0.421 | 0.319 | 0.638 | 0.428 |
| DistMult | 0.180 | 0.400 | 0.255 | 0.058 | 0.179 | 0.099 | 0.095 | 0.285 | 0.157 |
| ComplEx | 0.115 | 0.265 | 0.166 | 0.063 | 0.251 | 0.146 | 0.141 | 0.332 | 0.206 |
| ConvE | 0.210 | 0.466 | 0.299 | 0.339 | 0.556 | 0.415 | 0.350 | 0.602 | 0.439 |
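For readers unfamiliar with the reported metrics, the sketch below computes Hits@k and MRR (mean reciprocal rank) from a list of 1-based ranks, where each rank is the position of the true counterpart entity among the ranked candidates. The function name `ranking_metrics` is hypothetical, and this sketch is not the evaluation code used to produce the table.

```python
from typing import Dict, Sequence


def ranking_metrics(ranks: Sequence[int], ks: Sequence[int] = (1, 10)) -> Dict[str, float]:
    """Compute Hits@k and MRR from 1-based ranks of the correct entity."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    n = len(ranks)
    # Hits@k: fraction of test pairs whose true counterpart is ranked in the top k.
    metrics = {f"Hits@{k}": sum(r <= k for r in ranks) / n for k in ks}
    # MRR: average of the reciprocal ranks.
    metrics["MRR"] = sum(1.0 / r for r in ranks) / n
    return metrics
```

With ranks `[1, 2, 10, 20]`, one of four pairs is ranked first (Hits@1 = 0.25), three of four fall within the top ten (Hits@10 = 0.75), and MRR is (1 + 1/2 + 1/10 + 1/20) / 4 = 0.4125.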
Write a new model
- Divide the algorithm at the abstract level to obtain a structure of 1 (or 0) Encoder and 1 (or more) Decoder(s).
- Register the modules and add extra parameters in the top-level encoder (class Encoder) and top-level decoder (class Decoder) in models.py.
- Implement the concrete encoding module (class Encoder_Instance) and decoding module(s) (class Decoder_Instance) according to the given template.
- Write an execution script (XXX.sh) with parameter settings to run the new model.
- (Adapt a new dataset in load_data.py, and add a new sampling strategy in utils.py.)
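The registration step above can be pictured as mapping the name given on the command line (for example via `--encoder`) to a concrete class. The sketch below uses a dictionary-based registry with a decorator, which is a swapped-in technique for illustration and not necessarily how models.py dispatches; `ENCODER_REGISTRY`, `register_encoder`, `build_encoder`, and `MyEncoder` are all hypothetical names.

```python
from typing import Callable, Dict, Sequence

# Hypothetical registry: name passed on the command line -> encoder class.
ENCODER_REGISTRY: Dict[str, type] = {}


def register_encoder(name: str) -> Callable[[type], type]:
    """Class decorator that records an encoder implementation under a name."""

    def decorator(cls: type) -> type:
        if name in ENCODER_REGISTRY:
            raise ValueError(f"encoder {name!r} is already registered")
        ENCODER_REGISTRY[name] = cls
        return cls

    return decorator


@register_encoder("MyEncoder")
class MyEncoder:
    """Placeholder for a concrete encoding module with its extra parameters."""

    def __init__(self, hiddens: Sequence[int]) -> None:
        self.hiddens = list(hiddens)


def build_encoder(name: str, **kwargs):
    """Instantiate the encoder registered under `name`."""
    try:
        cls = ENCODER_REGISTRY[name]
    except KeyError:
        known = sorted(ENCODER_REGISTRY)
        raise KeyError(f"unknown encoder {name!r}; registered: {known}") from None
    return cls(**kwargs)
```

A decoder registry would follow the same pattern; the execution script then only needs to pass the registered names and their parameters.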

Dataset
Currently, EAkit only supports DBP15K, but it is easy to adapt to other datasets.
- DBP15K is from the "mapping" folder of JAPE. Note that "ref_ent_ids" and "sup_ent_ids" need to be combined into a single file named "ill_ent_ids".
You can unpack the data file directly after downloading:
unzip data.zip
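If you start from the original JAPE files instead, the merge of "ref_ent_ids" and "sup_ent_ids" into "ill_ent_ids" could be scripted roughly as follows. This is a sketch under assumptions: the function name `merge_alignment_files` is hypothetical, the files are treated as plain text and concatenated line by line, and the order (ref first, then sup) is an arbitrary choice.

```python
from pathlib import Path


def merge_alignment_files(data_dir: str) -> int:
    """Concatenate ref_ent_ids and sup_ent_ids into ill_ent_ids.

    Returns the number of non-empty lines written.
    """
    folder = Path(data_dir)
    lines = []
    # Assumption: order does not matter, so ref_ent_ids is simply read first.
    for name in ("ref_ent_ids", "sup_ent_ids"):
        text = (folder / name).read_text(encoding="utf-8")
        lines.extend(line for line in text.splitlines() if line.strip())
    (folder / "ill_ent_ids").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

For example, `merge_alignment_files("data/DBP15K/zh_en")` would write `data/DBP15K/zh_en/ill_ent_ids`, assuming both input files exist in that folder.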
Requirements
- Python3 (tested on 3.7.7)
- PyTorch (tested on 1.4.0)
- PyTorch Geometric (PyG) (tested on 1.4.3)
- TensorBoard (tested on 2.0.2)
- Numpy
- Scipy
- Scikit-learn
- Graph-tool (if using bootstrapping)
TODO
- [ ] Results of
