ULTRA: Towards Foundation Models for Knowledge Graph Reasoning
PyG implementation of ULTRA and UltraQuery, a foundation model for KG reasoning. Authored by Michael Galkin, Zhaocheng Zhu, and Xinyu Yuan. Logo generated by DALL·E 3.
Overview
ULTRA is a foundation model for knowledge graph (KG) reasoning. A single pre-trained ULTRA model performs link prediction on any multi-relational graph with any entity / relation vocabulary. Averaged across 50+ KGs, a single pre-trained ULTRA model in the zero-shot inference mode outperforms many SOTA models trained specifically on each graph. Following the pretrain-finetune paradigm of foundation models, you can run a pre-trained ULTRA checkpoint immediately on any graph in the zero-shot manner, or fine-tune it further on your own data.
ULTRA provides <u>u</u>nified, <u>l</u>earnable, <u>tra</u>nsferable representations for any KG. Under the hood, ULTRA employs graph neural networks and modified versions of NBFNet. ULTRA does not learn any entity and relation embeddings specific to a downstream graph but instead obtains relative relation representations based on interactions between relations.
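To make the "interactions between relations" idea concrete, here is a minimal plain-Python sketch of building a relation graph from triples. This is an illustration, not the repo's implementation (which builds these graphs as PyG tensors); the four interaction-type names and the edge-direction convention are assumptions based on the paper's description.

```python
from collections import defaultdict

def build_relation_graph(triples):
    """Build edges of a relation graph from (head, relation, tail) triples.

    Nodes are relations; two relations are connected when they share an
    entity. Returns a set of (rel_i, rel_j, interaction) edges, where
    interaction is one of "h2h", "h2t", "t2h", "t2t" (e.g. "h2t" means the
    shared entity is the head of rel_i and the tail of rel_j).
    """
    heads = defaultdict(set)  # entity -> relations having it as head
    tails = defaultdict(set)  # entity -> relations having it as tail
    for h, r, t in triples:
        heads[h].add(r)
        tails[t].add(r)

    edges = set()
    for e in set(heads) | set(tails):
        for r1 in heads[e]:
            for r2 in heads[e]:
                if r1 != r2:
                    edges.add((r1, r2, "h2h"))
            for r2 in tails[e]:
                if r1 != r2:
                    edges.add((r1, r2, "h2t"))
        for r1 in tails[e]:
            for r2 in tails[e]:
                if r1 != r2:
                    edges.add((r1, r2, "t2t"))
            for r2 in heads[e]:
                if r1 != r2:
                    edges.add((r1, r2, "t2h"))
    return edges
```

A GNN over this graph produces the relative relation representations that are then consumed by the entity-level reasoner, so nothing is tied to a particular relation vocabulary.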
The original implementation with the TorchDrug framework is available here for reproduction purposes.
This repository is based on PyTorch 2.1 and PyTorch-Geometric 2.4.
Your superpowers ⚡️:
- Use the pre-trained checkpoints to run zero-shot inference and fine-tuning on 57 transductive and inductive datasets.
- Run training and inference with multiple GPUs.
- Pre-train ULTRA on your own mixture of graphs.
- Run evaluation on many datasets sequentially.
- Use the pre-trained checkpoints to run inference and fine-tuning on your own KGs.
- (NEW) Execute complex logical queries on any KG with UltraQuery.
Updates
- Oct 1st, 2024: UltraQuery got accepted at NeurIPS 2024!
- Apr 23rd, 2024: Release of UltraQuery for complex multi-hop logical query answering on any KG (with new checkpoint and 23 datasets).
- Jan 15th, 2024: Accepted at ICLR 2024!
- Dec 4th, 2023: Added a new ULTRA checkpoint `ultra_50g` pre-trained on 50 graphs. Averaged over 16 larger transductive graphs, it delivers 0.389 MRR / 0.549 Hits@10 compared to 0.329 MRR / 0.479 Hits@10 of the `ultra_3g` checkpoint. The inductive performance is just as good! Use this checkpoint for inference on larger graphs.
- Dec 4th, 2023: Pre-trained ULTRA models (3g, 4g, 50g) are now also available on the HuggingFace Hub!
Installation
You may install the dependencies via either conda or pip. The PyG version of ULTRA is implemented with Python 3.9, PyTorch 2.1, and PyG 2.4 (CUDA 11.8 or later when running on GPUs). If you are on a Mac, you may omit the CUDA toolkit requirements.
From Conda
conda install pytorch=2.1.0 pytorch-cuda=11.8 cudatoolkit=11.8 pytorch-scatter=2.1.2 pyg=2.4.0 -c pytorch -c nvidia -c pyg -c conda-forge
conda install ninja easydict pyyaml -c conda-forge
From Pip
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install torch-scatter==2.1.2 torch-sparse==0.6.18 torch-geometric==2.4.0 -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
pip install ninja easydict pyyaml
<details>
<summary> Compilation of the `rspmm` kernel </summary>
To make the relational message passing iteration O(V) in memory instead of O(E), we ship a custom `rspmm` kernel that is compiled automatically upon the first launch. The `rspmm` kernel supports `transe` and `distmult` message functions; others, like `rotate`, resort to full edge materialization and O(E) complexity.
The kernel can be compiled on both CPUs (including M1/M2 on Macs) and GPUs (it is done only once and then cached). For GPUs, you need a CUDA 11.8+ toolkit with the nvcc compiler. If you are deploying this in a Docker container, make sure to start from the devel images that contain nvcc in addition to plain CUDA runtime.
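For intuition, the fused computation the kernel performs can be sketched in plain Python (a slow reference only, assuming sum aggregation; the actual kernel is a compiled C++/CUDA op). The point of the fusion is that the per-edge messages below are never stored, only accumulated into the per-node output.

```python
def rspmm_reference(edges, x, rel, dim, num_nodes, message="distmult"):
    """Reference relational sparse matmul.

    edges: list of (u, r, v) triples; x: per-node state vectors;
    rel: per-relation vectors; vectors are lists of floats of length dim.
    For every edge (u, r, v), combine x[u] with rel[r] and sum at node v.
    """
    out = [[0.0] * dim for _ in range(num_nodes)]
    for u, r, v in edges:
        for d in range(dim):
            if message == "transe":
                m = x[u][d] + rel[r][d]      # additive message
            elif message == "distmult":
                m = x[u][d] * rel[r][d]      # multiplicative message
            else:
                raise ValueError("other messages need full edge materialization")
            out[v][d] += m                   # fused aggregation, no edge buffer
    return out
```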
Make sure your `CUDA_HOME` variable is set properly to avoid potential compilation errors, e.g.
export CUDA_HOME=/usr/local/cuda-11.8/
</details>
Checkpoints
We provide two pre-trained ULTRA checkpoints in the /ckpts folder. Both have the same model size (6-layer GNNs over the relation and entity graphs, 64d, 168k total parameters) and were trained on 4 x A100 GPUs with this codebase:
- `ultra_3g.pth`: trained on `FB15k237, WN18RR, CoDExMedium` for 800,000 steps, config is in `/config/transductive/pretrain_3g.yaml`
- `ultra_4g.pth`: trained on `FB15k237, WN18RR, CoDExMedium, NELL995` for 400,000 steps, config is in `/config/transductive/pretrain_4g.yaml`
You can use those checkpoints for zero-shot inference on any graph (including your own) or use them as a backbone for fine-tuning. Both checkpoints are rather small (2 MB each).
Zero-shot performance of the checkpoints compared to the paper version (PyG experiments were run on a single RTX 3090, PyTorch 2.1, PyG 2.4, CUDA 11.8 using the run_many.py script in this repo):
Run Inference and Fine-tuning
The /scripts folder contains 3 executable files:
- `run.py` - run an experiment on a single dataset
- `run_many.py` - run experiments on several datasets sequentially and dump results into a CSV file
- `pretrain.py` - a script for pre-training ULTRA on several graphs
The yaml configs in the config folder are provided for both transductive and inductive datasets.
Run a single experiment
The run.py command requires the following arguments:
- `-c <yaml config>`: a path to the yaml config
- `--dataset`: dataset name (from the list of datasets)
- `--version`: a version of the inductive dataset (see all in datasets), not needed for transductive graphs. For example, `--dataset FB15k237Inductive --version v1` will load one of the GraIL inductive datasets.
- `--epochs`: number of epochs to train; `--epochs 0` means running zero-shot inference.
- `--bpe`: batches per epoch (replaces the length of the dataloader as the default value). `--bpe 100 --epochs 10` means that each epoch consists of 100 batches, and overall training is 1000 batches. Set `--bpe null` to use the full-length dataloader or comment out the `bpe` line in the yaml configs.
- `--gpus`: number of gpu devices; set to `--gpus null` when running on CPUs, `--gpus [0]` for a single GPU, or otherwise set the number of GPUs for a distributed setup.
- `--ckpt`: full path to one of the ULTRA checkpoints to use (you can use those provided in the repo or trained on your own). Use `--ckpt null` to start training from scratch (or run zero-shot inference on a randomly initialized model; it might still surprise you and demonstrate non-zero performance).
Zero-shot inference setup is `--epochs 0` with a given checkpoint `ckpt`.
Fine-tuning of a checkpoint is when `epochs > 0` with a given checkpoint.
An example command for an inductive dataset to run on a CPU:
python script/run.py -c config/inductive/inference.yaml --dataset FB15k237Inductive --version v1 --epochs 0 --bpe null --gpus null --ckpt /path/to/ultr
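As a further sketch, zero-shot inference and fine-tuning on a transductive dataset might look like the following (the config path, dataset name, and checkpoint path here are illustrative; substitute your own):

```shell
# Zero-shot inference: no training, just evaluate the checkpoint
python script/run.py -c config/transductive/inference.yaml --dataset CoDExSmall \
  --epochs 0 --bpe null --gpus [0] --ckpt /path/to/ckpts/ultra_4g.pth

# Fine-tuning: a few short epochs on top of the same checkpoint
python script/run.py -c config/transductive/inference.yaml --dataset CoDExSmall \
  --epochs 3 --bpe 1000 --gpus [0] --ckpt /path/to/ckpts/ultra_4g.pth
```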