LLM4VKG
LLM4VKG: Leveraging Large Language Models for Virtual Knowledge Graph Construction
LLM4VKG is a framework that leverages Large Language Models (LLMs) for Virtual Knowledge Graph (VKG) construction. By integrating established mapping patterns, LLM4VKG effectively structures and maps ontologies, making them more comprehensive and practical. Additionally, we developed an automated evaluation framework to simplify the assessment process.
Installation
Install UV
First, install UV (a fast Python package installer and resolver). You can install it using one of the following methods:
Using pip:
pip install uv
Using curl (Linux/macOS):
curl -LsSf https://astral.sh/uv/install.sh | sh
Using Homebrew (macOS):
brew install uv
For more installation options, visit: https://github.com/astral-sh/uv
Install Dependencies
After installing UV, install the project dependencies:
uv sync
This will create a virtual environment and install all dependencies specified in pyproject.toml.
Requirements
Please refer to the pyproject.toml file for a list of dependencies.
Resources
The following external resources are required. Please download and place them in the ./resources directory:
Prepare for Run
- Instantiate the database from the SQL dump file in ./datasets/rodi/*/dump.sql, then set the corresponding DB config in src/db_utils/db_utils.py.
- Set the API config for LLMs in src/llm/resources/ampi.json.
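Before running anything, it can help to confirm that the files named above are where this README says they are. The following is a minimal sketch, not part of LLM4VKG: the helper names `find_dumps` and `missing_config_files` are hypothetical, and the paths are taken from the two steps above. It only checks that files exist. It does not load the dumps or validate the config contents.

```python
from pathlib import Path


def find_dumps(repo_root: str = ".") -> dict[str, Path]:
    """Map each scenario directory under datasets/rodi to its dump.sql file."""
    base = Path(repo_root) / "datasets" / "rodi"
    # One dump.sql is expected per scenario directory (./datasets/rodi/*/dump.sql).
    return {p.parent.name: p for p in sorted(base.glob("*/dump.sql"))}


def missing_config_files(repo_root: str = ".") -> list[str]:
    """Return the config paths named in this README that are not present."""
    expected = [
        "src/db_utils/db_utils.py",      # DB config
        "src/llm/resources/ampi.json",   # LLM API config
    ]
    return [rel for rel in expected if not (Path(repo_root) / rel).is_file()]
```

Calling `find_dumps()` and `missing_config_files()` from the repository root lists the dumps that still need to be loaded into a database and any config file that has not been created yet.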
How to Run
All scripts are located in the script/ directory and use UV to run the Python programs. Make sure you have completed the installation steps above before running.
- Mapping pattern recognition:
  ./script/MPR.sh
- Ontology completion and mapping generation:
  ./script/OC_MG.sh
- Evaluate:
  uv run python rodi_evaluate.py
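The three steps above can also be chained from a single Python entry point. This is a minimal sketch, not a script shipped with the project: the name `run_pipeline` is hypothetical, the commands are copied from the list above, and stopping at the first failing step is a design choice made here on the assumption that each step depends on the output of the previous one. It assumes it is run from the repository root.

```python
import subprocess
import sys

# Commands copied from the "How to Run" steps, in the order listed there.
PIPELINE = [
    ("Mapping pattern recognition", ["./script/MPR.sh"]),
    ("Ontology completion and mapping generation", ["./script/OC_MG.sh"]),
    ("Evaluate", ["uv", "run", "python", "rodi_evaluate.py"]),
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run) -> int:
    """Run each step in order; return the first non-zero exit code, else 0."""
    for name, cmd in steps:
        print(f"==> {name}: {' '.join(cmd)}")
        result = runner(cmd)
        if result.returncode != 0:
            # Later steps are skipped once one step fails.
            print(f"step failed: {name}", file=sys.stderr)
            return result.returncode
    return 0
```

A typical use would be `sys.exit(run_pipeline())` at the bottom of a small driver script. The `runner` parameter exists only so the function can be exercised without launching the real scripts.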
Alternative Scripts
- script/MPR_infk.sh / script/MPR_nofk.sh: Mapping pattern recognition with different configurations
- script/OC_MG_infk.sh / script/OC_MG_nofk.sh: Ontology completion and mapping generation with different configurations
- script/dataEnrichment.sh: Data enrichment script
Note: Make sure the scripts have execute permissions. If not, run:
chmod +x script/*.sh
Results
LLM4VKG writes its full outputs to the outputs/ directory: the generated ontology, the mappings, and an evaluation report with performance metrics and validation outcomes.
Acknowledgements
This work utilizes the RODI (Relational-to-Ontology Mapping Quality Benchmark) dataset. We thank the creators and maintainers for their contribution.
The RODI benchmark can be found at: https://github.com/chrpin/rodi
Citation
If you find this work useful, please consider citing our paper accepted at IJCAI 2025:
@inproceedings{Xiao2025LLM4VKG,
author = {Guohui Xiao and Lin Ren and Guilin Qi and Haohan Xue and Marco Di Panfilo and Davide Lanti},
title = {LLM4VKG: Leveraging Large Language Models for Virtual Knowledge Graph Construction},
booktitle = {Proceedings of the 34th International Joint Conference on Artificial Intelligence (IJCAI-25)},
year = {2025}
}
