# LLMNodeBed

[ICML 2025] Official implementation for the paper "A Comprehensive Analysis on LLM-based Node Classification Algorithms"
<p align="center"> <a href="https://llmnodebed.github.io/"><img src="https://img.shields.io/badge/🌐-Website-red" height="25"></a> <a href="https://arxiv.org/abs/2502.00829"><img src="https://img.shields.io/badge/📝-Paper-blue" height="25"></a> <a href="https://huggingface.co/datasets/xxwu/LLMNodeBed/tree/main"><img src="https://img.shields.io/badge/🤗-Dataset-green" height="25"></a> </p>

This repository is the official implementation of our ICML 2025 paper *When Do LLMs Help With Node Classification? A Comprehensive Analysis*. It provides a standardized framework for evaluating LLM-based node classification methods, covering 14 datasets, 8 LLM-based algorithms, and 3 learning paradigms.
Please consider citing or giving a 🌟 if our repository is helpful to your work!
```bibtex
@inproceedings{wu2025llmnodebed,
  title={When Do LLMs Help With Node Classification? A Comprehensive Analysis},
  author={Xixi Wu and Yifei Shen and Fangzhou Ge and Caihua Shan and Yizhu Jiao and Xiangguo Sun and Hong Cheng},
  year={2025},
  booktitle={International Conference on Machine Learning},
  organization={PMLR},
  url={https://arxiv.org/abs/2502.00829},
}
```
## 🎙️ News
- 🎉 [2025-05-01] Our paper has been accepted to ICML 2025. The camera-ready paper, integration of more baseline methods, and accompanying blog posts will be released soon!
- 📅 [2025-02-04] The code for LLMNodeBed, along with the project page and paper, is now released! 🧨
## 🚀 Quick Start

### 0. Environment Setup

Follow these steps to set up the Python environment:
```shell
conda create -n NodeBed python=3.10
conda activate NodeBed
pip install torch torch_geometric transformers peft pytz scikit-learn torch_scatter torch_sparse
```
Some packages may still be missing for specific algorithms. Check the algorithm's README or the error logs to identify missing dependencies and install them accordingly.
### 1. LLM Preparation
- **Closed-source LLMs** (e.g., GPT-4o, DeepSeek-Chat): add your API keys to `LLMZeroShot/Direct/api_keys.py`.
- **Open-source LLMs** (e.g., Mistral-7B, Qwen): download the models from HuggingFace (e.g., Mistral-7B). Then, update the model paths in `common/model_path.py` to your actual save locations. Example paths:

```python
MODEL_PATHs = {
    "MiniLM": "sentence-transformers/all-MiniLM-L6-v2",
    "Mistral-7B": "mistralai/Mistral-7B-Instruct-v0.2",
    "Llama-8B": "meta-llama/Llama-3.1-8B-Instruct",
    # See full list in common/model_path.py
}
```
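For the closed-source route, the key file might look like the following. This is a hypothetical sketch only: the actual variable names in `LLMZeroShot/Direct/api_keys.py` may differ, so mirror whatever that file already defines.

```python
# Hypothetical sketch of LLMZeroShot/Direct/api_keys.py --
# the real file's variable names may differ; follow its existing layout.
OPENAI_API_KEY = "sk-..."    # used for GPT-4o calls
DEEPSEEK_API_KEY = "sk-..."  # used for DeepSeek-Chat calls
```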
### 2. Datasets

Download the datasets from either Google Drive or HuggingFace and unzip them into the `datasets` folder.
Before running the LLM-based algorithms, generate the LM- / LLM-encoded embeddings as follows:

```shell
cd LLMEncoder/GNN
python3 embedding.py --dataset=cora --encoder_name=roberta     # LM embeddings
python3 embedding.py --dataset=cora --encoder_name=Mistral-7B  # LLM embeddings
```
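The internals of `embedding.py` are not shown here, but a common recipe for turning a node's text into a single LM embedding is mean pooling over the encoder's token hidden states, masking out padding. The sketch below illustrates that recipe under this assumption; the repository's script may use a different pooling strategy.

```python
import numpy as np

def mean_pool(token_states, attention_mask):
    """Average token hidden states into one node embedding, ignoring padding.
    token_states: (num_tokens, hidden_dim) array of LM hidden states.
    attention_mask: (num_tokens,) array with 1 for real tokens, 0 for padding."""
    mask = attention_mask[:, None].astype(float)   # (num_tokens, 1)
    return (token_states * mask).sum(axis=0) / mask.sum()
```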
### 3. (Optional) Deploy Local LLMs

For LLM direct inference with open-source LLMs, we deploy them as local services using the FastChat framework:

```shell
# Install dependencies
pip install vllm "fschat[model_worker,webui]"

# Start services
python3 -m fastchat.serve.controller --host 127.0.0.1
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.vllm_worker --model-path mistralai/Mistral-7B-Instruct-v0.2 --host 127.0.0.1
python3 -m fastchat.serve.openai_api_server --host 127.0.0.1 --port 8008
```

The Mistral-7B model can then be invoked via the URL `http://127.0.0.1:8008/v1/chat/completions`.
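Since the endpoint speaks the OpenAI chat-completions protocol, it can be queried with a plain HTTP POST. The sketch below shows one way to call it for node classification; the prompt wording and label set are illustrative, not the repository's actual prompts.

```python
import json
import urllib.request

# FastChat's OpenAI-compatible endpoint started in the previous step.
API_URL = "http://127.0.0.1:8008/v1/chat/completions"

def build_payload(text, labels, model="Mistral-7B-Instruct-v0.2"):
    """Build an OpenAI-style chat-completion request asking the LLM to
    classify a node's text into one of the candidate labels.
    (Prompt wording is illustrative, not the repository's actual prompt.)"""
    prompt = (
        f"Classify the following paper into one of these categories: {labels}.\n"
        f"Paper: {text}\n"
        "Answer with the category name only."
    )
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def query(payload):
    """POST the payload to the local server and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```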
### 4. Run Algorithms

Refer to the method-specific READMEs for execution details:

- LLM-as-Encoder: `LLMEncoder/README.md`
- LLM-as-Predictor: `LLMPredictor/README.md`
- LLM-as-Reasoner: `LLMReasoner/README.md`
- Zero-shot Methods: `LLMZeroShot/README.md`
## 📖 Code Structure

```
LLMNodeBed/
├── LLMEncoder/     # LLM-as-Encoder (GNN, ENGINE)
├── LLMPredictor/   # LLM-as-Predictor (GraphGPT, LLaGA, Instruction Tuning)
├── LLMReasoner/    # LLM-as-Reasoner (TAPE)
├── LLMZeroShot/    # Zero-shot Methods (Direct Inference, ZeroG)
├── common/         # Shared utilities
├── datasets/       # Dataset storage
├── results/        # Experiment outputs
└── requirements.txt
```
## 🔧 Supported Methods

| Method | Venue | Official Implementation | Our Implementation |
| --- | --- | --- | --- |
| TAPE | ICLR'24 | link | `LLMReasoner/TAPE` |
| ENGINE | IJCAI'24 | link | `LLMEncoder/ENGINE` |
| GraphGPT | SIGIR'24 | link | `LLMPredictor/GraphGPT` |
| LLaGA | ICML'24 | link | `LLMPredictor/LLaGA` |
| ZeroG | KDD'24 | link | `LLMZeroShot/ZeroG` |
| $\text{GNN}_{\text{LLMEmb}}$ | - | Proposed in this work | `LLMEncoder/GNN` |
| LLM Instruction Tuning | - | Implemented by us | `LLMPredictor/Instruction Tuning` |
| Direct Inference | - | Implemented by us | `LLMZeroShot/Direct` |
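Among these, $\text{GNN}_{\text{LLMEmb}}$ trains a standard GNN on the precomputed LLM embeddings from Step 2. The repository presumably builds on the `torch_geometric` stack installed above; as a rough dependency-free sketch (not the actual implementation), one symmetrically normalized GCN propagation step over those embeddings looks like:

```python
import numpy as np

def gcn_propagate(adj, node_emb):
    """One GCN-style propagation step over precomputed LLM embeddings:
    A_hat = D^{-1/2} (A + I) D^{-1/2}, output = A_hat @ X.
    adj: (n, n) binary adjacency matrix; node_emb: (n, d) LLM embeddings."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                     # add self-loops
    deg = a_hat.sum(axis=1)                     # degrees after self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # D^{-1/2}
    return d_inv_sqrt @ a_hat @ d_inv_sqrt @ node_emb
```

A full layer would follow this with a learned weight matrix and nonlinearity; stacking such layers and a linear classifier head gives the usual GCN node classifier.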
## 📮 Contact

If you have any questions about usage or reproducibility, or would like to discuss further, please feel free to open an issue or contact the authors via email at xxwu@se.cuhk.edu.hk.
## 🙏 Acknowledgements

We thank the authors of TAPE, ENGINE, GraphGPT, LLaGA, and ZeroG for their open-source implementations. Part of our framework is inspired by GLBench.
