# LLMNodeBed

[ICML 2025] Official implementation for the paper "A Comprehensive Analysis on LLM-based Node Classification Algorithms"
<p align="center"> <a href="https://llmnodebed.github.io/"><img src="https://img.shields.io/badge/🌐-Website-red" height="25"></a> <a href="https://arxiv.org/abs/2502.00829"><img src="https://img.shields.io/badge/📝-Paper-blue" height="25"></a> <a href="https://huggingface.co/datasets/xxwu/LLMNodeBed/tree/main"><img src="https://img.shields.io/badge/🤗-Dataset-green" height="25"></a> </p>

This repository is the official implementation of our ICML 2025 paper *When Do LLMs Help With Node Classification? A Comprehensive Analysis*. It provides a standardized framework for evaluating LLM-based node classification methods, covering 14 datasets, 8 LLM-based algorithms, and 3 learning paradigms.
Please consider citing or giving a 🌟 if our repository is helpful to your work!
```bibtex
@inproceedings{wu2025llmnodebed,
  title={When Do LLMs Help With Node Classification? A Comprehensive Analysis},
  author={Xixi Wu and Yifei Shen and Fangzhou Ge and Caihua Shan and Yizhu Jiao and Xiangguo Sun and Hong Cheng},
  year={2025},
  booktitle={International Conference on Machine Learning},
  organization={PMLR},
  url={https://arxiv.org/abs/2502.00829},
}
```
## 🎙️ News
- 🎉 [2025-05-01] Our paper has been accepted to ICML 2025. The camera-ready paper, integration of more baseline methods, and accompanying blog posts will be released soon!
- 📅 [2025-02-04] The code for LLMNodeBed, along with the project page and paper, is now released! 🧨
## 🚀 Quick Start

### 0. Environment Setup

Follow these steps to set up the Python environment:
```shell
conda create -n NodeBed python=3.10
conda activate NodeBed
pip install torch torch_geometric transformers peft pytz scikit-learn torch_scatter torch_sparse
```
Some packages may still be missing for specific algorithms. Check the algorithm's README or the error logs to identify missing dependencies and install them accordingly.
### 1. LLM Preparation
- **Closed-source LLMs** (e.g., GPT-4o, DeepSeek-Chat): add your API keys to `LLMZeroShot/Direct/api_keys.py`.
- **Open-source LLMs** (e.g., Mistral-7B, Qwen): download the models from HuggingFace (e.g., Mistral-7B). Then, update the model paths in `common/model_path.py` to your actual save locations. Example paths:

```python
MODEL_PATHs = {
    "MiniLM": "sentence-transformers/all-MiniLM-L6-v2",
    "Mistral-7B": "mistralai/Mistral-7B-Instruct-v0.2",
    "Llama-8B": "meta-llama/Llama-3.1-8B-Instruct",
    # See full list in common/model_path.py
}
```
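For the closed-source route, the key file might look like the following. This is a hypothetical sketch only: the actual variable names in `LLMZeroShot/Direct/api_keys.py` may differ, so mirror whatever that file already defines.

```python
# Hypothetical sketch of LLMZeroShot/Direct/api_keys.py --
# the real file's variable names may differ; follow its existing layout.
OPENAI_API_KEY = "sk-..."    # used for GPT-4o calls
DEEPSEEK_API_KEY = "sk-..."  # used for DeepSeek-Chat calls
```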
### 2. Datasets

Download the datasets from either Google Drive or HuggingFace and unzip them into the `datasets` folder.
Before running the LLM-based algorithms, generate the LM- / LLM-encoded embeddings as follows:

```shell
cd LLMEncoder/GNN
python3 embedding.py --dataset=cora --encoder_name=roberta     # LM embeddings
python3 embedding.py --dataset=cora --encoder_name=Mistral-7B  # LLM embeddings
```
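The internals of `embedding.py` are not shown here, but a common recipe for turning a node's text into a single LM embedding is mean pooling over the encoder's token hidden states, masking out padding. The sketch below illustrates that recipe under this assumption; the repository's script may use a different pooling strategy.

```python
import numpy as np

def mean_pool(token_states, attention_mask):
    """Average token hidden states into one node embedding, ignoring padding.
    token_states: (num_tokens, hidden_dim) array of LM hidden states.
    attention_mask: (num_tokens,) array with 1 for real tokens, 0 for padding."""
    mask = attention_mask[:, None].astype(float)   # (num_tokens, 1)
    return (token_states * mask).sum(axis=0) / mask.sum()
```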
### 3. (Optional) Deploy Local LLMs

For LLM direct inference with open-source LLMs, we deploy them as local services using the FastChat framework:

```shell
# Install dependencies
pip install vllm "fschat[model_worker,webui]"

# Start services
python3 -m fastchat.serve.controller --host 127.0.0.1
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.vllm_worker --model-path mistralai/Mistral-7B-Instruct-v0.2 --host 127.0.0.1
python3 -m fastchat.serve.openai_api_server --host 127.0.0.1 --port 8008
```

The Mistral-7B model can then be invoked via the URL `http://127.0.0.1:8008/v1/chat/completions`.
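Since the endpoint speaks the OpenAI chat-completions protocol, it can be queried with a plain HTTP POST. The sketch below shows one way to call it for node classification; the prompt wording and label set are illustrative, not the repository's actual prompts.

```python
import json
import urllib.request

# FastChat's OpenAI-compatible endpoint started in the previous step.
API_URL = "http://127.0.0.1:8008/v1/chat/completions"

def build_payload(text, labels, model="Mistral-7B-Instruct-v0.2"):
    """Build an OpenAI-style chat-completion request asking the LLM to
    classify a node's text into one of the candidate labels.
    (Prompt wording is illustrative, not the repository's actual prompt.)"""
    prompt = (
        f"Classify the following paper into one of these categories: {labels}.\n"
        f"Paper: {text}\n"
        "Answer with the category name only."
    )
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def query(payload):
    """POST the payload to the local server and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```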
### 4. Run Algorithms

Refer to the method-specific READMEs for execution details:

- LLM-as-Encoder: `LLMEncoder/README.md`
- LLM-as-Predictor: `LLMPredictor/README.md`
- LLM-as-Reasoner: `LLMReasoner/README.md`
- Zero-shot Methods: `LLMZeroShot/README.md`
## 📖 Code Structure

```
LLMNodeBed/
├── LLMEncoder/     # LLM-as-Encoder (GNN, ENGINE)
├── LLMPredictor/   # LLM-as-Predictor (GraphGPT, LLaGA, Instruction Tuning)
├── LLMReasoner/    # LLM-as-Reasoner (TAPE)
├── LLMZeroShot/    # Zero-shot Methods (Direct Inference, ZeroG)
├── common/         # Shared utilities
├── datasets/       # Dataset storage
├── results/        # Experiment outputs
└── requirements.txt
```
## 🔧 Supported Methods

| Method | Venue | Official Implementation | Our Implementation |
| --- | --- | --- | --- |
| TAPE | ICLR'24 | link | `LLMReasoner/TAPE` |
| ENGINE | IJCAI'24 | link | `LLMEncoder/ENGINE` |
| GraphGPT | SIGIR'24 | link | `LLMPredictor/GraphGPT` |
| LLaGA | ICML'24 | link | `LLMPredictor/LLaGA` |
| ZeroG | KDD'24 | link | `LLMZeroShot/ZeroG` |
| $\text{GNN}_{\text{LLMEmb}}$ | - | Proposed in this work | `LLMEncoder/GNN` |
| LLM Instruction Tuning | - | Implemented by us | `LLMPredictor/Instruction Tuning` |
| Direct Inference | - | Implemented by us | `LLMZeroShot/Direct` |
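Among these, $\text{GNN}_{\text{LLMEmb}}$ trains a standard GNN on the precomputed LLM embeddings from Step 2. The repository presumably builds on the `torch_geometric` stack installed above; as a rough dependency-free sketch (not the actual implementation), one symmetrically normalized GCN propagation step over those embeddings looks like:

```python
import numpy as np

def gcn_propagate(adj, node_emb):
    """One GCN-style propagation step over precomputed LLM embeddings:
    A_hat = D^{-1/2} (A + I) D^{-1/2}, output = A_hat @ X.
    adj: (n, n) binary adjacency matrix; node_emb: (n, d) LLM embeddings."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                     # add self-loops
    deg = a_hat.sum(axis=1)                     # degrees after self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # D^{-1/2}
    return d_inv_sqrt @ a_hat @ d_inv_sqrt @ node_emb
```

A full layer would follow this with a learned weight matrix and nonlinearity; stacking such layers and a linear classifier head gives the usual GCN node classifier.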
## 📮 Contact

If you have any questions about usage or reproducibility, or would like to discuss further, please feel free to open an issue or contact the authors via email at xxwu@se.cuhk.edu.hk.
## 🙏 Acknowledgements

We thank the authors of TAPE, ENGINE, GraphGPT, LLaGA, and ZeroG for their open-source implementations. Part of our framework is inspired by GLBench.
