SpatialLLM

[JAG'26] SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence


SpatialLLM: Enhancing Large Language Models for Urban Spatial Intelligence


📋 Overview


<p align="justify"> SpatialLLM is a framework for enhancing Large Language Models with urban spatial understanding capabilities. It converts point clouds, OpenStreetMap (OSM) data, and multi-view images into structured scene text, enabling LLMs to perform complex spatial reasoning tasks in urban environments. </p>

🚀 Installation

# Create and activate conda environment
conda create -n spatialllm python=3.8
conda activate spatialllm

# Install geospatial dependencies via conda
conda install -c conda-forge geopandas
conda install -c conda-forge geopy
conda install -c conda-forge gdal

# Install remaining dependencies via pip
pip install -r requirements.txt
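
After installation, it can help to confirm that the geospatial packages are importable before running the pipeline. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the only assumption about the packages is their import names (the GDAL Python bindings are imported as `osgeo`).

```python
# Sanity check for the conda environment created above (illustrative helper).
import importlib.util

# Import names of the conda-forge packages installed above.
# The GDAL Python bindings are imported as `osgeo`, not `gdal`.
REQUIRED_MODULES = ("geopandas", "geopy", "osgeo")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All geospatial dependencies are importable.")
```

Run it inside the activated `spatialllm` environment; any name it reports can be reinstalled with the corresponding `conda install` command.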

🔧 Data Processing Pipeline

Step 1: Extract OSM Information

python process/osm_process.py \
    --shp_path <directory containing OSM shapefiles (buildings, roads, etc.)> \
    --output <output JSON file path>

Step 2: Automatic Point Cloud Annotation

python process/auto_annotate.py \
    --work_dir <path containing all input/output files> \
    --shp_dir <directory containing OSM shapefiles> \
    --pc_file <point cloud filename (.ply or .txt) in work_dir> \
    --control_txt <control text file for annotation parameters> \
    --res <raster resolution>

# Control Points File
# Format: `x y z lon lat` (one point per line)
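
To check a control points file before running the annotation step, a small validator can be useful. The sketch below assumes only the `x y z lon lat` layout stated above, with whitespace-separated values and one point per line. The names `ControlPoint` and `parse_control_points` are hypothetical, and skipping blank lines is an assumption about tolerated input, not documented behavior of `auto_annotate.py`.

```python
from typing import List, NamedTuple


class ControlPoint(NamedTuple):
    """One control point: local coordinates plus geographic coordinates."""
    x: float
    y: float
    z: float
    lon: float
    lat: float


def parse_control_points(text: str) -> List[ControlPoint]:
    """Parse `x y z lon lat` lines; raise ValueError on malformed lines."""
    points = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 values (x y z lon lat), "
                f"got {len(fields)}"
            )
        points.append(ControlPoint(*map(float, fields)))
    return points


# Illustrative values only, not taken from any dataset.
sample = "10.0 20.0 5.0 114.36 30.54\n12.5 18.0 5.2 114.37 30.55\n"
for point in parse_control_points(sample):
    print(point)
```

A line with a missing or extra column raises a `ValueError` that names the offending line, which is usually easier to act on than a failure deep inside the annotation script.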

Step 3: Generate Scene Graph

python process/generate_scene_graph.py \
    --pc_file <annotated point cloud from Step 2> \
    --shp_dir <OSM shapefiles directory> \
    --osm_map_file <instance mapping file from Step 2> \
    --osm_json_file <INPUT: OSM features to be enriched> \
    --output_json <OUTPUT: final structured scene graph>
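
The three steps above can be chained from a single driver script. This is a sketch under stated assumptions: the script paths and flag names are the ones listed above, while the helper functions, the file names, and the resolution value in the example are hypothetical placeholders. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
import sys
from typing import List


def osm_process_cmd(shp_path: str, output: str) -> List[str]:
    """Step 1: extract OSM information."""
    return [sys.executable, "process/osm_process.py",
            "--shp_path", shp_path, "--output", output]


def auto_annotate_cmd(work_dir: str, shp_dir: str, pc_file: str,
                      control_txt: str, res: str) -> List[str]:
    """Step 2: automatic point cloud annotation."""
    return [sys.executable, "process/auto_annotate.py",
            "--work_dir", work_dir, "--shp_dir", shp_dir,
            "--pc_file", pc_file, "--control_txt", control_txt,
            "--res", res]


def scene_graph_cmd(pc_file: str, shp_dir: str, osm_map_file: str,
                    osm_json_file: str, output_json: str) -> List[str]:
    """Step 3: generate the scene graph."""
    return [sys.executable, "process/generate_scene_graph.py",
            "--pc_file", pc_file, "--shp_dir", shp_dir,
            "--osm_map_file", osm_map_file,
            "--osm_json_file", osm_json_file,
            "--output_json", output_json]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# All paths and the resolution below are hypothetical placeholders.
commands = [
    osm_process_cmd("data/shp", "data/osm.json"),
    auto_annotate_cmd("data", "data/shp", "scene.ply", "control.txt", "0.5"),
    scene_graph_cmd("data/scene_annotated.ply", "data/shp",
                    "data/osm_map.txt", "data/osm.json",
                    "data/scene_graph.json"),
]
run_pipeline(commands, dry_run=True)
```

Building each command as an argument list avoids shell quoting problems with paths that contain spaces, and `check=True` stops the pipeline as soon as one step fails.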

Inference

python infer/inference.py

💡 Tip: You can also directly use the structured scene text as context in web-based LLM interfaces (ChatGPT, Claude, etc.) for interactive spatial reasoning conversations.
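
For the web-interface route, the scene graph JSON only needs to be pasted into the conversation together with a question. The sketch below shows one way to assemble such a prompt. The function names, the prompt wording, and the example scene fields are assumptions for illustration and do not reflect the schema produced by `generate_scene_graph.py` or the prompt used by `infer/inference.py`.

```python
import json


def scene_to_text(scene: dict) -> str:
    """Serialize a scene graph to readable JSON text, keeping non-ASCII names."""
    return json.dumps(scene, ensure_ascii=False, indent=2)


def build_prompt(scene_text: str, question: str) -> str:
    """Combine the structured scene text and a question into one prompt."""
    return (
        "You are given a structured description of an urban scene.\n\n"
        "Scene description:\n"
        f"{scene_text.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer using only the information in the scene description."
    )


# Hypothetical scene content, for illustration only.
example_scene = {"buildings": [{"id": 1, "height_m": 24.0}]}
print(build_prompt(scene_to_text(example_scene),
                   "Which building is the tallest?"))
```

In practice the scene dictionary would come from `json.load` on the file written to `--output_json` in Step 3.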

📚 Examples

We provide multiple examples demonstrating various spatial understanding capabilities. Please refer to the examples/ directory for detailed results.

🤝 Acknowledgement

SpatialLLM is built upon the extremely wonderful UrbanBIS.

Contact us

If you find this repo helpful, please give us a star. For any questions, please contact us via chenjb67@whu.edu.cn.
