# [WACV 2025] SpaGBOL: Spatial-Graph-Based Orientated Localisation
<p align="middle"> <a href="https://tavisshore.co.uk/">Tavis Shore</a> <a href="https://cvssp.org/Personal/OscarMendez/index.html">Oscar Mendez</a> <a href="https://personalpages.surrey.ac.uk/s.hadfield/biography.html">Simon Hadfield</a> </p> <p align="middle"> <a href="https://www.surrey.ac.uk/centre-vision-speech-signal-processing">Centre for Vision, Speech, and Signal Processing (CVSSP)</a> </p> <p align="middle"> <a>University of Surrey, Guildford, GU2 7XH, United Kingdom </a> </p>
**Download Dataset Now!**
## Description
Cross-View Geo-Localisation (CVGL) within urban regions is challenging, in part due to the lack of geo-spatial structuring within current datasets and techniques. We propose utilising graph representations to model sequences of local observations and the connectivity of the target location. Modelling the data as a graph enables the generation of previously unseen sequences by sampling walks with new parameter configurations. To leverage this newly available information, we propose a GNN-based architecture that produces spatially strong embeddings and improves discriminability over isolated image embeddings.
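The walk-sampling idea can be illustrated with a toy sketch (junction names and parameters here are invented, this is not the SpaGBOL code): re-sampling the same graph with a new start node, walk length, or seed yields previously unseen training sequences.

```python
import random

# Toy junction graph: nodes are road junctions, edges are roads.
# (Illustrative adjacency only; not the SpaGBOL data format.)
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

def sample_walk(graph, start, length, seed=None):
    """Sample one sequence of connected junctions (a walk) of `length` nodes."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(graph[walk[-1]]))  # step to a random neighbour
    return walk

# New parameter configurations (start, length, seed) give new sequences
# from the same underlying graph.
print(sample_walk(roads, "A", length=4, seed=0))
```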
We release SpaGBOL, the first graph-based CVGL dataset, consisting of 10 city centre graph networks across the globe. This densely sampled, structured dataset will progress the CVGL field towards real-world viability.
## SpaGBOL: Graph-Based CVGL Dataset
SpaGBOL contains 98,855 panoramic streetview images captured across different seasons, and 19,771 corresponding satellite images, from 10 mostly densely populated international cities. This translates to 5 panoramic images and one satellite image per graph node. Downloading instructions are below.
The map below shows the cities contained in SpaGBOL v1; breadth and density will be increased in subsequent versions.
### City Locations
### City Graph Representations
Here are a few of the city centre graph networks, where nodes represent road junctions, and edges represent the roads between junctions.
<p align="middle"> <table> <tr> <td> <img src="https://github.com/user-attachments/assets/864770d8-055e-410b-b034-448f2eb0e5d5" alt="London Graph"/> </td> <td> <img src="https://github.com/user-attachments/assets/2b6073f8-8fec-4fa9-993b-9cd5d5d3d218" alt="Manhattan Graph"/> </td> <td> <img src="https://github.com/user-attachments/assets/4c610cb6-1f8a-441a-adaa-b2147dd0bc9d" alt="Tokyo Graph"/> </td> </tr> <tr> <td style='text-align:center; vertical-align:middle'>City of London</td> <td style='text-align:center; vertical-align:middle'>Manhattan Centre</td> <td style='text-align:center; vertical-align:middle'>Tokyo Centre</td> </tr> </table> </p>

### Image Pair Examples
At each graph node, streetview and satellite images are collected at a ratio of 5:1 to improve training generalisation. Here are some examples from across the globe.
<p align="middle"> <img src="https://github.com/user-attachments/assets/0905b94a-cb41-464d-8002-64807b4b9b85" width="32%" /> <img src="https://github.com/user-attachments/assets/b5031d46-f89f-474a-ad9c-84781a86e407" width="32%" /> <img src="https://github.com/user-attachments/assets/a8378aa0-0ad1-481f-86e0-912ff8e9ac94" width="32%" /> </p>

### Exhaustive / Random Depth-First Walk Generation
<div> <img align="left" width="50%" src="https://github.com/user-attachments/assets/6e9aba0f-8b5b-4eff-923f-513d8df1e33e">

**Graph Walk**
Graph networks can be traversed using Breadth-First Search (BFS) or Depth-First Search (DFS). BFS explores level by level, visiting all neighbors of a node before moving deeper, using a queue. DFS dives into a branch fully before backtracking, often using a stack or recursion. BFS is ideal for shortest paths, while DFS suits tasks like cycle detection or exploring all paths.
**Vehicle Walk**
DFS relates to a vehicle's movement by mimicking how a vehicle explores routes sequentially. This approach is useful for navigating unmapped areas or exploring all possible routes systematically. Reference sets contain an exhaustive sampling of walks through each node; retrieving any one of these walks is deemed correct.
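As a rough sketch of exhaustive walk generation (toy graph; not the dataset tooling), a fixed-depth DFS enumerates every candidate walk that a reference set could contain:

```python
def dfs_walks(graph, node, depth, path=()):
    """Enumerate every simple walk of `depth` nodes starting at `node`,
    exploring each branch fully before backtracking (DFS)."""
    path = path + (node,)
    if len(path) == depth:
        return [path]
    walks = []
    for nxt in graph[node]:
        if nxt not in path:  # keep walks simple (no revisited junctions)
            walks.extend(dfs_walks(graph, nxt, depth, path))
    return walks

# Toy 4-junction road network (illustrative only).
crossroads = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(dfs_walks(crossroads, "A", depth=3))
# → [('A', 'B', 'D'), ('A', 'C', 'D')]
```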
</div>

## SpaGBOL: Benchmarking
### Environment Setup
`conda env create -f requirements.yaml && conda activate spagbol`
### Data Download Only
To download the SpaGBOL v1 dataset, set the desired configuration in src/utils/data.py and run the following:
`python src/utils/data.py`
This is a slow process and may take multiple days. Multi-threaded downloading greatly increases speed; however, if you encounter connection errors, set `multi_threaded` to False for a while.
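A hypothetical sketch of what the `multi_threaded` toggle trades off (function and variable names here are invented; only the flag itself comes from the repo): threads parallelise requests, while the sequential fallback is slower but gentler when connections drop.

```python
from concurrent.futures import ThreadPoolExecutor

def download_tile(tile_id):
    # Placeholder for the real streetview/satellite request.
    return f"tile_{tile_id}.jpg"

def download_all(tile_ids, multi_threaded=True, workers=8):
    """Download every tile, either in parallel or one at a time."""
    if multi_threaded:
        # Parallel requests: much faster, but more likely to hit
        # connection errors or rate limits.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(download_tile, tile_ids))
    # Sequential fallback: one request at a time.
    return [download_tile(t) for t in tile_ids]

print(download_all(range(3), multi_threaded=False))
```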
### SpaGBOL Training
To train, simply execute run.py; data will be downloaded if not present.
`python run.py --data 'datapath' --fov 360 --walk 4`
## SpaGBOL: Benchmark Results
<table class="tg">
<thead>
<tr> <th class="tg-dvpl">FOV</th> <th class="tg-c3ow" colspan="4">360°</th> <th class="tg-c3ow" colspan="4">180°</th> <th class="tg-c3ow" colspan="4">90°</th> </tr>
</thead>
<tbody>
<tr> <td class="tg-c3ow">Model</td> <td class="tg-c3ow">Top-1</td> <td class="tg-c3ow">Top-5</td> <td class="tg-c3ow">Top-10</td> <td class="tg-c3ow">Top-1%</td> <td class="tg-c3ow">Top-1</td> <td class="tg-c3ow">Top-5</td> <td class="tg-c3ow">Top-10</td> <td class="tg-c3ow">Top-1%</td> <td class="tg-c3ow">Top-1</td> <td class="tg-c3ow">Top-5</td> <td class="tg-c3ow">Top-10</td> <td class="tg-c3ow">Top-1%</td> </tr>
<tr> <td class="tg-c3ow">CVM</td> <td class="tg-c3ow">2.87</td> <td class="tg-c3ow">12.96</td> <td class="tg-c3ow">21.51</td> <td class="tg-c3ow">28.33</td> <td class="tg-c3ow">2.68</td> <td class="tg-c3ow">9.83</td> <td class="tg-c3ow">15.12</td> <td class="tg-c3ow">20.23</td> <td class="tg-c3ow">1.02</td> <td class="tg-c3ow">5.87</td> <td class="tg-c3ow">10.15</td> <td class="tg-c3ow">14.81</td> </tr>
<tr> <td class="tg-c3ow">CVFT</td> <td class="tg-c3ow">4.02</td> <td class="tg-c3ow">13.02</td> <td class="tg-c3ow">20.29</td> <td class="tg-c3ow">27.19</td> <td class="tg-c3ow">2.49</td> <td class="tg-c3ow">8.74</td> <td class="tg-c3ow">14.61</td> <td class="tg-c3ow">19.91</td> <td class="tg-c3ow">1.21</td> <td class="tg-c3ow">5.74</td> <td class="tg-c3ow">10.02</td> <td class="tg-c3ow">13.53</td> </tr>
<tr> <td class="tg-c3ow">DSM</td> <td class="tg-c3ow">5.82</td> <td class="tg-c3ow">10.21</td> <td class="tg-c3ow">14.13</td> <td class="tg-c3ow">18.62</td> <td class="tg-c3ow">3.33</td> <td class="tg-c3ow">9.74</td> <td class="tg-c3ow">14.66</td> <td class="tg-c3ow">21.48</td> <td class="tg-c3ow">1.59</td> <td class="tg-c3ow">5.87</td> <td class="tg-c3ow">10.11</td> <td class="tg-c3ow">16.24</td> </tr>
<tr> <td class="tg-c3ow">L2LTR</td> <td class="tg-c3ow">11.23</td> <td class="tg-c3ow">31.27</td> <td class="tg-c3ow">42.50</td> <td class="tg-c3ow">49.52</td> <td class="tg-c3ow">5.94</td> <td class="tg-c3ow">18.32</td> <td class="tg-c3ow">28.53</td> <td class="tg-c3ow">35.23</td> <td class="tg-c3ow">6.13</td> <td class="tg-c3ow">18.70</td> <td class="tg-c3ow">27.95</td> <td class="tg-c3ow">34.08</td> </tr>
<tr> <td class="tg-c3ow">GeoDTR+</td> <td class="tg-c3ow">17.49</td> <td class="tg-c3ow">40.27</td> <td class="tg-c3ow">52.01</td> <td class="tg-c3ow">59.41</td> <td class="tg-c3ow">9.06</td> <td class="tg-c3ow">25.46</td> <td class="tg-c3ow">35.67</td> <td class="tg-c3ow">43.33</td> <td class="tg-c3ow">5.55</td> <td class="tg-c3ow">17.04</td> <td class="tg-c3ow">24.31</td> <td class="tg-c3ow">31.78</td> </tr>
<tr> <td class="tg-c3ow">SAIG-D</td> <td class="tg-c3ow">25.65</td> <td class="tg-c3ow">51.44</td> <td class="tg-c3ow">62.29</td> <td class="tg-c3ow">68.22</td> <td class="tg-c3ow">15.12</td> <td class="tg-c3ow">35.55</td> <td class="tg-c3ow">45.63</td> <td class="tg-c3ow">53.10</td> </tr>
</tbody>
</table>
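The Top-k columns report retrieval recall: the fraction of queries whose matching reference embedding ranks among the k nearest by similarity. A minimal, self-contained sketch of the metric (toy embeddings and cosine similarity; not the benchmark code):

```python
import numpy as np

def top_k_recall(query, reference, k):
    """Fraction of queries whose true match (same row index in
    `reference`) appears among the k most similar references."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    r = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    sims = q @ r.T                              # pairwise cosine similarity
    ranked = np.argsort(-sims, axis=1)[:, :k]   # indices of k nearest refs
    hits = [i in ranked[i] for i in range(len(q))]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
refs = rng.normal(size=(100, 16))                     # toy satellite embeddings
queries = refs + 0.1 * rng.normal(size=refs.shape)    # noisy streetview copies
print(top_k_recall(queries, refs, k=1))
```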
