GeNSIT: Generating Neural Spatial Interaction Tables

WATCH OUR VIDEO EXPLAINER HERE

Quick Start: We recommend going through the Installation and Run sections if you wish to run GeNSIT using default settings.

Introduction

<img src="./gensit_motivation.jpg" alt="motivation" width="1000"/>

Tip: Watch our video explainer here!

Motivation

High-resolution complex simulators such as agent-based models (ABMs) are increasingly deployed to assist policymaking in transportation, social sciences, and epidemiology. They simulate individual agent interactions governed by stochastic dynamic systems, giving rise to a continuous emergent structure that is aggregate in a mean-field sense. This is achieved by computationally expensive forward simulations, which hinder ABM parameter calibration and large-scale testing of multiple policy scenarios. Considering ABMs for the COVID-19 pandemic as an example, the continuous mean-field process corresponds to the spatial intensity of the infections, which is noisily observed at some spatial aggregation level, while the individual and discrete human contact interactions that give rise to that intensity are at best partially observed or fully latent. In transportation and mobility, the running examples in this work, the continuous mean-field process corresponds to the spatial intensity of trips arising from unobserved individual agent trips between discrete sets of origin and destination locations.

The formal object of interest that describes the discrete count of these spatial interactions, e.g. agent trips between locations, is the origin-destination matrix (ODM). It is an $I\times J$ (two-way) contingency table $\mathbf{T}$ with elements $T_{i,j} \in \mathbb{N}$ counting the interactions of two spatial categorical variables $i,j \in \mathbb{N}_{>0}$, see Figure above. It is typically sparse due to infeasible links between origin and destination locations, and partially observed through summary statistics -- such as table row and/or column marginals -- due to privacy concerns, data availability, and data collection costs. Operating at the discrete ODM level and learning this latent contingency table from summary statistics is vital for handling high-resolution spatial constraints and partial observations such as the total number of agents interacting between a pair of locations. It is also necessary for population synthesis in ABMs, which is performed prior to simulation in order to reduce the size of the ABM's parameter space. Moreover, it avoids errors and biases due to ad-hoc discretisation required when working with continuous approximations of the underlying discrete ODM $\mathbf{T}^*$.
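
As a minimal illustration of the objects described above, the sketch below builds a small hypothetical $3\times 4$ table and computes its row marginals, column marginals, and grand total with NumPy. The table values and variable names are made up for illustration and are not part of GeNSIT.

```python
import numpy as np

# Hypothetical 3x4 origin-destination table T; zeros mark infeasible links.
T = np.array([
    [4, 0, 1, 2],
    [0, 3, 0, 5],
    [2, 2, 6, 0],
])

row_marginals = T.sum(axis=1)  # trips leaving each origin i
col_marginals = T.sum(axis=0)  # trips arriving at each destination j
total = int(T.sum())           # total number of trips

print(row_marginals)  # [ 7  8 10]
print(col_marginals)  # [6 5 7 7]
print(total)          # 25
```

Both sets of marginals sum to the same total, which is why observing only the marginals leaves many tables $\mathbf{T}$ consistent with the data.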

Contribution

This repository introduces a computational framework named GeNSIT for exploring the constrained discrete origin-destination matrices of agent trip location choices using closed-form or Gibbs Markov Basis sampling. The underlying continuous choice probability or intensity function (unnormalised probability function) is modelled by totally and singly constrained spatial interaction models (SIMs), also known as gravity models, embedded in the well-known Harris-Wilson stochastic differential equations (SDEs). We employ neural networks to calibrate the SIM parameters. We also include the Markov Chain Monte Carlo (MCMC) schemes used to learn the SIM parameters in previous works. For more details on the mathematical aspects of this repository, please see the Related publications section.
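
To make the intensity function concrete, the following is a minimal sketch of a singly (origin) constrained gravity model in its common textbook form, $\Lambda_{ij} = O_i \, z_j^{\alpha} e^{-\beta c_{ij}} / \sum_k z_k^{\alpha} e^{-\beta c_{ik}}$. The origin demand $O_i$, the parameters $\alpha$ and $\beta$, the function name, and all input values are assumptions made for illustration. This is not GeNSIT's API, and the exact parameterisation used in the repository may differ.

```python
import numpy as np

def singly_constrained_intensity(origin_demand, z, C, alpha, beta):
    """Expected flows of an origin-constrained gravity model (illustrative)."""
    # Unnormalised log-utility of destination j as seen from origin i.
    log_w = alpha * np.log(z)[None, :] - beta * C
    # Subtract the row-wise maximum for numerical stability before exponentiating.
    log_w = log_w - log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    # Normalise over destinations so each row is a choice probability vector.
    probs = w / w.sum(axis=1, keepdims=True)
    # Scale each row by its origin demand so row sums match the constraint.
    return origin_demand[:, None] * probs

# Hypothetical inputs: 2 origins, 3 destinations.
origin_demand = np.array([10.0, 20.0])
z = np.array([1.0, 2.0, 4.0])
C = np.array([[0.1, 0.5, 0.9],
              [0.7, 0.2, 0.4]])

intensity = singly_constrained_intensity(origin_demand, z, C, alpha=1.0, beta=2.0)
print(intensity.sum(axis=1))  # row sums recover the origin demand
```

The resulting intensity is continuous; the discrete table $\mathbf{T}$ is what the sampling schemes mentioned above operate on.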

Related publications

Zachos, Ioannis, Theodoros Damoulas, et al. ‘Table Inference for Combinatorial Origin-Destination Choices in Agent-Based Population Synthesis’. Stat, vol. 13, no. 1, 2024, p. e656, https://doi.org/10.1002/sta4.656. <a href="./zachos_stat.bib" style="text-decoration: none;" download="./zachos_stat.bib"> <img src="https://img.shields.io/badge/Export-BibTeX-orange" alt="stat"> </a>

Zachos, Ioannis, Mark Girolami, et al. ‘Generating Origin-Destination Matrices in Neural Spatial Interaction Models’. arXiv:2410.07352, arXiv, Oct. 2024, https://doi.org/10.48550/arXiv.2410.07352. <a href="./zachos_nips.bib" style="text-decoration: none;" download="./zachos_nips.bib"> <img src="https://img.shields.io/badge/Export-BibTeX-orange" alt="nips"> </a>

Installation

Assuming Python >=3.9.7 and git are installed, clone this repository by running

git clone git@github.com:[REPONAME]/GeNSIT.git

Once available locally, navigate to the main folder as follows:

cd GeNSIT

Tip: We recommend running GeNSIT in a Docker container if you do not plan to make any code changes.

Docker

This section assumes Docker has been installed on your machine. Please follow this guide if you wish to install Docker. Build the Docker image:

docker build -t "gensit" .

Once the image is built, make sure everything is working by running:

docker run gensit --help

OSX

This section assumes anaconda or miniconda has been installed on your machine. Please follow this or this guide if you wish to install either of them. Then, run:

conda create -y -n gensit python=3.9.7
conda activate gensit
conda install -y -c conda-forge --file requirements.txt
conda install -y conda-build
python3 setup.py develop

Otherwise, make sure you install the gensit command line tool and its dependencies by running

pip3 install -e .

Validate installation

You can ensure that the dependencies have been successfully installed by running:

gensit --help

You should see output like this:

Usage: gensit [OPTIONS] COMMAND [ARGS]...

  Command line tool for Generating Neural Spatial Interaction Tables (origin-
  destination matrices)

Options:
  --help  Show this message and exit.

Commands:
  create     Create synthetic data for spatial interaction table and...
  plot       Plot experimental outputs.
  reproduce  Reproduce figures in the paper.
  run        Sample discrete spatial interaction tables...
  summarise  Create tabular summary of metadata, metrics computed for...

Throughout the remainder of this README, we illustrate the capabilities of GeNSIT's command line tool, assuming that the Docker image has been built.

Inputs

Inputs to GeNSIT are data and configuration files.

Data

The minimum data requirements include:

  • A set of origin and destination locations between which agents travel.
  • A cost matrix $\mathbf{C}$ reflecting the inconvenience of travel from any origin to any destination. This can be distance and/or time dependent (e.g. Euclidean distance and/or travel times).
  • A measure of destination attractiveness $\mathbf{z}$. This depends on the types of trips agents make, e.g. for work trips this would be the number of jobs available at each destination.
  • The total number of agents/trips $M$. Each agent performs exactly one trip.
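
The shapes of these inputs have to agree with one another. A minimal sketch of such a consistency check is given below, assuming $I$ origins and $J$ destinations. The function `check_inputs` and its argument names are hypothetical and are not part of the gensit command line tool; the sign conditions on costs and attractiveness are likewise assumptions of this sketch.

```python
import numpy as np

def check_inputs(C, z, M):
    """Validate the minimum inputs listed above (illustrative helper)."""
    C = np.asarray(C, dtype=float)
    z = np.asarray(z, dtype=float)
    if C.ndim != 2:
        raise ValueError("cost matrix C must be two-dimensional (origins x destinations)")
    if z.shape != (C.shape[1],):
        raise ValueError("attractiveness z needs one entry per destination (column of C)")
    # Assumption of this sketch: costs are non-negative, attractiveness is positive.
    if np.any(C < 0) or np.any(z <= 0):
        raise ValueError("costs must be non-negative and attractiveness positive")
    if int(M) != M or M <= 0:
        raise ValueError("total number of trips M must be a positive integer")
    return C.shape  # (I, J)

# Hypothetical example with 2 origins and 3 destinations.
I, J = check_inputs(C=[[0.1, 0.5, 0.9], [0.7, 0.2, 0.4]], z=[1.0, 2.0, 4.0], M=30)
print(I, J)  # 2 3
```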

Optional datasets may be:

  • Origin and/or destination demand.
  • Partially observed trips between selected origin-destination pairs.
  • Total distance and/or time agents have travelled by origin and/or destination location.
  • A transportation network/graph.
  • A ground truth agent trip table to validate your model.

Real-world

We consider agent trips from residence to workplace locations in Cambridge, UK. We use the following datasets from the Census 2011 data provided by the Office for National Statistics:

  • <a href="../data/inputs/cambridge/lsoas_to_msoas.geojson" target="_blank">Lower super output areas (LSOAs), Middle super output areas (MSOAs)</a> as origin, destination locations, respectively.
  • <a href="../data/inputs/cambridge/cost_matrices/clustered_facilities_sample_20x20_20_01_2023_sample_20x20_clustered_facilities_ripleys_k_500_euclidean_points%_prob_origin_destination_adjusted_normalised_boundary_only_edge_corrected_cost_matrix_max_normalised.txt" target="_blank">Average shortest path in a transportation network</a> between a random sample of 20 residences inside each LSOA and 20 workplaces inside each MSOA as a cost matrix.
  • Number of jobs available at each MSOA as a destination attraction proxy used in the NN's loss function.
  • Total distance travelled to work from each LSOA as an input to the NN's loss function.
  • Ground truth agent trip table as a validation dataset. Parts of this table such as origin/destination de