BikeDNA: Bicycle Infrastructure Data & Network Assessment

This is the repository of BikeDNA, a tool for assessing the quality of OpenStreetMap (OSM) and other bicycle infrastructure data sets in a reproducible way. It provides planners, researchers, data maintainers, cycling advocates, and others who work with bicycle networks a detailed, informed overview of data quality in a given area.

Paper: https://journals.sagepub.com/doi/10.1177/23998083231184471

Running BikeDNA on large data sets? Consider using BikeDNA BIG.

<details><summary>Background</summary>

A fair number of research projects have already examined OpenStreetMap and other forms of volunteered geographic information (VGI), but few focus explicitly on bicycle infrastructure data. This focus is important because paths and tracks for cyclists and pedestrians are often mapped last and are more likely to contain errors (Barron et al., 2014; Neis et al., 2012). Moreover, the spatial distribution of dips in data quality is often not random in crowdsourced data but correlates with population density and other characteristics of the mapped area (Forghani and Delavar, 2014). This necessitates a critical stance towards the data we use for our research and planning, despite the overall high quality of OSM.

Data quality covers a wide range of aspects. The conceptualization of data quality used here refers to fitness-for-purpose (Barron et al., 2014): data quality is interpreted as whether or not the data fulfils the user's needs, rather than according to any universal definition of quality. BikeDNA has been developed particularly to support network-based research and planning, and therefore provides insights into the topological structure of the bicycle network in addition to data coverage, while positional accuracy is not directly evaluated.

The purpose is not to give any final assessment of the data quality, but to highlight aspects that might be relevant for deciding whether the data for a given area is fit for use. If reference data is available, BikeDNA can compare it with OSM data, but the tool makes no assumption about which, if any, data set represents the true conditions. OSM data on bicycle infrastructure is often of comparable or higher quality than governmental data sets, but the interpretation of differences between the two requires adequate knowledge of the local conditions.

</details>

Workflow

BikeDNA consists of Jupyter notebooks that analyze bicycle infrastructure data sets. It therefore requires an installation of Python, including the tools for running Jupyter notebooks.

The I. Installation, II. Setup, III. Analysis, and IV. Create reports steps are illustrated in the figure and described in detail below. Dotted parts are optional.

The analysis is divided into three parts: OSM, which analyzes OSM bicycle network data intrinsically; REFERENCE, which analyzes non-OSM reference bicycle network data intrinsically; and COMPARE, which compares OSM and reference data extrinsically.

I. Installation

First clone this repository (recommended) to your local machine or download it.

To avoid cloning the history and larger branches with example data and plots, use:

git clone -b main --single-branch https://github.com/anerv/BikeDNA --depth 1

Create Python conda environment

To ensure that all packages needed for the analysis are installed, it is recommended to create and activate a new conda environment using the environment.yml:

conda env create --file=environment.yml
conda activate bikedna

If this fails, the environment can be created by running:

conda config --prepend channels conda-forge
conda create -n bikedna --strict-channel-priority osmnx geopandas pandas networkx folium pyyaml matplotlib contextily jupyterlab haversine momepy nbconvert ipykernel
conda activate bikedna

This method does not control the library versions and should be used as a last resort.
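
After creating the environment by either route, it can be useful to confirm that the listed libraries are importable. The following is a minimal sketch and not part of BikeDNA: the helper `missing_packages` is a hypothetical name, and the import names are assumed to match the conda package names above, except for pyyaml, which is imported as `yaml`.

```python
from importlib.util import find_spec

# Import names for the packages listed in the conda command above.
# Assumption: each conda package is imported under the same name,
# except pyyaml, which is imported as "yaml".
REQUIRED = [
    "osmnx", "geopandas", "pandas", "networkx", "folium", "yaml",
    "matplotlib", "contextily", "jupyterlab", "haversine", "momepy",
    "nbconvert", "ipykernel",
]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the bikedna environment activated. The check only confirms that the modules can be found; it says nothing about their versions.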

The code for BikeDNA has been developed and tested using macOS 13.2.1.

Install package

The repository has been set up using the structure described in the Good Research Developer. Once the repository has been downloaded, navigate to the main folder in a terminal window and run the command:

pip install -e .

Lastly, add the environment kernel to Jupyter via:

python -m ipykernel install --user --name=bikedna

Run Jupyter Lab or Notebook with kernel bikedna (Kernel > Change Kernel > bikedna).

Demo

After the installation steps:

For an example of results that BikeDNA can produce, see a demo PDF output here: report.pdf
For an example of how BikeDNA can be used, run the notebooks on the branch GeoDanmark without changing the default parameters. This will analyze an area around Copenhagen, Denmark using a local reference data set.

II. Setup

Fill out the configuration file

To run the code, the configuration file config.yml must be filled out; see the branch 'GeoDanmark' for an example. The configuration file contains a range of settings needed for adapting the analysis to different areas and types of reference data. The study area name provided in the configuration file will be used by BikeDNA for folder structure setup, plot naming, and result labelling.

Plot settings can be changed in scripts/settings/plotting.py.

Set up the folder structure

Next, to create the required folder structure, navigate to the main folder in a terminal window and run the Python file setup_folders.py:

python setup_folders.py

This should return:

Successfully created folder data/osm/'my_study_area'/
Successfully created folder data/reference/'my_study_area'/
Successfully created folder data/compare/'my_study_area'/
...
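
The resulting layout can be pictured with a short sketch. This is not the repository's setup_folders.py: the function name `create_study_area_folders` and the `base_dir` parameter are hypothetical, and only the three folders shown in the output above are created (the elided ones are left out).

```python
from pathlib import Path


def create_study_area_folders(study_area, base_dir="."):
    """Create data/<part>/<study_area>/ for the three parts shown above."""
    created = []
    for part in ("osm", "reference", "compare"):
        folder = Path(base_dir) / "data" / part / study_area
        # exist_ok=True makes the call safe to repeat
        folder.mkdir(parents=True, exist_ok=True)
        print(f"Successfully created folder {folder.as_posix()}/")
        created.append(folder)
    return created
```

Calling `create_study_area_folders("my_study_area")` from the main folder would create one subfolder per analysis part, named after the study area.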

Provide/Prepare data sets

Once the folders have been created, provide:

a polygon defining the study area
for the extrinsic analysis (optional): a reference data set

For requirement details see: Data set requirements for BikeDNA

For an example of how to prepare data sets, see the notebooks in the scripts/examples folder.

III. Analysis

Notebooks

All analysis notebooks are in the scripts folder.

Warning: The two intrinsic analyses, OSM and REFERENCE, can be run independently, but both must be run before the extrinsic COMPARE analysis.

OSM

1a_initialize_osm: This notebook downloads data from OSM for the user-defined study area and processes it to the format needed in the analysis.
1b_intrinsic_analysis_osm: The intrinsic analysis evaluates the quality of the OSM data in the study area from the perspective of bicycle planning and research. This evaluation includes, for example, missing tags, disconnected components, and network gaps. Intrinsic means that the data set is analyzed for itself, without being compared to other data.
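
To illustrate what a check for disconnected components involves, here is a minimal sketch in plain Python. It is not BikeDNA's implementation, which works on real network data with the libraries installed above; the function `connected_components` and the toy edge list are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group the nodes of an undirected edge list into connected components."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


# Toy network: two path segments that do not touch each other
toy_edges = [("a", "b"), ("b", "c"), ("x", "y")]
components = connected_components(toy_edges)
print(len(components))  # 2 components: {a, b, c} and {x, y}
```

In a bicycle network, more than one component can reflect either real gaps in the infrastructure or missing data; telling the two apart requires knowledge of the local conditions.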

REFERENCE

2a_initialize_reference: This notebook processes the reference data provided by the user to the format needed in the analysis.
2b_intrinsic_analysis_reference: The intrinsic analysis evaluates the quality of the reference data set in the study area from the perspective of bicycle planning and research. This evaluation includes, for example, disconnected components and network gaps. Intrinsic means that the data set is analyzed for itself, without being compared to other data.

COMPARE

3a_extrinsic_analysis_metrics: The extrinsic analysis compares the results computed in the intrinsic analysis of the OSM and reference data. The analysis considers, for example, differences in network density and structure, and differing connectivity across the study area.
3b_extrinsic_analysis_feature_matching: This notebook contains functionality for matching corresponding features in the reference and OSM data. This step is computationally expensive, but provides a detailed overview of differing geometries and of errors of missing or excess data.
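
The ordering constraint stated in the warning above can be sketched as a small dependency graph. This is an illustration only: the mapping reflects one reading of this README (each b notebook follows its a notebook, and both COMPARE notebooks follow both intrinsic analyses). Whether 3b also requires 3a is not stated here and is not modeled.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each notebook maps to the notebooks that must be run before it.
# Assumption: dependencies are inferred from the README text, not from the code.
intrinsic = {"1b_intrinsic_analysis_osm", "2b_intrinsic_analysis_reference"}
dependencies = {
    "1a_initialize_osm": set(),
    "1b_intrinsic_analysis_osm": {"1a_initialize_osm"},
    "2a_initialize_reference": set(),
    "2b_intrinsic_analysis_reference": {"2a_initialize_reference"},
    "3a_extrinsic_analysis_metrics": set(intrinsic),
    "3b_extrinsic_analysis_feature_matching": set(intrinsic),
}

# static_order() yields every notebook after all of its prerequisites
run_order = list(TopologicalSorter(dependencies).static_order())
for name in run_order:
    print(name)
```

Any order that respects these dependencies is valid; the OSM and REFERENCE branches can be interleaved or run one after the other.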

Run analysis

After completing all installation and setup steps, the analysis notebooks can be run. The notebooks for intrinsic analysis of OSM and reference data are independent from each other and can be run separately.

For intrinsic analysis of OSM data: run 1a, then 1b from the scripts/OSM folder
For intrinsic analysis of reference data: run 2a, then 2b from the scripts/REFERENCE folder
For an extrin
