RoboSat
Semantic segmentation on aerial and satellite imagery. Extracts features such as buildings, parking lots, roads, water, and clouds.
Note: RoboSat is neither maintained nor actively developed any longer by Mapbox. See this issue.
The main developers (@daniel-j-h, @bkowshik) are no longer with Mapbox.
Overview
RoboSat is an end-to-end pipeline written in Python 3 for feature extraction from aerial and satellite imagery. Features can be anything visually distinguishable in the imagery, for example buildings, parking lots, roads, or cars.
Have a look at
- this OpenStreetMap diary post where we first introduced RoboSat and show some results
- this OpenStreetMap diary post where we extract building footprints based on drone imagery in Tanzania
- this OpenStreetMap diary post where we summarize the v1.1 release
- this OpenStreetMap diary post where we summarize the v1.2 release
- this OpenStreetMap diary post where we run robosat v1.2 on aerial imagery for Bavaria, Germany
- this OpenStreetMap diary post where Maning runs robosat on imagery from the Philippines
The tools RoboSat comes with can be categorized as follows:
- data preparation: creating a dataset for training feature extraction models
- training and modeling: segmentation models for feature extraction in images
- post-processing: turning segmentation results into cleaned and simple geometries
Tools work with the Slippy Map tile format to abstract away geo-referenced imagery behind tiles of the same size.
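For illustration, the standard Web Mercator math behind Slippy Map tile addressing can be sketched as follows (a sketch of the well-known formula, not RoboSat's own code):

```python
import math

def deg2tile(lon, lat, zoom):
    """Convert a WGS84 lon/lat to the Slippy Map tile (x, y) containing it
    at the given zoom level, using the standard Web Mercator projection."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y
```

At zoom z the world is divided into 2^z by 2^z tiles; every tool in the pipeline addresses imagery and masks by these (x, y, z) triples.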

The data preparation tools help you with getting started creating a dataset for training feature extraction models. Such a dataset consists of aerial or satellite imagery and corresponding masks for the features you want to extract. We provide convenient tools to automatically create these datasets downloading aerial imagery from the Mapbox Maps API and generating masks from OpenStreetMap geometries but we are not bound to these sources.

The modelling tools help you with training fully convolutional neural nets for segmentation. We recommend using (potentially multiple) GPUs for these tools: we are running RoboSat on AWS p2/p3 instances and GTX 1080 TI GPUs. After you have trained a model you can save its checkpoint and run prediction on either GPUs or CPUs.

The post-processing tools help you with cleaning up the segmentation model's results. They are responsible for denoising, simplifying geometries, transforming from pixels in Slippy Map tiles to world coordinates (GeoJSON features), and properly handling tile boundaries.
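The pixel-to-world transformation this step performs can be sketched as the inverse of the tile math (illustrative only, not RoboSat's implementation):

```python
import math

def tile_pixel_to_lonlat(x, y, zoom, px, py, size=256):
    """Convert a pixel (px, py) inside Slippy Map tile (x, y) at the given
    zoom into WGS84 lon/lat, assuming size x size pixel tiles."""
    n = 2 ** zoom
    fx = (x + px / size) / n  # fraction of the world width
    fy = (y + py / size) / n  # fraction of the world height
    lon = fx * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * fy))))
    return lon, lat
```

Applying this per segmented pixel (or per polygon vertex) is what turns tile-local masks into geo-referenced GeoJSON features.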
If this sounds almost like what you need, see the extending section for more details about extending RoboSat. If you want to contribute, see the contributing section for more details about getting involved with RoboSat.
Installation
We provide pre-built Docker images for both CPU and GPU environments on Docker Hub in mapbox/robosat.
Using a CPU container to show all available sub-commands
docker run -it --rm -v $PWD:/data --ipc=host --network=host mapbox/robosat:latest-cpu --help
Using a GPU container (requires nvidia-docker on the host) to train a model
docker run --runtime=nvidia -it --rm -v $PWD:/data --ipc=host mapbox/robosat:latest-gpu train --model /data/model.toml --dataset /data/dataset.toml --workers 4
Arguments
- --runtime=nvidia enables the nvidia-docker runtime for access to host GPUs
- --ipc=host is required for shared memory communication between workers
- --network=host is required for network communication in the download tool
- -v $PWD:/data makes the current directory on the host accessible at /data in the container
For installation from source (requires installing dependencies) see the Dockerfiles in the docker/ directory.
Usage
The following describes the tools making up the RoboSat pipeline. All tools can be invoked via
./rs <tool> <args>
Also see the sub-command help available via
./rs --help
./rs <tool> --help
Most tools take a dataset or model configuration file. See examples in the configs directory.
You will need to adapt these configuration files to your own dataset, for example setting your tile resolution (e.g. 256x256 pixels).
You will also need to adapt these configuration files to your specific deployment setup, for example using CUDA and setting batch sizes.
rs extract
Extracts GeoJSON features from OpenStreetMap to build a training set from.
The result of rs extract is a GeoJSON file with the extracted feature geometries.
The rs extract tool walks OpenStreetMap .osm.pbf base map files (e.g. from Geofabrik) and gathers feature geometries.
These features are for example polygons for parking lots, buildings, or roads.
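A minimal example of the kind of GeoJSON rs extract produces, a FeatureCollection of polygon geometries (the coordinates here are made up for illustration):

```python
import json

# A FeatureCollection holding one polygon feature, e.g. a building outline.
# Coordinates are (lon, lat) pairs; the ring is closed by repeating the
# first vertex at the end, as GeoJSON requires.
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [13.000, 52.000], [13.001, 52.000],
                    [13.001, 52.001], [13.000, 52.001],
                    [13.000, 52.000],
                ]],
            },
        }
    ],
}

geojson_text = json.dumps(collection)
```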
rs cover
Generates a list of tiles covering GeoJSON features to build a training set from.
The result of rs cover is a file with tiles in (x, y, z) Slippy Map tile format covering GeoJSON features.
The rs cover tool reads in the GeoJSON features generated by rs extract and generates a list of tiles covering the feature geometries.
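Conceptually, covering a geometry with tiles at a fixed zoom reduces to the tile math above. A simplified sketch that covers a bounding box rather than the actual feature geometries rs cover works from:

```python
import math

def tiles_covering(bbox, zoom):
    """Enumerate Slippy Map tiles (x, y, zoom) covering a
    (west, south, east, north) bounding box in WGS84 degrees."""
    west, south, east, north = bbox
    n = 2 ** zoom

    def to_tile(lon, lat):
        x = int((lon + 180.0) / 360.0 * n)
        y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
        return x, y

    x0, y0 = to_tile(west, north)  # top-left tile
    x1, y1 = to_tile(east, south)  # bottom-right tile
    return [(x, y, zoom) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
```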
rs download
Downloads aerial or satellite imagery from a Slippy Map endpoint (e.g. the Mapbox Maps API) based on a list of tiles.
The result of rs download is a Slippy Map directory with aerial or satellite images - the training set's images you will need for the model to learn on.
The rs download tool downloads images for a list of tiles in (x, y, z) Slippy Map tile format generated by rs cover.
The rs download tool expects a Slippy Map endpoint where placeholders for {x}, {y}, and {z} are formatted with each tile's ids.
For example, for the Mapbox Maps API: https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.webp?access_token=TOKEN.
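Filling the {x}, {y}, and {z} placeholders for each tile is plain string formatting; a sketch, where the tile ids and the TOKEN value are placeholders:

```python
# Slippy Map endpoint template with {x}, {y}, {z} placeholders; replace
# TOKEN with a real access token.
template = ("https://api.mapbox.com/v4/mapbox.satellite/"
            "{z}/{x}/{y}@2x.webp?access_token=TOKEN")

# Hypothetical (x, y, z) tiles as produced by rs cover.
tiles = [(140000, 86000, 18), (140001, 86000, 18)]

urls = [template.format(x=x, y=y, z=z) for x, y, z in tiles]
```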
rs rasterize
Rasterizes GeoJSON features into mask images based on a list of tiles.
The result of rs rasterize is a Slippy Map directory with masks - the training set's masks you will need for the model to learn on.
The rs rasterize tool reads in GeoJSON features and rasterizes them into single-channel masks with a color palette attached for quick visual inspection.
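rs rasterize relies on real raster libraries; as a toy illustration of the underlying idea, a pure-Python point-in-polygon rasterizer for a single tile could look like this:

```python
def rasterize(polygon, size):
    """Rasterize a polygon (list of (x, y) vertices in pixel coordinates)
    into a size x size single-channel mask: 1 where the pixel center falls
    inside the polygon, 0 elsewhere. Uses simple ray casting."""
    def inside(px, py):
        hit = False
        n = len(polygon)
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            # Does the edge cross the horizontal ray through (px, py)?
            if (y1 > py) != (y2 > py):
                if px < x1 + (py - y1) * (x2 - x1) / (y2 - y1):
                    hit = not hit
        return hit

    return [[1 if inside(c + 0.5, r + 0.5) else 0 for c in range(size)]
            for r in range(size)]
```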
rs train
Trains a model on a training set made up of (image, mask) pairs.
The result of rs train is a checkpoint containing weights for the trained model.
The rs train tool trains a fully convolutional neural net for semantic segmentation on a dataset with (image, mask) pairs generated by rs download and rs rasterize.
We recommend using a GPU for training: we are working with the AWS p2 instances and GTX 1080 TI GPUs.
Before you can start training you need the following:

- A dataset split into three parts: training and validation sets for rs train to train on and to calculate validation metrics on, and a hold-out dataset for final model evaluation. The dataset's directory needs to look like the following:

dataset
├── training
│   ├── images
│   └── labels
└── validation
    ├── images
    └── labels

- Label class weights calculated with rs weights on the training set's labels
- The path to the dataset's directory and the calculated class weights and statistics added to the dataset config
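As an illustration of inverse-frequency class weighting (the exact scheme rs weights implements may differ):

```python
from collections import Counter

def class_weights(label_pixels):
    """Compute per-class weights from label pixel values, inversely
    proportional to each class's frequency, so rare classes (e.g. building
    pixels) are not drowned out by the dominant background class."""
    counts = Counter(label_pixels)
    total = sum(counts.values())
    return {cls: total / (len(counts) * n) for cls, n in sorted(counts.items())}
```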
rs export
Exports a trained model in ONNX format for prediction across different backends (like Caffe2, TensorFlow).
The result of rs export is an ONNX GraphProto .pb file which can be used with the ONNX ecosystem.
Note: the rs predict tool works with .pth checkpoints. In contrast to these .pth checkpoints, the ONNX models depend on neither PyTorch nor the Python code for the model class, and can be used e.g. in resource-constrained environments like AWS Lambda.
rs predict
Predicts class probabilities for each image tile in a Slippy Map directory.