Stac2cube
STAC catalogs to Analysis-Ready Data Cubes - Project by EORC @Uni_Würzburg
stac2cube <br> STACs to Analysis-Ready Data Cubes
- If you use stac2cube in your research, you are kindly asked to cite it. Thank you! <br> See: Citation
- Free software: Apache 2.0
- This software is designed to run on any local machine as well as on HPC systems using SLURM jobs.
Table of Contents
- Feature Overview
- Installation
- How to run
- How to run on HPC
- Access and Licensing Details for STAC Catalogs
- Method References
- Citation
Feature Overview
stac2cube converts SpatioTemporal Asset Catalogs (STAC) into Analysis-Ready Data (ARD) cubes for efficient Earth Observation (EO) processing.
For Sentinel-2, the ARD cubes are built with three main components:
- Cloud masking based on user-defined thresholds. This lets users control how strict cloud detection should be and export multiple cloud-masked cubes. Traditional options such as filtering by max_cc (STAC metadata) and masking with the Scene Classification Layer (SCL) are also supported for faster processing.
- Co-registration to reduce scene-to-scene X/Y misalignment (often around 1-2 pixels). Small sub-pixel shifts (below 10 m) can still remain.
- Super-resolution of both the 10 m and 20 m bands to 2.5 m.
The result is a data cube that is cloud-masked with customizable thresholds, spatially aligned across time, and available at a higher spatial resolution. Details about the underlying algorithms and how to cite the third-party tools used can be found in the Examples section.
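To illustrate how user-defined thresholds lead to several cloud-masked cubes, here is a minimal NumPy sketch. It is not stac2cube's implementation: the function names `binary_cloud_masks` and `apply_mask`, the toy arrays, and the threshold values are all assumptions made for this example.

```python
import numpy as np


def binary_cloud_masks(cloud_prob, thresholds):
    """Return one boolean mask per threshold (True = treated as cloud)."""
    return {t: cloud_prob >= t for t in thresholds}


def apply_mask(band, mask):
    """Replace cloudy pixels with NaN and keep clear pixels unchanged."""
    return np.where(mask, np.nan, band.astype("float32"))


# Hypothetical 2x3 cloud probability map (0 = clear, 1 = cloud)
cloud_prob = np.array([[0.05, 0.35, 0.80],
                       [0.10, 0.55, 0.95]])
# Hypothetical reflectance values for one band
band = np.array([[100, 200, 300],
                 [400, 500, 600]])

masks = binary_cloud_masks(cloud_prob, thresholds=(0.3, 0.6))
strict = apply_mask(band, masks[0.3])   # lower threshold masks more pixels
lenient = apply_mask(band, masks[0.6])  # higher threshold masks fewer pixels
```

A lower threshold removes more pixels, including thin or uncertain clouds, while a higher threshold keeps more data at the risk of residual cloud contamination.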
The two animations below show a data cube before and after ARD cube generation.
<div align="center"> <h2>Before (Initial Data Cube)</h2> <a href="https://github.com/user-attachments/assets/d6458ba1-6112-4127-899e-9fa06ce58772"> <img src="https://github.com/user-attachments/assets/d6458ba1-6112-4127-899e-9fa06ce58772" alt="Initial Data Cube"> </a> </div> <br> <div align="center"> <h2>After (Co-registered and Super Resolved Data Cube)</h2> <a href="https://github.com/user-attachments/assets/529402c7-4ecc-4344-b63b-409aee94e3c9"> <img src="https://github.com/user-attachments/assets/529402c7-4ecc-4344-b63b-409aee94e3c9" alt="Co-registered and Super-resolved Data Cube"> </a> </div><br><br>
Installation
stac2cube can be installed with package managers such as Micromamba and Anaconda.<br>
The following steps show how to install it with either of them.<br><br>
Step 1: Clone the repository to your current working directory
$ git clone https://github.com/BaturalpArisoy/stac2cube.git
If git is not available for you, download and unzip the file: https://github.com/BaturalpArisoy/stac2cube/archive/refs/heads/main.zip
Step 2: Change directory to cloned stac2cube folder
$ cd "path/to/stac2cube/"
The environment.yml file should be present in this folder; please double-check.
Step 3: Install stac2cube via Micromamba or Anaconda Prompt (this might take a while!)
a) LINUX
$ micromamba env create -n stac2cube -f environment.yml
b1) WINDOWS Micromamba
$ micromamba env create -n stac2cube -f environment.yml; micromamba install -n stac2cube -c conda-forge vs2015_runtime
b2) WINDOWS Anaconda Prompt
$ conda env create -n stac2cube -f environment.yml && conda activate stac2cube && conda install -c conda-forge vs2015_runtime
How to run
Interactive User Interface on Jupyter Notebook:
For a quick and beginner-friendly workflow, use the three interactive GUI tools available under User Interface Tools. <br>
- Data Cube Builder
- Data Cube Editor (see example below)
- Analysis Ready Data Cube Tools (Probabilistic Cloud Masking, Co-registration and Super-resolution)<br><br>
Step-by-step Interactive Notebooks
For a more detailed walkthrough of stac2cube features, including background, processing steps, and storage, see the well-documented notebooks in the interactive folder.
The notebooks are numbered by step; a general explanation of each step is given below:
- Initial Data Cube
  - Collects images from STAC catalogs for the selected mission based on the user's parameters.
  - Generates multi-dimensional data cubes suitable for time-series analysis.
  - The data cubes can be updated at any time without regenerating them from scratch.
  - Available missions: Sentinel-2 L2A, Sentinel-2 L1C, Sentinel-1 RTC, Landsat C2 L2, COP DEM Glo-30 (single time)
- Cloud Mask Data Cube
  - The result contains cloud probability maps and user-defined binary cloud mask layers for the time series.
  - When selected, clouds are automatically masked out of the initial data cube.
  - Can be updated at any time.
- Co-register Data Cube
  - Fixes the global X/Y shift between consecutive Sentinel-2 items.
  - IMPORTANT: Please read the notes in the notebook for better-quality results.
- Super-resolve Data Cube
  - Super-resolves both the 10 m and 20 m bands (["blue", "green", "red", "nir", "nir08", "rededge1", "rededge2", "rededge3", "swir16", "swir22"]) to 2.5 m for the entire Sentinel-2 data cube time series.
- Batch Processing (under development!)
  - (when completed) Users who already know which parameters to use for each function above can set up batch processing instead of running each step separately :)
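To give an intuition for what "global X/Y shift" means in the co-registration step, the sketch below estimates an integer pixel shift between two images with plain phase correlation in NumPy. This is a simplified illustration of the general idea, not the AROSICS workflow that stac2cube applies; the function name `estimate_global_shift` and the synthetic test image are assumptions made for this example.

```python
import numpy as np


def estimate_global_shift(reference, target):
    """Estimate the integer (dy, dx) shift that moves `reference` onto `target`."""
    f_ref = np.fft.fft2(reference)
    f_tgt = np.fft.fft2(target)
    # Normalised cross-power spectrum: keeps only the phase difference
    cross_power = f_tgt * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12
    correlation = np.fft.ifft2(cross_power).real
    # The correlation peak sits at the (circular) shift between the images
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond half the image size correspond to negative shifts
    shifts = [p - s if p > s // 2 else p for p, s in zip(peak, correlation.shape)]
    return int(shifts[0]), int(shifts[1])


# Synthetic example: shift a random image by 2 rows down and 1 column left
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
target = np.roll(reference, shift=(2, -1), axis=(0, 1))

dy, dx = estimate_global_shift(reference, target)
# Undo the estimated shift to align the target with the reference
aligned = np.roll(target, shift=(-dy, -dx), axis=(0, 1))
```

Real scenes differ in content between dates (clouds, vegetation, illumination), which is why dedicated tools add robustness checks and sub-pixel refinement on top of this basic idea.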
How to run on HPC
Documentation on how to use stac2cube on terrabyte's HPC, for compute-intensive processes and faster processing, can be found in the slurm folder. Don't forget to read how_to_use.txt.
Access and Licensing Details for STAC Catalogs
Access to STAC Catalogs
- Important: terrabyte STAC catalogs can only be used when working in a terrabyte environment.<br>
- However, the stac2cube package is designed to work both on a local machine without a terrabyte connection and within the terrabyte HPC environment.<br>
- Therefore, a silent parameter enables the terrabyte STAC catalogs when a SLURM job is active.<br>
- The default set-up (terrabyte disabled) features STAC catalogs that provide "open-access data" (not open-source).<br>
- Note that the stac2cube package therefore cannot guarantee unlimited access to these open-access data catalogs in the future!
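The behaviour described above could be sketched as follows. How stac2cube actually implements its silent parameter is not documented here, so this example rests on assumptions: it treats the presence of the `SLURM_JOB_ID` environment variable (which SLURM sets inside a job) as the signal, and the function names and catalog keys are hypothetical. The URLs are the ones listed in the license table of this README.

```python
import os

TERRABYTE_STAC = "https://stac.terrabyte.lrz.de/public/api/"
EARTH_SEARCH_STAC = "https://earth-search.aws.element84.com/v1/"
PLANETARY_COMPUTER_STAC = "https://planetarycomputer.microsoft.com/api/stac/v1"


def running_in_slurm_job(environ=None):
    """Return True if the process appears to run inside a SLURM job."""
    environ = os.environ if environ is None else environ
    return "SLURM_JOB_ID" in environ


def available_catalogs(environ=None):
    """Open-access catalogs by default; add terrabyte only inside a SLURM job."""
    catalogs = {
        "earth-search": EARTH_SEARCH_STAC,
        "planetary-computer": PLANETARY_COMPUTER_STAC,
    }
    # Assumption for this sketch: a SLURM job implies a terrabyte environment
    if running_in_slurm_job(environ):
        catalogs["terrabyte"] = TERRABYTE_STAC
    return catalogs


# Passing an explicit mapping makes the behaviour easy to check locally
local_catalogs = available_catalogs({})
hpc_catalogs = available_catalogs({"SLURM_JOB_ID": "123456"})
```

Note that a SLURM job on a different cluster would also set `SLURM_JOB_ID`, so a real check would need an additional terrabyte-specific condition.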
STAC Catalog Licenses
| Provider | Service | STAC API | License | Open-Access | Open-Source |
|------------|--------------------|-----------------------------------------------------|-------------------------------------------------------------------------------|-------------|-------------|
| DLR | terrabyte | https://stac.terrabyte.lrz.de/public/api/ | MIT License Copyright (c) 2024 Deutsches Zentrum für Luft- und Raumfahrt e.V. | No | No |
| Element 84 | Earth Search | https://earth-search.aws.element84.com/v1/ | Apache License 2.0 | Yes | Yes |
| Microsoft | Planetary Computer | https://planetarycomputer.microsoft.com/api/stac/v1 | MIT License Copyright (c) Microsoft Corporation. | Yes | No |
Why use terrabyte then?
Why do terrabyte users collect data from the terrabyte STAC catalog instead of the open-source Earth Search?
- The data provided by Element 84 is stored in AWS S3.
- The data provided by DLR is stored on the servers of the Leibniz Supercomputing Centre (LRZ) in Garching/Munich.
- When working in a terrabyte environment, queries are answered from the same server instead of connecting to AWS. <br><br>
Example: Query for Sentinel-2 L2A:
- daterange: ["2017-01-01", "2025-03-28"]
- polygon: Nord Hubland/Würzburg/Germany<br>
| Service | Returned Items | Processing Time (s) |
|--------------------|----------------|---------------------|
| terrabyte | 1134 | 24.0 |
| Earth Search | 1038 | 140.5 |
| Planetary Computer | 1133 | 12.2 |
- This indicates* that, when working in a terrabyte environment, queries to the terrabyte catalog are much faster than queries to Earth Search.
- Most importantly, it indicates that the Earth Search archive is missing some scenes.
- In addition, the Earth Search STAC definitions are sometimes faulty (especially for Sentinel-2 L1C), so as the developer of this package I prefer working with the terrabyte API.
* Each query was run 10 times per service, and the average time per run was calculated (timeit module).
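A comparison of this kind could be set up roughly as in the sketch below, which uses only the Python standard library. It is not the benchmark script behind the table: the helper names, the rectangle near Würzburg (a rough stand-in, not the polygon used for the benchmark), and the use of the Earth Search collection ID `sentinel-2-l2a` are assumptions made for this example.

```python
import json
import timeit
import urllib.request

EARTH_SEARCH_SEARCH_URL = "https://earth-search.aws.element84.com/v1/search"

# Hypothetical rectangle near Würzburg in lon/lat (GeoJSON order)
AREA = {
    "type": "Polygon",
    "coordinates": [[[9.96, 49.78], [9.99, 49.78], [9.99, 49.80],
                     [9.96, 49.80], [9.96, 49.78]]],
}


def build_search_body(collection, geometry, start, end, limit=100):
    """Build the JSON body of a STAC API item search (POST /search)."""
    return {
        "collections": [collection],
        "intersects": geometry,
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        "limit": limit,
    }


def post_search(url, body, timeout=60):
    """Send one search request and return the first page of results."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def average_runtime(func, runs=10):
    """Call `func` `runs` times and return the mean time per call in seconds."""
    return timeit.timeit(func, number=runs) / runs


body = build_search_body("sentinel-2-l2a", AREA, "2017-01-01", "2025-03-28")

# Requires network access, so it is left commented out here:
# seconds = average_runtime(lambda: post_search(EARTH_SEARCH_SEARCH_URL, body))
```

A single request returns only one page of items, so counting all matching scenes as in the table would additionally require following the pagination links of each service.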
Method References
- Cloud Mask Data Cube applies s2cloudless by Sentinel Hub - CC-BY-SA-4.0 license.
- Co-register Data Cube applies AROSICS by Daniel Scheffler - Apache-2.0 license.
Daniel Scheffler. (2017, July 3). AROSICS: An Automated and Robust Open-Source Image Co-Registration Soft
