
Stac2cube

STAC catalogs to Analysis-Ready Data Cubes - Project by EORC @Uni_Würzburg


<img src="assets/stac2cube_logo.png" alt="stac2cube logo" width="300">

stac2cube <br> STACs to Analysis-Ready Data Cubes

Preprint DOI DOI License: Apache-2.0

  • If you use stac2cube in your research, you are kindly asked to cite it. Thank you! <br> See: Citation
  • Free software: Apache 2.0
  • This software is designed to run on any local machine as well as on HPC systems using SLURM jobs.


Feature Overview

stac2cube converts SpatioTemporal Asset Catalogs (STAC) into Analysis-Ready Data (ARD) cubes for efficient Earth Observation (EO) processing.

For Sentinel-2, the ARD cubes are built with three main components:

  • Cloud masking based on user-defined thresholds. This lets users control how strict cloud detection should be and export multiple cloud-masked cubes. Traditional options like filtering by max_cc (STAC metadata) and masking with the Scene Classification Layer (SCL) are also supported for faster processing.

  • Co-registration to reduce scene-to-scene X/Y misalignment (often around 1-2 pixels). Small sub-pixel shifts (below 10 m) can still remain.

  • Super-resolution of both the 10 m and 20 m bands to 2.5 m.

The result is a data cube that is cloud-masked with customizable thresholds, spatially aligned across time, and available at a higher spatial resolution. Details about the underlying algorithms and how to cite the third-party tools used can be found in the Examples section.
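The threshold-based cloud masking described above can be illustrated with a minimal sketch. It assumes a cloud-probability map with values between 0 and 1 and derives one binary mask per user-defined threshold. The function name `masks_from_probabilities` and the sample values are hypothetical; this is not the stac2cube API, only an illustration of how several thresholds yield several masks.

```python
from typing import Dict, List, Sequence

Grid = List[List[float]]


def masks_from_probabilities(
    cloud_prob: Grid, thresholds: Sequence[float]
) -> Dict[float, List[List[bool]]]:
    """Return one binary cloud mask per threshold (True = cloudy pixel)."""
    masks = {}
    for threshold in thresholds:
        if not 0.0 <= threshold <= 1.0:
            raise ValueError(f"threshold must be in [0, 1], got {threshold}")
        # A pixel counts as cloudy when its probability reaches the threshold.
        masks[threshold] = [[p >= threshold for p in row] for row in cloud_prob]
    return masks


# Hypothetical 2x3 cloud-probability map (illustrative values only).
cloud_prob = [
    [0.05, 0.35, 0.80],
    [0.10, 0.55, 0.95],
]

masks = masks_from_probabilities(cloud_prob, thresholds=[0.3, 0.6])
# Count cloudy pixels per threshold: a lower threshold masks more pixels.
cloudy_counts = {t: sum(map(sum, m)) for t, m in masks.items()}
print(cloudy_counts)
```

A lower threshold is stricter (more pixels are flagged as cloud), which is why exporting several masks from the same probability map can be useful.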

The two animations below show a data cube before and after ARD cube generation.

<div align="center"> <h2>Before (Initial Data Cube)</h2> <a href="https://github.com/user-attachments/assets/d6458ba1-6112-4127-899e-9fa06ce58772"> <img src="https://github.com/user-attachments/assets/d6458ba1-6112-4127-899e-9fa06ce58772" alt="Initial Data Cube"> </a> </div> <br> <div align="center"> <h2>After (Co-registered and Super Resolved Data Cube)</h2> <a href="https://github.com/user-attachments/assets/529402c7-4ecc-4344-b63b-409aee94e3c9"> <img src="https://github.com/user-attachments/assets/529402c7-4ecc-4344-b63b-409aee94e3c9" alt="Co-registered and Super-resolved Data Cube"> </a> </div>

<br><br>

Installation

Installation is possible with package managers such as Micromamba and Anaconda.<br>

The following steps show how to install stac2cube with Micromamba or Anaconda.<br><br>

Step 1: Clone the repository to your current working directory

$ git clone https://github.com/BaturalpArisoy/stac2cube.git

If git is not available, download and unzip the archive: https://github.com/BaturalpArisoy/stac2cube/archive/refs/heads/main.zip

Step 2: Change directory to the cloned stac2cube folder

$ cd "path/to/stac2cube/"

The environment.yml file should be present in this directory; please double-check.

Step 3: Install stac2cube via Micromamba or Anaconda Prompt (this might take a while!)

a) LINUX

$ micromamba env create -n stac2cube -f environment.yml

b1) WINDOWS Micromamba

$ micromamba env create -n stac2cube -f environment.yml; micromamba install -n stac2cube -c conda-forge vs2015_runtime

b2) WINDOWS Anaconda Prompt

$ conda env create -n stac2cube -f environment.yml && conda activate stac2cube && conda install -c conda-forge vs2015_runtime
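As a small convenience, the platform-specific commands above can be selected programmatically. The sketch below is an assumption-laden helper, not part of stac2cube: `install_commands` is a hypothetical name, and for Anaconda it uses `conda install -n stac2cube ...` as an equivalent to activating the environment first.

```python
import platform
from pathlib import Path
from typing import List


def install_commands(system: str, manager: str = "micromamba") -> List[str]:
    """Return install commands mirroring the steps above for an OS and manager."""
    if manager not in ("micromamba", "conda"):
        raise ValueError(f"unsupported manager: {manager}")
    commands = [f"{manager} env create -n stac2cube -f environment.yml"]
    if system == "Windows":
        # On Windows the steps above additionally install vs2015_runtime.
        commands.append(
            f"{manager} install -n stac2cube -c conda-forge vs2015_runtime"
        )
    return commands


# Warn early if the script is not run from the cloned repository folder.
if not Path("environment.yml").is_file():
    print("environment.yml not found; run this from the cloned stac2cube folder.")

for command in install_commands(platform.system()):
    print(command)
```

The helper only prints the commands; running them is left to the user so that nothing is installed unintentionally.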

How to run

Interactive User Interface on Jupyter Notebook:

For a quick and beginner-friendly workflow, use the 3 interactive GUI tools available in the User Interface Tools. <br>

  1. Data Cube Builder
  2. Data Cube Editor (see example below)
  3. Analysis Ready Data Cube Tools (Probabilistic Cloud Masking, Co-registration and Super-resolution)<br><br>
<img src="assets/data_cube_editor_GUI.png" alt="gui_editor">

Step-by-step Interactive Notebooks

For a more detailed walkthrough of stac2cube features, including background, processing steps, and storage, see the well-documented notebooks in the interactive folder.

The notebooks are numbered by step; a general explanation of each step is given below:

  1. Initial Data Cube
    • Collects images from STAC catalogs for the selected mission based on user-defined parameters.
    • Generates multi-dimensional data cubes suitable for time-series analysis.
    • The data cubes can be updated at any time without regenerating them from scratch.
    • Available missions: Sentinel-2 L2A, Sentinel-2 L1C, Sentinel-1 RTC, Landsat C2 L2, COP DEM Glo-30 (single time)
  2. Cloud Mask Data Cube
    • The result contains cloud probability maps and user-defined binary cloud mask layers for the time series.
    • When selected, clouds are automatically masked out of the initial data cube.
    • Can be updated at any time.
  3. Co-register Data Cube
    • Fixes the global X/Y shift between consecutive Sentinel-2 items.
    • IMPORTANT: Please read the notes in the notebook for better-quality results.
  4. Super-resolve Data Cube
    • Super-resolves both the 10 m and 20 m bands ["blue", "green", "red", "nir", "nir08", "rededge1", "rededge2", "rededge3", "swir16", "swir22"] to 2.5 m for the entire Sentinel-2 data cube time series.
  5. Batch Processing (under development!)
    • (when completed) Users who already know which parameters to use for each step above will be able to run them as a single batch process instead of running each step separately :)
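To make the super-resolution step more concrete, the following sketch works out the upscaling factors implied by going from the native Sentinel-2 resolutions to 2.5 m. The band-to-resolution mapping reflects the standard Sentinel-2 band layout; the helper names `upscale_factor` and `output_shape` are hypothetical and only illustrate the arithmetic, not how stac2cube performs super-resolution.

```python
from typing import Tuple

TARGET_RESOLUTION_M = 2.5

# Native resolutions (in metres) of the Sentinel-2 bands listed above.
NATIVE_RESOLUTION_M = {
    "blue": 10, "green": 10, "red": 10, "nir": 10,
    "nir08": 20, "rededge1": 20, "rededge2": 20, "rededge3": 20,
    "swir16": 20, "swir22": 20,
}


def upscale_factor(native_m: float, target_m: float = TARGET_RESOLUTION_M) -> float:
    """Linear upscaling factor per axis (e.g. 10 m -> 2.5 m is 4x)."""
    return native_m / target_m


def output_shape(
    height: int, width: int, native_m: float, target_m: float = TARGET_RESOLUTION_M
) -> Tuple[int, int]:
    """Pixel dimensions of a tile after resampling to the target resolution."""
    factor = upscale_factor(native_m, target_m)
    return round(height * factor), round(width * factor)


for band, native in NATIVE_RESOLUTION_M.items():
    print(f"{band}: {native} m -> {TARGET_RESOLUTION_M} m, x{upscale_factor(native):g} per axis")

# The same 1 km x 1 km area: 100x100 px at 10 m, 50x50 px at 20 m.
print(output_shape(100, 100, native_m=10))
print(output_shape(50, 50, native_m=20))
```

Because the factor applies per axis, the pixel count grows by 16x for 10 m bands and 64x for 20 m bands, which is worth keeping in mind when estimating storage for a super-resolved cube.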

How to run on HPC

Documentation on how to use stac2cube on terrabyte's HPC, for compute-intensive processes and faster processing, can be found in the slurm folder. Don't forget to look at how_to_use.txt.
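For readers unfamiliar with SLURM, the sketch below renders a generic batch script using standard `#SBATCH` directives. It is not taken from the slurm folder: the function name `render_sbatch_script`, the resource values, and the script name `my_processing_script.py` are all hypothetical, and cluster-specific options (such as partitions on terrabyte) are deliberately left out. Follow how_to_use.txt for the actual setup.

```python
def render_sbatch_script(
    job_name: str,
    command: str,
    time_limit: str = "01:00:00",
    cpus: int = 4,
    memory: str = "16G",
) -> str:
    """Render a minimal SLURM batch script as a string."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={memory}",
        # %j is replaced by SLURM with the job ID.
        f"#SBATCH --output={job_name}_%j.log",
        "",
        command,
        "",
    ]
    return "\n".join(lines)


script = render_sbatch_script(
    job_name="stac2cube_demo",
    # Hypothetical command; replace it with what how_to_use.txt prescribes.
    command="micromamba run -n stac2cube python my_processing_script.py",
)
print(script)
```

The rendered text would typically be saved to a file and submitted with `sbatch`.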

Access and Licensing Details for STAC Catalogs

Access to STAC Catalogs

  • Important: terrabyte STAC catalogs can only be used when working in a terrabyte environment.<br>
  • However, the stac2cube package is designed to work both on a local machine without a terrabyte connection and within the terrabyte HPC environment.<br>
  • Therefore, a silent parameter enables the terrabyte STAC catalogs when a SLURM job is active.<br>
  • The default setup (terrabyte disabled) uses STAC catalogs that provide "open-access data" (which is not the same as open-source).<br>
  • Note, therefore, that the stac2cube package cannot guarantee unlimited access to these open-access data catalogs in the future!
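One plausible way to implement such a switch is to check for the `SLURM_JOB_ID` environment variable, which SLURM sets inside running jobs. The sketch below is an assumption about how this could work, not the package's actual mechanism; `running_in_slurm_job` and `default_services` are hypothetical names.

```python
import os
from typing import List, Mapping, Optional


def running_in_slurm_job(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the process appears to run inside a SLURM job."""
    env = os.environ if environ is None else environ
    return bool(env.get("SLURM_JOB_ID"))


def default_services(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Open-access services by default; terrabyte first inside a SLURM job."""
    services = ["Earth Search", "Planetary Computer"]
    if running_in_slurm_job(environ):
        # Assumption: a SLURM job implies the terrabyte HPC environment.
        services.insert(0, "terrabyte")
    return services


print(default_services())
```

Passing the environment as an argument keeps the check easy to test without actually submitting a job.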

STAC Catalog Licenses

| Provider | Service | STAC API | License | Open-Access | Open-Source |
|------------|--------------------|-----------------------------------------------------|-------------------------------------------------------------------------------|-------------|-------------|
| DLR | terrabyte | https://stac.terrabyte.lrz.de/public/api/ | MIT License Copyright (c) 2024 Deutsches Zentrum für Luft- und Raumfahrt e.V. | No | No |
| Element 84 | Earth Search | https://earth-search.aws.element84.com/v1/ | Apache License 2.0 | Yes | Yes |
| Microsoft | Planetary Computer | https://planetarycomputer.microsoft.com/api/stac/v1 | MIT License Copyright (c) Microsoft Corporation. | Yes | No |
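The endpoints in the table follow the STAC API specification, so an item search is a POST to `<base>/search` with a JSON body. The sketch below only builds the URL and request body and does not send anything. The collection identifier `sentinel-2-l2a` and the polygon coordinates are assumptions for illustration (collection names can differ between services, and the coordinates are a placeholder near Würzburg, not the author's polygon); `search_url` and `search_body` are hypothetical helper names.

```python
import json
from typing import Any, Dict, List, Tuple

STAC_APIS = {
    "terrabyte": "https://stac.terrabyte.lrz.de/public/api/",
    "Earth Search": "https://earth-search.aws.element84.com/v1/",
    "Planetary Computer": "https://planetarycomputer.microsoft.com/api/stac/v1",
}


def search_url(service: str) -> str:
    """Item-search endpoint of a STAC API (base URL + /search)."""
    return STAC_APIS[service].rstrip("/") + "/search"


def search_body(
    collection: str,
    daterange: Tuple[str, str],
    ring: List[List[float]],
    limit: int = 100,
) -> Dict[str, Any]:
    """Build a STAC item-search body for a date range and a polygon."""
    start, end = daterange
    return {
        "collections": [collection],
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        "intersects": {"type": "Polygon", "coordinates": [ring]},
        "limit": limit,
    }


# Placeholder polygon (lon, lat); the ring must be closed (first == last point).
ring = [[9.96, 49.78], [9.98, 49.78], [9.98, 49.79], [9.96, 49.79], [9.96, 49.78]]
body = search_body("sentinel-2-l2a", ("2017-01-01", "2025-03-28"), ring)

print(search_url("Earth Search"))
print(json.dumps(body, indent=2))
```

Keeping the request body independent of the service makes it straightforward to run the same query against each catalog and compare the results.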

Why use terrabyte then?

Why do terrabyte users collect data from the terrabyte STAC catalog instead of the open-source Earth Search?

  • The data provided by Element 84 is stored on AWS S3.
  • The data provided by DLR is stored on the servers of the Leibniz Supercomputing Centre (LRZ) in Garching/Munich.
  • When working in a terrabyte environment, queries are answered by the same servers instead of connecting to AWS. <br><br>

Example: Query for Sentinel-2 L2A:

  • daterange: ["2017-01-01", "2025-03-28"]
  • polygon: Nord Hubland/Würzburg/Germany<br>

| Service | Returned Date | Processing Time (s) |
|--------------------|---------------|---------------------|
| terrabyte | 1134 | 24.0 |
| Earth Search | 1038 | 140.5 |
| Planetary Computer | 1133 | 12.2 |

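  • This indicates* that, when working in a terrabyte environment, querying the terrabyte catalog is much faster than querying Earth Search (24.0 s vs. 140.5 s); Planetary Computer was the fastest in this test (12.2 s).
  • Most importantly, this indicates that the Earth Search archive is missing some scenes (1038 returned vs. 1134 from terrabyte).
  • Also, the Earth Search STAC definitions are sometimes faulty (especially for Sentinel-2 L1C), so as the developer of this package, I prefer working with the terrabyte API.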

* Each query was run 10 times per service, and the average time per run was calculated (timeit module).
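The timing approach in the footnote can be sketched with Python's `timeit` module as follows. The function `fake_query` is a hypothetical stand-in for a real STAC query, so the printed number is not comparable to the table; only the averaging logic is illustrated.

```python
import timeit
from typing import Callable


def average_runtime(func: Callable[[], object], runs: int = 10) -> float:
    """Average wall-clock time per call over `runs` repetitions."""
    total = timeit.timeit(func, number=runs)
    return total / runs


def fake_query() -> int:
    # Stand-in workload; a real benchmark would call the STAC API here.
    return sum(range(10_000))


avg = average_runtime(fake_query, runs=10)
print(f"{avg:.6f} s per run")
```

For network queries, averaging over several runs smooths out fluctuations in latency, although caching on the server side can still favor later runs.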

Method References

  1. Cloud Mask Data Cube applies s2cloudless by Sentinel Hub - CC-BY-SA-4.0 license.

  2. Co-register Data Cube applies AROSICS by Daniel Scheffler - Apache-2.0 license.

    Daniel Scheffler. (2017, July 3). AROSICS: An Automated and Robust Open-Source Image Co-Registration Soft
