Gdalcubes
Creating and analyzing Earth observation data cubes in R
gdalcubes <img src="man/figures/logo.svg" align="right" alt="" width="120" />
The R package gdalcubes aims at making analyses of large satellite
image collections easier, faster, more intuitive, and more interactive.
The package represents the data as regular raster data cubes with
dimensions bands, time, y, and x and hides complexities in the
data due to different spatial resolutions, map projections, data formats,
and irregular temporal sampling.
Features
- Read and process multitemporal, multispectral Earth observation image collections as regular raster data cubes by applying on-the-fly reprojection, rescaling, cropping, and resampling.
- Work with existing Earth observation imagery on local disks or cloud storage without the need to maintain a second copy of the data.
- Apply user-defined R functions on data cubes.
- Execute data cube operation chains using parallel processing and lazy evaluation.
Among others, the package has been successfully used to process data from the Sentinel-2, Sentinel-5P, Landsat, PlanetScope, MODIS, and Global Precipitation Measurement Earth observation satellites / missions.
Installation
Install from CRAN with:
install.packages("gdalcubes")
From sources
Installation from sources is easiest with
remotes::install_git("https://github.com/appelmar/gdalcubes")
Please make sure that the git command line client is available on your system. Otherwise, the above command might not clone the gdalcubes C++ library as a submodule under src/gdalcubes.
The package builds on the external libraries GDAL, NetCDF, SQLite, and curl.
Windows
On Windows, you will need Rtools to build the package from sources.
Linux
Please install the system libraries, e.g., with the package manager of your Linux distribution. Also make sure that you are using a recent version of GDAL (> 2.3.0). On Ubuntu, the following commands will install all necessary libraries:
sudo add-apt-repository ppa:ubuntugis/ppa && sudo apt-get update
sudo apt-get install libgdal-dev libnetcdf-dev libcurl4-openssl-dev libsqlite3-dev libudunits2-dev
macOS
Using Homebrew, required system libraries can be installed with
brew install pkg-config
brew install gdal
brew install netcdf
brew install libgit2
brew install udunits
brew install curl
brew install sqlite
brew install libtiff
brew install hdf5
brew install protobuf
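Once the package is installed on any of the platforms above, a quick check such as the following sketch can confirm that it loads and show which GDAL version it is linked against. It assumes that `gdalcubes_gdalversion()` is available in the installed version of the package; the guard only keeps the snippet from failing when the package is missing.

```r
# Check whether gdalcubes can be loaded; TRUE if the package is installed
has_gdalcubes = requireNamespace("gdalcubes", quietly = TRUE)

if (has_gdalcubes) {
  # Installed package version
  print(packageVersion("gdalcubes"))
  # GDAL version the package is linked against (should be newer than 2.3.0)
  print(gdalcubes::gdalcubes_gdalversion())
} else {
  message("gdalcubes is not installed; see the installation steps above.")
}
```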
Getting started
Download example data
if (!dir.exists("L8_Amazon")) {
  download.file("https://hs-bochum.sciebo.de/s/8XcKAmPfPGp2CYh/download",
                destfile = "L8_Amazon.zip", mode = "wb")
  unzip("L8_Amazon.zip", exdir = "L8_Amazon")
}
Creating an image collection
First, we must scan all available images once and extract some
metadata such as their spatial extent and acquisition time. The
resulting image collection is stored on disk and typically consumes a
few kilobytes per image. Due to the diverse structure of satellite image
products, the rules for deriving the required metadata are formalized
as collection formats. The package comes with predefined formats for
some Sentinel, Landsat, and MODIS products (call collection_formats()
to print a list of available formats).
library(gdalcubes)
gdalcubes_options(parallel=8)
files = list.files("L8_Amazon", recursive = TRUE,
                   full.names = TRUE, pattern = ".tif")
length(files)
## [1] 1805
sum(file.size(files)) / 1024^2 # MiB
## [1] 1919.12
L8.col = create_image_collection(files, format = "L8_SR", out_file = "L8.db")
L8.col
## Image collection object, referencing 180 images with 10 bands
## Images:
##                                       name      left       top    bottom
## 1 LC08_L1TP_226063_20140719_20170421_01_T1 -54.15776 -3.289862 -5.392073
## 2 LC08_L1TP_226063_20140820_20170420_01_T1 -54.16858 -3.289828 -5.392054
## 3 LC08_L1GT_226063_20160114_20170405_01_T2 -54.16317 -3.289845 -5.392064
## 4 LC08_L1TP_226063_20160724_20170322_01_T1 -54.16317 -3.289845 -5.392064
## 5 LC08_L1TP_226063_20170609_20170616_01_T1 -54.17399 -3.289810 -5.392044
## 6 LC08_L1TP_226063_20170711_20170726_01_T1 -54.15506 -3.289870 -5.392083
##       right            datetime        srs
## 1 -52.10338 2014-07-19T00:00:00 EPSG:32622
## 2 -52.11418 2014-08-20T00:00:00 EPSG:32622
## 3 -52.10878 2016-01-14T00:00:00 EPSG:32622
## 4 -52.10878 2016-07-24T00:00:00 EPSG:32622
## 5 -52.11958 2017-06-09T00:00:00 EPSG:32622
## 6 -52.09798 2017-07-11T00:00:00 EPSG:32622
## [ omitted 174 images ]
##
## Bands:
##         name offset scale unit       nodata image_count
## 1    AEROSOL      0     1                           180
## 2        B01      0     1      -9999.000000         180
## 3        B02      0     1      -9999.000000         180
## 4        B03      0     1      -9999.000000         180
## 5        B04      0     1      -9999.000000         180
## 6        B05      0     1      -9999.000000         180
## 7        B06      0     1      -9999.000000         180
## 8        B07      0     1      -9999.000000         180
## 9   PIXEL_QA      0     1                           180
## 10 RADSAT_QA      0     1                           180
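Because the image collection is stored on disk, the scan does not have to be repeated in later R sessions. The following is a minimal sketch of reopening the file written above, assuming that "L8.db" is still in the working directory and that `image_collection()` and `extent()` behave as documented in the package reference.

```r
# Path of the collection file written by create_image_collection() above
col_path = "L8.db"

if (requireNamespace("gdalcubes", quietly = TRUE) && file.exists(col_path)) {
  # Reopen the stored collection without rescanning the image files
  L8.col = gdalcubes::image_collection(col_path)
  # Spatiotemporal extent of all images, here in longitude / latitude
  print(gdalcubes::extent(L8.col, srs = "EPSG:4326"))
} else {
  message("gdalcubes or the collection file is missing; create the collection first.")
}
```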
Creating data cubes
To create a regular raster data cube from the image collection, we
define the geometry of our target cube as a data cube view, using the
cube_view() function. We define a simple overview, covering the full
spatiotemporal extent of the imagery at 1 km x 1 km pixel size, where one
data cube cell represents a duration of one year. The provided
resampling and aggregation methods are used to spatially reproject,
crop, and rescale individual images and to combine pixel values from many
images within one year, respectively. The raster_cube() function
returns a proxy object, i.e., it returns immediately without doing any
expensive computations.
v.overview = cube_view(extent=L8.col, dt="P1Y", dx=1000, dy=1000, srs="EPSG:3857",
                       aggregation = "median", resampling = "bilinear")
raster_cube(L8.col, v.overview)
## A data cube proxy object
##
## Dimensions:
##                 low              high count pixel_size chunk_size
## t        2013-01-01        2019-12-31     7        P1Y          1
## y -764014.387686915 -205014.387686915   559       1000        192
## x -6582280.06164712 -5799280.06164712   783       1000        192
##
## Bands:
##         name offset scale nodata unit
## 1    AEROSOL      0     1    NaN
## 2        B01      0     1    NaN
## 3        B02      0     1    NaN
## 4        B03      0     1    NaN
## 5        B04      0     1    NaN
## 6        B05      0     1    NaN
## 7        B06      0     1    NaN
## 8        B07      0     1    NaN
## 9   PIXEL_QA      0     1    NaN
## 10 RADSAT_QA      0     1    NaN
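The cell counts in the printed proxy object follow directly from the view: the extent along each dimension divided by the cell size. The base R sketch below recomputes them from the values shown above. The chunk count at the end is an illustration that assumes chunks tile the cube regularly with the printed chunk sizes.

```r
# Extent and cell sizes copied from the printed data cube proxy object
x_low = -6582280.06164712; x_high = -5799280.06164712
y_low = -764014.387686915; y_high = -205014.387686915
dx = 1000; dy = 1000

# Number of cells = extent / cell size
nx = round((x_high - x_low) / dx)
ny = round((y_high - y_low) / dy)
# One time slice per year (dt = "P1Y") from 2013 through 2019
nt = length(seq(as.Date("2013-01-01"), as.Date("2019-12-31"), by = "year"))
cat("nx =", nx, " ny =", ny, " nt =", nt, "\n")

# Chunks per dimension for chunk sizes t = 1, y = 192, x = 192
n_chunks = ceiling(nt / 1) * ceiling(ny / 192) * ceiling(nx / 192)
cat("chunks =", n_chunks, "\n")
```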
Processing data cubes
We can apply (and chain) operations on data cubes:
x = raster_cube(L8.col, v.overview) |>
  select_bands(c("B02","B03","B04")) |>
  reduce_time(c("median(B02)","median(B03)","median(B04)"))
x
## A data cube proxy object
##
## Dimensions:
##                 low              high count pixel_size chunk_size
## t        2013-01-01        2019-12-31     1        P7Y          1
## y -764014.387686915 -205014.387686915   559       1000        192
## x -6582280.06164712 -5799280.06164712   783       1000        192
##
## Bands:
##         name offset scale nodata unit
## 1 B02_median      0     1    NaN
## 2 B03_median      0     1    NaN
## 3 B04_median      0     1    NaN
plot(x, rgb=3:1, zlim=c(0,1200))
library(RColorBrewer)
raster_cube(L8.col, v.overview) |>
  select_bands(c("B04","B05")) |>
  apply_pixel(c("(B05-B04)/(B05+B04)"), names="NDVI") |>
  plot(zlim=c(0,1), nbreaks=10, col=brewer.pal(9, "YlGn"), key.pos=1)
Calling data cube operations always returns proxy objects;
computations are started lazily when users call, e.g., plot().
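For intuition about the per-pixel expression passed to apply_pixel() above, the same NDVI arithmetic can be written in plain R. This is only an analog with made-up values for B04 (red) and B05 (near infrared); it does not show how the package evaluates the expression string internally.

```r
# NDVI from red (B04) and near-infrared (B05) values
ndvi = function(b04, b05) (b05 - b04) / (b05 + b04)

# Hypothetical band values for three pixels
b04 = c(400, 1000, 500)
b05 = c(3600, 1000, 1500)

ndvi_values = ndvi(b04, b05)
print(ndvi_values)
```

Values close to 1 indicate much stronger near-infrared than red reflectance, which is why the plot above restricts zlim to the range from 0 to 1.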
Animations
Multitemporal data cubes can be animated (thanks to the gifski package):
v.subarea.yearly = cube_view(extent=list(left=-6180000, right=-6080000, bottom=-550000, top=-450000,
                             t0="2014-01-01", t1="2018-12-31"), dt="P1Y", dx=50, dy=50,
                             srs="EPSG:3857", aggregation = "median", resampling = "bilinear")
raster_cube(L8.col, v.subarea.yearly) |>
  select_bands(c("B02","B03","B04")) |>
  animate(rgb=3:1, fps = 2, zlim=c(100,1000), width = 400,
          height = 400, save_as = "man/figures/animation.gif")

Data cube export
Data cubes can be exported as single netCDF files with write_ncdf(),
or as a collection of (possibly cloud-optimized) GeoTIFF files with
write_tif(), where each time slice of the cube yields one GeoTIFF
file. Data cubes can also be converted to terra or stars objects:
raster_cube(L8.col, v.overview) |>
