Ecmwfr
Interface to the public ECMWF API Web Services
ecmwfr
Programmatic interface to the two European Centre for Medium-Range Weather Forecasts API services. The package provides easy access to all available Data Stores from within R, matching and expanding upon the ECMWF Python tools. Support is provided for the Climate Data Store, the Atmosphere Data Store and the Early Warning Data Store (from the Copernicus Emergency Management Services).
How to cite this package
You can cite this package like this: "we obtained data from the European Centre for Medium-Range Weather Forecasts API using the ecmwfr R package (Hufkens, Stauffer, and Campitelli 2019)". Here is the full bibliographic reference to include in your reference list (don't forget to update the 'last accessed' date):
Hufkens, K., R. Stauffer, & E. Campitelli. (2019). ecmwfr: Programmatic interface to the two European Centre for Medium-Range Weather Forecasts API services. Zenodo. https://doi.org/10.5281/zenodo.2647531.
Installation
stable release
To install the current stable release use a CRAN repository:
install.packages("ecmwfr")
library("ecmwfr")
development release
To install the development releases of the package run the following commands:
if(!require(remotes)){install.packages("remotes")}
remotes::install_github("bluegreen-labs/ecmwfr")
library("ecmwfr")
Vignettes are not rendered by default. If you want to include this additional documentation, use:
if(!require(remotes)){install.packages("remotes")}
remotes::install_github("bluegreen-labs/ecmwfr", build_vignettes = TRUE)
library("ecmwfr")
breaking changes (>= 2.0.0)
querying data
With version 2.0.0 and the migration to the new API, some breaking changes
were introduced in the package. In particular, the wf_request() function(s)
now use a default ecmwfr user field. This follows from the consolidation of
the API, with a single sign-on across all services and the use of a Personal
Access Token (PAT) rather than user and password credentials.
In order to migrate to version >=2.0.0 you will have to provide a new PAT using wf_set_key() and remove the user argument from any wf_request() call, i.e.:
# The original v1.x.x call
wf_request(
  request,
  user = "your_id"
)
# The new v2.x.x call
wf_request(
  request
)
The requests themselves should translate mostly without intervention and remain nested lists of parameters.
netCDF data format
In comparison with the original services, the new API regresses in terms of netCDF support. Those relying on common netCDF support, such as ecosystem modellers, will find this troubling. Both CDS and ADS have different policies and use different methods. This regression in usability is not caused by this package, so please forward any issues you have with the formatting of the data to the ECMWF using the public forum. We suggest falling back to GRIB files and converting them locally if netCDF driver files are needed. Sadly, it seems that consistency is not guaranteed for now, and will not be.
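As a minimal sketch of the GRIB fallback suggested above, a downloaded GRIB file could be converted locally with the terra package. This assumes terra (with a GDAL build that reads GRIB) and ncdf4 are installed. The helper name grib_to_netcdf and the file names are hypothetical and not part of ecmwfr.

```r
# Hypothetical helper (not part of ecmwfr): convert a GRIB file to netCDF.
# Assumes the 'terra' package is installed, that its GDAL build can read
# GRIB, and that 'ncdf4' is available for writing netCDF.
grib_to_netcdf <- function(grib_file, nc_file, overwrite = FALSE) {
  if (!requireNamespace("terra", quietly = TRUE)) {
    stop("The 'terra' package is required for this conversion.")
  }
  if (!file.exists(grib_file)) {
    stop("GRIB file not found: ", grib_file)
  }
  # Read all layers (variables, levels, time steps) from the GRIB file
  r <- terra::rast(grib_file)
  # Write the layers to a netCDF file
  terra::writeCDF(r, filename = nc_file, overwrite = overwrite)
  invisible(nc_file)
}

# Usage, assuming "era5-demo.grib" was requested with data_format = "grib":
# grib_to_netcdf("era5-demo.grib", "era5-demo.nc")
```

Variable names and metadata in the converted file depend on how terra interprets the GRIB messages, so it is worth inspecting the result before using it as model input.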
Use:
Create an ECMWF account by self-registering. Once your user account has been verified you can get your API token (or key in ecmwfr) by visiting one of the Data Store user profiles, for example the CDS user profile.
The API Token is a UUID and should look something like:
API: abcd1234-foo-bar-98765431-XXXXXXXXXX
This API Token gives you access to all Data Store services, including the climate, atmosphere and emergency management services. It is required to retrieve data via the ecmwfr package. Use the ecmwfr wf_set_key function to store your login information in the system keyring (see below).
In order to download the data, you will also need to accept the licence agreement at the bottom of the user profile page.
Setup
Before using the package in R to download data you have to save your login
information. The package does not allow you to use your key inline in scripts,
to limit security issues when sharing scripts on GitHub or otherwise.
The following lines should NEVER be included in any script. Run them only once at setup.
# set a key to the keychain
wf_set_key(key = "abcd1234-foo-bar-98765431-XXXXXXXXXX")
# you can retrieve the key using
wf_get_key()
# the output should be the key you provided
# "abcd1234-foo-bar-98765431-XXXXXXXXXX"
# Alternatively you can input your login info with an interactive request
# if you do not put in the key directly
wf_set_key()
# you will get a command line request to provide the required details
Before downloading and processing data, please make sure you accept the terms and conditions in the profile pages of your Data Store of choice.
Data Requests
To download data use the wf_request function together with a formatted
request. The simplest way to get a request is to go to the Data Store website,
which offers an interactive interface to create these requests, e.g., for the
CDS ERA-5 reanalysis data.
After formatting the request online, copy the API request Python code to your script. The request should include the dataset, request and target field (if available).
Instead of the JSON formatting shown in the online form, the ecmwfr package
uses R lists for all arguments. This makes changing variables less prone to
error, although overall we suggest not creating requests manually and instead
using the RStudio Addin to translate the Python JSON request to R.
Just select the whole query, including the dataset and target fields, and click on Addins > ECMWF Python to list. The original Python query is listed below so you can try this routine yourself.
# The full Python query, which you can translate to an R
# list using the Addin
dataset = "reanalysis-era5-pressure-levels"
request = {
    'product_type': ['reanalysis'],
    'variable': ['temperature'],
    'year': ['2000'],
    'month': ['04'],
    'day': ['04'],
    'time': ['00:00'],
    'pressure_level': ['850'],
    'data_format': 'netcdf',
    'download_format': 'unarchived',
    'area': [70, -20, 60, 30]
}
This will give you the request as an annotated list. If no target file is
specified in the original request, a target field will be added to the list
with the default name TMPFILE. Replace this filename with something that
matches your preference and the specified data format. In this case the
default name was changed to era5-demo.nc, a netCDF file. This formatted
request can now be used by the wf_request function to query and download the
data. By default the process is verbose, and will give you plenty of feedback
on progress.
# This is an example of a request as converted from
# the Python query above
request <- list(
  dataset_short_name = "reanalysis-era5-pressure-levels",
  product_type = "reanalysis",
  variable = "temperature",
  year = "2000",
  month = "04",
  day = "04",
  time = "00:00",
  pressure_level = "850",
  data_format = "netcdf",
  download_format = "unarchived",
  area = c(70, -20, 60, 30),
  target = "era5-demo.nc"
)
# If you have stored your user login information
# in the keyring by calling wf_set_key you can
# call:
file <- wf_request(
  request = request,  # the request
  transfer = TRUE,    # download the file
  path = "."          # store data in current working directory
)
The Data Store services are quite fast. However, if you request a lot of
variables, multiple levels, and data over several years, these requests
might take quite a while! You can check the scope of your query, and whether
it is out of bounds, in the right-hand Request Validation panel when
formatting your original data request in the web interface.
Note: If you need to download larger amounts of data, we suggest splitting the downloads into chunks (e.g., month-by-month or year-by-year). A progress indicator will keep you informed on the status of your request. Keep in mind that all data downloaded will be buffered in memory limiting the download
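
The month-by-month splitting mentioned in the note could be sketched in base R as follows. This is an illustration only: the helper split_by_month and the target file naming are hypothetical and not part of ecmwfr. The sketch builds the request lists without contacting any service.

```r
# Base request, matching the ERA5 example (no ecmwfr calls are made here)
base_request <- list(
  dataset_short_name = "reanalysis-era5-pressure-levels",
  product_type = "reanalysis",
  variable = "temperature",
  year = "2000",
  month = "04",
  day = "04",
  time = "00:00",
  pressure_level = "850",
  data_format = "netcdf",
  download_format = "unarchived",
  area = c(70, -20, 60, 30),
  target = "era5-demo.nc"
)

# Hypothetical helper: create one request per month, each with its own
# target file so that downloads do not overwrite each other
split_by_month <- function(request, months) {
  lapply(months, function(m) {
    utils::modifyList(
      request,
      list(month = m, target = sprintf("era5-%s-%s.nc", request$year, m))
    )
  })
}

months <- sprintf("%02d", 1:12)  # "01", "02", ..., "12"
monthly_requests <- split_by_month(base_request, months)

# Each element could then be passed to wf_request() in turn, e.g.:
# files <- lapply(monthly_requests, function(r) {
#   wf_request(request = r, transfer = TRUE, path = ".")
# })
```

Giving every chunk its own target file keeps the pieces separate on disk, so a failed chunk can be requested again without repeating the others.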
