Rsdmx
Tools for reading SDMX data and metadata in R
rsdmx <a href="https://github.com/eblondel/rsdmx"><img src='https://github.com/eblondel/rsdmx/blob/master/doc/rsdmx.png?raw=true' align="right" height="139" /></a>
**Tools for reading SDMX data and metadata documents in R**
Overview
rsdmx is a package to parse/read SDMX data and metadata in R. It provides:
- a set of classes and methods to read data and metadata documents exchanged through the Statistical Data and Metadata Exchange (SDMX) framework. The package currently focuses on the SDMX XML standard format (SDMX-ML).
- an interface to SDMX web-services for a list of well-known data providers, such as EUROSTAT, OECD, and others.
Citation
We thank in advance those who use rsdmx for citing it in their work or publications. For this, please use the citation provided at this link
Collating scattered SDMX data sources
Although some R packages built on rsdmx aim to provide a wrapper for a single data source (e.g. OECD, EUROSTAT), it is strongly recommended to rely directly on rsdmx. Indeed, one main objective of rsdmx is to promote and facilitate collating scattered data from a growing number of SDMX data providers, whatever the organization.
It is already possible to query well-known data sources using the embedded helpers. Pull requests are welcome to support additional data providers by default in rsdmx.
SDMX standards compliance
Status
Currently, the package can read:
- Datasets (GenericData, CompactData, StructureSpecificData, StructureSpecificTimeSeriesData, CrossSectionalData, UtilityData and MessageGroup SDMX-ML types)
- Concepts (Concept, ConceptScheme and Concepts SDMX-ML types)
- Codelists (Code, Codelist and Codelists SDMX-ML types)
- DataStructures / KeyFamilies - with all subtypes
- Data Structure Definitions (DSDs) - with all subtypes
Funding
rsdmx is looking for sponsors. Have you been using rsdmx and wish to support its development? Please help us make the package grow!
Author
Copyright (C) 2014 Emmanuel Blondel
Contributors
- Matthieu Stigler
- Eric Persson
Distribution
on CRAN
rsdmx is available on the Comprehensive R Archive Network (CRAN). See the R CRAN check results at: https://cran.r-project.org/web/checks/check_results_rsdmx.html
Please note that following a new submission to CRAN, or possibly a change in CRAN policies, the package might be temporarily archived and removed from CRAN. If you notice that the package is not back after some time, please contact me.
on R-Universe
rsdmx is available on the R-Universe public cloud server. The package version corresponds to the ongoing revision (master branch on GitHub). See https://eblondel.r-universe.dev/#package:rsdmx
Quickstart
rsdmx offers a low-level set of tools to read data and metadata in SDMX format. Its strategy is to keep things very easy for the user: a single function named readSDMX is used, whether the document holds data or metadata, and whether the data source is local or remote.
It is important to highlight that one of the major benefits of rsdmx is that it focuses first on the SDMX format specifications (acting as a format abstraction library). This allows rsdmx to read SDMX data from remote data sources or from local SDMX files. For remote data sources, it also means that rsdmx is not bound to SDMX service specifications and can read a wider range of data sources.
Install rsdmx
rsdmx can be installed from CRAN
install.packages("rsdmx")
or from its development repository hosted on GitHub (using the devtools package):
devtools::install_github("eblondel/rsdmx")
Load rsdmx
To load rsdmx in R, do the following:
library(rsdmx)
readSDMX & helper functions
readSDMX as low-level function
The readSDMX function is first designed as a low-level function that takes as parameter either a URL (isURL = TRUE by default) or a file. So wherever the SDMX document is located, readSDMX lets you read it, as follows:
#read a remote file
sdmx <- readSDMX(file = "someUrl")
#read a local file
sdmx <- readSDMX(file = "somelocalfile", isURL = FALSE)
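Once a document has been read, the returned object is typically converted to a data.frame for analysis. The following is a minimal sketch, not a definitive recipe: it assumes that a sample file named Example_Eurostat_2.0.xml ships in the extdata folder of the installed package (please verify this against your rsdmx version) and that as.data.frame is available for the resulting dataset object.

```r
library(rsdmx)

# Assumption: this sample file is bundled with rsdmx under extdata;
# system.file() returns "" if it is not found
path <- system.file("extdata", "Example_Eurostat_2.0.xml", package = "rsdmx")

if (nzchar(path)) {
  # parse the local SDMX-ML file
  sdmx <- readSDMX(file = path, isURL = FALSE)
  # flatten the observations into a data.frame
  stats <- as.data.frame(sdmx)
  print(head(stats))
}
```

The same two steps (read, then convert) apply to a remote document; only the file argument and isURL change.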
In addition, to facilitate querying data sources, readSDMX also provides helpers to query well-known remote data sources. This avoids specifying the entire URL: you specify a simple provider ID and the different parameters needed to build an SDMX query (e.g. for a dataset query: operation, key, filter, startPeriod and endPeriod).
This is possible because a list of SDMX service providers is embedded within rsdmx; this list provides all the information required for readSDMX to build the SDMX request (URL) before accessing the data source.
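A helper-based query might look like the sketch below. Several things here are assumptions rather than facts taken from this document: the argument names (providerId, resource, flowRef, key, start, end) reflect the readSDMX signature as found in recent rsdmx releases and should be checked with ?readSDMX, and the "OECD" provider, the "MIG" dataflow and the key values are purely illustrative and may no longer be served by the remote endpoint.

```r
library(rsdmx)

# Assumption: "OECD" is an embedded provider and "MIG" a valid dataflow there.
# The call needs network access, so failures are caught and turned into NULL.
sdmx <- tryCatch(
  readSDMX(
    providerId = "OECD",
    resource = "data",
    flowRef = "MIG",
    key = list("TOT", NULL, NULL),
    start = 2010,
    end = 2011
  ),
  error = function(e) NULL
)

# Convert to a data.frame only if the request succeeded
if (!is.null(sdmx)) {
  print(head(as.data.frame(sdmx)))
}
```

The point of the helper is that no URL appears in the call; readSDMX assembles it from the provider's request builder.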
get list of SDMX service providers
The list of known SDMX service providers can be queried as follows:
providers <- getSDMXServiceProviders()
as.data.frame(providers)
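To check whether a given provider is embedded, the resulting data.frame can be inspected directly. This is a small sketch; the column name agencyId is an assumption about the data.frame layout, so the code tests for it before using it.

```r
library(rsdmx)

providers_df <- as.data.frame(getSDMXServiceProviders())

# Inspect the available columns rather than assuming their names
str(providers_df)

# Assumption: the identifier column is named "agencyId"
if ("agencyId" %in% names(providers_df)) {
  print("OECD" %in% providers_df$agencyId)
}
```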
create/add a SDMX service provider
It is also possible to create a new SDMX service provider and add it to this list (so that readSDMX is aware of it). A provider can be created with SDMXServiceProvider, and is made of various parameters:
- agencyId (provider identifier)
- name
- scale (international or national)
- country ISO 3-alpha code (if national)
- builder
The request builder can be created with SDMXRequestBuilder which takes various arguments:
- regUrl: URL of the service registry endpoint
- repoUrl: URL of the service repository endpoint (note that we use 2 different arguments for registry and repository endpoints, since some providers use different URLs, but in most cases those are identical)
- formatter: list of functions to format the request params (one function per type of resource, e.g. "dataflow", "datastructure", "data")
- handler: list of functions used to build the web request
- compliant: logical parameter (whether the request builder is compliant with some web-service specifications)
rsdmx already provides common builders, which can be customized if needed by overriding either the formatter or the handler functions:
- SDMXREST20RequestBuilder: connector for SDMX REST 2.0 web-services
- SDMXREST21RequestBuilder: connector for SDMX REST 2.1 web-services
- SDMXDotStatRequestBuilder: connector for SDMX .Stat ("DotStat") web-services implementations
Let's see it with an example:
First create a request builder for our provider:
myBuilder <- SDMXRequestBuilder(
  regUrl = "http://www.myorg.org/sdmx/registry",
  repoUrl = "http://www.myorg.org/sdmx/repository",
  formatter = list(
    dataflow = function(obj){
      #format each dataflow id with some prefix
      obj@resourceId <- paste0("df_",obj@resourceId)
      return(obj)
    },
    datastructure = function(obj){
      #do nothing
      return(obj)
    },
    data = function(obj){
      #format each dataset id with some prefix
      obj@flowRef <- paste0("data_",obj@flowRef)
      return(obj)
    }
  ),
  handler = list(
    dataflow = function(obj){
      req <- sprintf("%s/dataflow",obj@regUrl)
      return(req)
    },
    datastructure = function(obj){
      req <- sprintf("%s/datastructure",obj@regUrl)
      return(req)
    },
    data = function(obj){
      req <- sprintf("%s/data",obj@regUrl)
      return(req)
    }
  ),
  compliant = FALSE
)
As you can see, we built a custom SDMXRequestBuilder that will be able to
create SDMX web-requests for the different resources of an SDMX web-service.
We can create a provider with the above request builder, and add it to the list of known SDMX service providers:
#create the provider
provider <- SDMXServiceProvider(
  agencyId = "MYORG",
  name = "My Organization",
  builder = myBuilder
)
#add it to the list
addSDMXServiceProvider(provider)
#check provider has been added
as.data.frame(getSDMXServiceProviders())
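When the web-service follows a standard specification, writing custom formatter and handler functions may be unnecessary, and one of the common builders listed earlier could be used instead. The sketch below assumes that SDMXREST21RequestBuilder accepts regUrl, repoUrl and compliant arguments (to be checked against the package documentation); the URLs and the "MYORG_REST" identifier are hypothetical.

```r
library(rsdmx)

# Hypothetical endpoints for an SDMX REST 2.1 service
restBuilder <- SDMXREST21RequestBuilder(
  regUrl = "http://www.myorg.org/sdmx/registry",
  repoUrl = "http://www.myorg.org/sdmx/repository",
  compliant = TRUE
)

# Hypothetical provider identifier and name
restProvider <- SDMXServiceProvider(
  agencyId = "MYORG_REST",
  name = "My Organization (REST 2.1)",
  builder = restBuilder
)

# Register the provider so that readSDMX can refer to it by its id
addSDMXServiceProvider(restProvider)
```

No request is sent at this stage; the builder is only used later, when readSDMX is called with this provider.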
find a SDMX service provider
Another helper allows you to check whether a specific provider is known to
rsdmx, given its id:
o
